jellyfin: restartTriggers on webhook plugin so install runs at activation

The jellyfin-webhook-install oneshot has 'wantedBy = jellyfin.service', so it only runs when jellyfin (re)starts. On first rollout to a host where jellyfin is already running, the unit gets added but never fires, leaving the Webhook plugin files absent. jellyfin-webhook-configure then gets a 404 from /Plugins/$GUID/Configuration and deploy-rs rolls back. Pinning jellyfin.restartTriggers to the plugin package and the install script forces a restart whenever either derivation changes, which pulls the install unit in via the existing before/wantedBy chain.
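
A minimal sketch of the wiring described above, not the actual module from this commit. `webhookPlugin` and `installScript` are hypothetical stand-ins for the real plugin package and install script, and the oneshot is reduced to the options that matter for ordering.

```nix
{ pkgs, ... }:
let
  # Hypothetical placeholders: the real plugin derivation and install
  # script live in the jellyfin service module and are not shown here.
  webhookPlugin = pkgs.emptyDirectory;
  installScript = pkgs.writeShellScript "jellyfin-webhook-install" ''
    echo "would copy ${webhookPlugin} into the jellyfin plugin directory"
  '';
in
{
  # Changing either store path changes the generated unit's restart
  # triggers, so switch-to-configuration restarts jellyfin.service on
  # activation.
  systemd.services.jellyfin.restartTriggers = [
    webhookPlugin
    installScript
  ];

  # The oneshot is only started as a dependency of jellyfin.service
  # (wantedBy) and is ordered ahead of it (before), so the restart above
  # is what makes it run on an already-running host.
  systemd.services.jellyfin-webhook-install = {
    before = [ "jellyfin.service" ];
    wantedBy = [ "jellyfin.service" ];
    serviceConfig = {
      Type = "oneshot";
      ExecStart = "${installScript}";
    };
  };
}
```

Without the `restartTriggers` entry, adding the oneshot to a host where jellyfin is already active changes nothing about jellyfin.service itself, so nothing would start the new unit until the next manual restart or reboot.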
jellyfin-qbittorrent-monitor: add webhook receiver for instant throttling
2026-04-17 22:08:29 -04:00 · 2026-04-17 19:47:29 -04:00 · 2026-04-17 19:47:26 -04:00 · 2026-04-17 19:47:23 -04:00 · 2026-04-17 19:47:20 -04:00 · 2026-04-17 00:53:24 -04:00
99 changed files with 4859 additions and 509 deletions
--- a/.gitea/workflows/deploy.yml
+++ b/.gitea/workflows/deploy.yml
@@ -0,0 +1,60 @@
+name: Build and Deploy
+on:
+  push:
+    branches: [main]
+
+jobs:
+  deploy:
+    runs-on: nix
+    env:
+      GIT_SSH_COMMAND: "ssh -i /run/agenix/ci-deploy-key -o StrictHostKeyChecking=yes -o UserKnownHostsFile=/etc/ci-known-hosts"
+    steps:
+      - uses: https://github.com/actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Unlock git-crypt
+        run: |
+          git-crypt unlock /run/agenix/git-crypt-key-server-config
+
+      - name: Build NixOS configuration
+        run: |
+          nix build .#nixosConfigurations.muffin.config.system.build.toplevel -L
+
+      - name: Deploy via deploy-rs
+        run: |
+          eval $(ssh-agent -s)
+          ssh-add /run/agenix/ci-deploy-key
+          nix run github:serokell/deploy-rs -- .#muffin --skip-checks --ssh-opts="-o StrictHostKeyChecking=yes -o UserKnownHostsFile=/etc/ci-known-hosts"
+
+      - name: Health check
+        run: |
+          sleep 10
+          ssh -i /run/agenix/ci-deploy-key -o StrictHostKeyChecking=yes -o UserKnownHostsFile=/etc/ci-known-hosts root@server-public \
+            "systemctl is-active gitea && systemctl is-active caddy && systemctl is-active continuwuity && systemctl is-active coturn"
+
+      - name: Notify success
+        if: success()
+        run: |
+          TOPIC=$(cat /run/agenix/ntfy-alerts-topic | tr -d '[:space:]')
+          TOKEN=$(cat /run/agenix/ntfy-alerts-token | tr -d '[:space:]')
+          curl -sf -o /dev/null -X POST \
+            "https://ntfy.sigkill.computer/$TOPIC" \
+            -H "Authorization: Bearer $TOKEN" \
+            -H "Title: [muffin] Deploy succeeded" \
+            -H "Priority: default" \
+            -H "Tags: white_check_mark" \
+            -d "server-config deployed from commit ${GITHUB_SHA::8}"
+
+      - name: Notify failure
+        if: failure()
+        run: |
+          TOPIC=$(cat /run/agenix/ntfy-alerts-topic 2>/dev/null | tr -d '[:space:]')
+          TOKEN=$(cat /run/agenix/ntfy-alerts-token 2>/dev/null | tr -d '[:space:]')
+          curl -sf -o /dev/null -X POST \
+            "https://ntfy.sigkill.computer/$TOPIC" \
+            -H "Authorization: Bearer $TOKEN" \
+            -H "Title: [muffin] Deploy FAILED" \
+            -H "Priority: urgent" \
+            -H "Tags: rotating_light" \
+            -d "server-config deploy failed at commit ${GITHUB_SHA::8}" || true
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -99,7 +99,11 @@ Each service file in `services/` follows this structure:
 - **git-crypt**: `secrets/` directory and `usb-secrets/usb-secrets-key*` are encrypted (see `.gitattributes`)
 - **agenix**: secrets declared in `modules/age-secrets.nix`, decrypted at runtime to `/run/agenix/`
 - **Identity**: USB drive at `/mnt/usb-secrets/usb-secrets-key`
- **Encrypting new secrets**: The agenix encryption key is in `usb-secrets/usb-secrets-key` (SSH private key, git-crypt encrypted). To create a new secret: derive the age public key with `ssh-keygen -y -f usb-secrets/usb-secrets-key | ssh-to-age`, then encrypt with `age -r <public-key> -o secrets/<name>.age`.
+- **Encrypting new secrets**: The agenix identity is an SSH private key at `usb-secrets/usb-secrets-key` (git-crypt encrypted). To encrypt a new secret, use the SSH public key directly with `age -R`:
+  ```bash
+  age -R <(ssh-keygen -y -f usb-secrets/usb-secrets-key) -o secrets/<name>.age /path/to/plaintext
+  ```
+- **DO NOT use `ssh-to-age`**. Using `ssh-to-age` to derive a native age public key and then encrypting with `age -r age1...` produces `X25519` recipient stanzas. The SSH private key identity on the server can only decrypt `ssh-ed25519` stanzas. This mismatch causes `age: error: no identity matched any of the recipients` at deploy time. Always use `age -R` with the SSH public key directly.
 - Never read or commit plaintext secrets. Never log secret values.

 ### Important Patterns
@@ -108,6 +112,7 @@ Each service file in `services/` follows this structure:
 - **Hugepages**: Services needing large pages declare their budget in `service-configs.nix` under `hugepages_2m.services`. The kernel sysctl is set automatically from the total.
 - **Domain**: Primary domain is `sigkill.computer`. Old domain `gardling.com` redirects automatically.
 - **Hardened kernel**: Uses `_hardened` kernel. Security-sensitive defaults apply.
+- **PostgreSQL as central database**: All services that support PostgreSQL MUST use it instead of embedded databases (H2, SQLite, etc.). Connect via Unix socket with peer auth when possible (JDBC services can use junixsocket). The PostgreSQL instance is declared in `services/postgresql.nix` with ZFS-backed storage. Use `ensureDatabases`/`ensureUsers` to auto-create databases and roles.

 ### Test Pattern
 Tests use `pkgs.testers.runNixOSTest` (NixOS VM tests):
--- a/configuration.nix
+++ b/configuration.nix
@@ -20,17 +20,18 @@
    ./modules/no-rgb.nix
    ./modules/security.nix
    ./modules/ntfy-alerts.nix
+    ./modules/power.nix

    ./services/postgresql.nix
-    ./services/jellyfin.nix
-    ./services/caddy.nix
+    ./services/jellyfin
+    ./services/caddy
    ./services/immich.nix
    ./services/gitea.nix
+    ./services/gitea-actions-runner.nix
    ./services/minecraft.nix

    ./services/wg.nix
    ./services/qbittorrent.nix
-    ./services/jellyfin-qbittorrent-monitor.nix
    ./services/bitmagnet.nix

    ./services/arr/prowlarr.nix
@@ -45,21 +46,19 @@

    ./services/soulseek.nix

+    # ./services/llama-cpp.nix
+    ./services/trilium.nix
+
    ./services/ups.nix

+    ./services/grafana
+
    ./services/bitwarden.nix
    ./services/firefox-syncserver.nix

-    ./services/matrix.nix
-    ./services/coturn.nix
-    ./services/livekit.nix
+    ./services/matrix

-    ./services/monero.nix
-    ./services/p2pool.nix
-    ./services/xmrig.nix
-
-    # KEEP UNTIL 2028
-    ./services/caddy_senior_project.nix
+    ./services/monero

    ./services/graphing-calculator.nix

@@ -67,20 +66,28 @@

    ./services/syncthing.nix

-    ./services/ntfy.nix
-    ./services/ntfy-alerts.nix
+    ./services/ntfy

    ./services/mollysocket.nix
+
+    ./services/harmonia.nix
+
+    ./services/ddns-updater.nix
  ];

-  services.kmscon.enable = true;
+  # Hosts entries for CI/CD deploy targets
+  networking.hosts."192.168.1.50" = [ "server-public" ];
+  networking.hosts."192.168.1.223" = [ "desktop" ];

-  systemd.targets = {
-    sleep.enable = false;
-    suspend.enable = false;
-    hibernate.enable = false;
-    hybrid-sleep.enable = false;
-  };
+  # SSH known_hosts for CI runner (pinned host keys)
+  environment.etc."ci-known-hosts".text = ''
+    server-public ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFMjgaMnE+zS7tL+m5E7gh9Q9U1zurLdmU0qcmEmaucu
+    192.168.1.50 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFMjgaMnE+zS7tL+m5E7gh9Q9U1zurLdmU0qcmEmaucu
+    git.sigkill.computer ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFMjgaMnE+zS7tL+m5E7gh9Q9U1zurLdmU0qcmEmaucu
+    git.gardling.com ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFMjgaMnE+zS7tL+m5E7gh9Q9U1zurLdmU0qcmEmaucu
+  '';
+
+  services.kmscon.enable = true;

  # Disable serial getty on ttyS0 to prevent dmesg warnings
  systemd.services."serial-getty@ttyS0".enable = false;
@@ -93,12 +100,6 @@
    enable = false;
  };

-  powerManagement = {
-    powertop.enable = true;
-    enable = true;
-    cpuFreqGovernor = "powersave";
-  };
-
  # https://github.com/NixOS/nixpkgs/issues/101459#issuecomment-758306434
  security.pam.loginLimits = [
    {
@@ -121,14 +122,21 @@
    };
  };

-  hardware.intelgpu.driver = "xe";
+  # Intel Arc A380 (DG2, 56a5) uses the i915 driver on kernel 6.12.
+  # The xe driver's iHD media driver integration has buffer mapping
+  # failures on this GPU/kernel combination. i915 works correctly for
+  # VAAPI transcode as long as ASPM deep states are disabled for the
+  # GPU (see modules/power.nix).
+  hardware.intelgpu.driver = "i915";

  # Per-service 2MB hugepage budget calculated in service-configs.nix.
  boot.kernel.sysctl."vm.nr_hugepages" = service_configs.hugepages_2m.total_pages;

  boot = {
-    # 6.12 LTS until 2026
-    kernelPackages = pkgs.linuxPackages_6_12_hardened;
+    # 6.12 LTS until 2027-03. Kernel 6.18 causes a reproducible ZFS deadlock
+    # in dbuf_evict due to page allocator changes (__free_frozen_pages).
+    # https://github.com/openzfs/zfs/issues/18426
+    kernelPackages = pkgs.linuxPackages_6_12;

    loader = {
      # Use the systemd-boot EFI boot loader.
@@ -249,6 +257,14 @@

  users.groups.${service_configs.media_group} = { };

+  users.users.gitea-runner = {
+    isSystemUser = true;
+    group = "gitea-runner";
+    home = "/var/lib/gitea-runner";
+    description = "Gitea Actions CI runner";
+  };
+  users.groups.gitea-runner = { };
+
  users.users.${username} = {
    isNormalUser = true;
    extraGroups = [
@@ -290,7 +306,8 @@
    enable = true;
    openFirewall = true;
    welcometext = "meow meow meow meow meow :3 xd";
-    password = builtins.readFile ./secrets/murmur_password;
+    password = "$MURMURD_PASSWORD";
+    environmentFile = config.age.secrets.murmur-password-env.path;
    port = service_configs.ports.public.murmur.port;
  };

--- a/flake.lock
+++ b/flake.lock
@@ -27,16 +27,17 @@
    },
    "arr-init": {
      "inputs": {
+        "flake-utils": "flake-utils",
        "nixpkgs": [
          "nixpkgs"
        ]
      },
      "locked": {
-        "lastModified": 1774681523,
-        "narHash": "sha256-K49RohIwbgzVeOdStfVDO83qy5K5ZLKWk4EsHJKj/k4=",
+        "lastModified": 1776401121,
+        "narHash": "sha256-BELV1YMBuLL0aQNQ3SLvSLq8YN5h2o1jcrwz1+Zt32Q=",
        "ref": "refs/heads/main",
-        "rev": "f8475f6cb4d4d4df99002d07cf9583fb33b87876",
-        "revCount": 11,
+        "rev": "6dde2a3e0d087208b8084b61113707c5533c4c2d",
+        "revCount": 19,
        "type": "git",
        "url": "ssh://gitea@git.gardling.com/titaniumtown/arr-init"
      },
@@ -102,6 +103,29 @@
        "type": "github"
      }
    },
+    "fenix": {
+      "inputs": {
+        "nixpkgs": [
+          "qbittorrent-metrics-exporter",
+          "naersk",
+          "nixpkgs"
+        ],
+        "rust-analyzer-src": "rust-analyzer-src"
+      },
+      "locked": {
+        "lastModified": 1752475459,
+        "narHash": "sha256-z6QEu4ZFuHiqdOPbYss4/Q8B0BFhacR8ts6jO/F/aOU=",
+        "owner": "nix-community",
+        "repo": "fenix",
+        "rev": "bf0d6f70f4c9a9cf8845f992105652173f4b617f",
+        "type": "github"
+      },
+      "original": {
+        "owner": "nix-community",
+        "repo": "fenix",
+        "type": "github"
+      }
+    },
    "flake-compat": {
      "flake": false,
      "locked": {
@@ -150,9 +174,45 @@
        "type": "github"
      }
    },
+    "flake-parts": {
+      "inputs": {
+        "nixpkgs-lib": "nixpkgs-lib"
+      },
+      "locked": {
+        "lastModified": 1730504689,
+        "narHash": "sha256-hgmguH29K2fvs9szpq2r3pz2/8cJd2LPS+b4tfNFCwE=",
+        "owner": "hercules-ci",
+        "repo": "flake-parts",
+        "rev": "506278e768c2a08bec68eb62932193e341f55c90",
+        "type": "github"
+      },
+      "original": {
+        "owner": "hercules-ci",
+        "repo": "flake-parts",
+        "type": "github"
+      }
+    },
    "flake-utils": {
      "inputs": {
-        "systems": "systems_4"
+        "systems": "systems_2"
+      },
+      "locked": {
+        "lastModified": 1731533236,
+        "narHash": "sha256-l0KFg5HjrsfsO/JpG+r7fRrqm12kzFHyUHqHCVpMMbI=",
+        "owner": "numtide",
+        "repo": "flake-utils",
+        "rev": "11707dc2f618dd54ca8739b309ec4fc024de578b",
+        "type": "github"
+      },
+      "original": {
+        "owner": "numtide",
+        "repo": "flake-utils",
+        "type": "github"
+      }
+    },
+    "flake-utils_2": {
+      "inputs": {
+        "systems": "systems_6"
      },
      "locked": {
        "lastModified": 1731533236,
@@ -197,11 +257,11 @@
        ]
      },
      "locked": {
-        "lastModified": 1774875830,
-        "narHash": "sha256-WPYlTmZvVa9dWlAziFkVjBdv1Z6giNIq40O1DxsBmiI=",
+        "lastModified": 1775425411,
+        "narHash": "sha256-KY6HsebJHEe5nHOWP7ur09mb0drGxYSzE3rQxy62rJo=",
        "owner": "nix-community",
        "repo": "home-manager",
-        "rev": "7afd8cebb99e25a64a745765920e663478eb8830",
+        "rev": "0d02ec1d0a05f88ef9e74b516842900c41f0f2fe",
        "type": "github"
      },
      "original": {
@@ -263,11 +323,11 @@
        "rust-overlay": "rust-overlay"
      },
      "locked": {
-        "lastModified": 1774858933,
-        "narHash": "sha256-rgHUoE4QhOvK3Rcl9cbuIVdjPjFjfhcTm/uPs8Y7+2w=",
+        "lastModified": 1776248416,
+        "narHash": "sha256-TC6yzbCAex1pDfqUZv9u8fVm8e17ft5fNrcZ0JRDOIQ=",
        "owner": "nix-community",
        "repo": "lanzaboote",
-        "rev": "45338aab3013924c75305f5cb3543b9cda993183",
+        "rev": "18e9e64bae15b828c092658335599122a6db939b",
        "type": "github"
      },
      "original": {
@@ -276,20 +336,62 @@
        "type": "github"
      }
    },
+    "llamacpp": {
+      "inputs": {
+        "flake-parts": "flake-parts",
+        "nixpkgs": [
+          "nixpkgs"
+        ]
+      },
+      "locked": {
+        "lastModified": 1776301820,
+        "narHash": "sha256-Yr3JRZ05PNmX4sR2Ak7e0jT+oCQgTAAML7FUoyTmitk=",
+        "owner": "TheTom",
+        "repo": "llama-cpp-turboquant",
+        "rev": "1073622985bb68075472474b4b0fdfcdabcfc9d0",
+        "type": "github"
+      },
+      "original": {
+        "owner": "TheTom",
+        "ref": "feature/turboquant-kv-cache",
+        "repo": "llama-cpp-turboquant",
+        "type": "github"
+      }
+    },
+    "naersk": {
+      "inputs": {
+        "fenix": "fenix",
+        "nixpkgs": "nixpkgs_2"
+      },
+      "locked": {
+        "lastModified": 1763384566,
+        "narHash": "sha256-r+wgI+WvNaSdxQmqaM58lVNvJYJ16zoq+tKN20cLst4=",
+        "owner": "nix-community",
+        "repo": "naersk",
+        "rev": "d4155d6ebb70fbe2314959842f744aa7cabbbf6a",
+        "type": "github"
+      },
+      "original": {
+        "owner": "nix-community",
+        "ref": "master",
+        "repo": "naersk",
+        "type": "github"
+      }
+    },
    "nix-minecraft": {
      "inputs": {
        "flake-compat": "flake-compat_3",
        "nixpkgs": [
          "nixpkgs"
        ],
-        "systems": "systems_3"
+        "systems": "systems_4"
      },
      "locked": {
-        "lastModified": 1774896109,
-        "narHash": "sha256-0ue9vbpiLP5ZHZd5e7eQAplM9Rb4tunV+u5xcJDJ+lc=",
+        "lastModified": 1776310483,
+        "narHash": "sha256-xMFl+umxGmo5VEgcZcXT5Dk9sXU5WyTRz1Olpywr/60=",
        "owner": "Infinidoge",
        "repo": "nix-minecraft",
-        "rev": "cb646a9e33cfa6b7d5facd9b7d4ca738ad1ff953",
+        "rev": "74abd91054e2655d6c392428a27e5d27edd5e6bf",
        "type": "github"
      },
      "original": {
@@ -300,11 +402,11 @@
    },
    "nixos-hardware": {
      "locked": {
-        "lastModified": 1774777275,
-        "narHash": "sha256-qogBiYFq8hZusDPeeKRqzelBAhZvREc7Cl+qlewGUCg=",
+        "lastModified": 1775490113,
+        "narHash": "sha256-2ZBhDNZZwYkRmefK5XLOusCJHnoeKkoN95hoSGgMxWM=",
        "owner": "NixOS",
        "repo": "nixos-hardware",
-        "rev": "b8f81636927f1af0cca812d22c876bad0a883ccd",
+        "rev": "c775c2772ba56e906cbeb4e0b2db19079ef11ff7",
        "type": "github"
      },
      "original": {
@@ -316,11 +418,11 @@
    },
    "nixpkgs": {
      "locked": {
-        "lastModified": 1774388614,
-        "narHash": "sha256-tFwzTI0DdDzovdE9+Ras6CUss0yn8P9XV4Ja6RjA+nU=",
+        "lastModified": 1776221942,
+        "narHash": "sha256-FbQAeVNi7G4v3QCSThrSAAvzQTmrmyDLiHNPvTF2qFM=",
        "owner": "NixOS",
        "repo": "nixpkgs",
-        "rev": "1073dad219cb244572b74da2b20c7fe39cb3fa9e",
+        "rev": "1766437c5509f444c1b15331e82b8b6a9b967000",
        "type": "github"
      },
      "original": {
@@ -330,6 +432,18 @@
        "type": "github"
      }
    },
+    "nixpkgs-lib": {
+      "locked": {
+        "lastModified": 1730504152,
+        "narHash": "sha256-lXvH/vOfb4aGYyvFmZK/HlsNsr/0CVWlwYvo2rxJk3s=",
+        "type": "tarball",
+        "url": "https://github.com/NixOS/nixpkgs/archive/cc2f28000298e1269cea6612cd06ec9979dd5d7f.tar.gz"
+      },
+      "original": {
+        "type": "tarball",
+        "url": "https://github.com/NixOS/nixpkgs/archive/cc2f28000298e1269cea6612cd06ec9979dd5d7f.tar.gz"
+      }
+    },
    "nixpkgs-p2pool-module": {
      "flake": false,
      "locked": {
@@ -348,6 +462,22 @@
      }
    },
    "nixpkgs_2": {
+      "locked": {
+        "lastModified": 1752077645,
+        "narHash": "sha256-HM791ZQtXV93xtCY+ZxG1REzhQenSQO020cu6rHtAPk=",
+        "owner": "NixOS",
+        "repo": "nixpkgs",
+        "rev": "be9e214982e20b8310878ac2baa063a961c1bdf6",
+        "type": "github"
+      },
+      "original": {
+        "owner": "NixOS",
+        "ref": "nixpkgs-unstable",
+        "repo": "nixpkgs",
+        "type": "github"
+      }
+    },
+    "nixpkgs_3": {
      "locked": {
        "lastModified": 1764517877,
        "narHash": "sha256-pp3uT4hHijIC8JUK5MEqeAWmParJrgBVzHLNfJDZxg4=",
@@ -386,6 +516,28 @@
        "type": "github"
      }
    },
+    "qbittorrent-metrics-exporter": {
+      "inputs": {
+        "naersk": "naersk",
+        "nixpkgs": [
+          "nixpkgs"
+        ],
+        "systems": "systems_5"
+      },
+      "locked": {
+        "lastModified": 1771989937,
+        "narHash": "sha256-bPUV4gVvSbF4VMkbLKYrfwVwzTeS+Sr41wucDj1///g=",
+        "ref": "refs/heads/main",
+        "rev": "cb94f866b7a2738532b1cae31d0b9f89adecbd54",
+        "revCount": 112,
+        "type": "git",
+        "url": "https://codeberg.org/anriha/qbittorrent-metrics-exporter"
+      },
+      "original": {
+        "type": "git",
+        "url": "https://codeberg.org/anriha/qbittorrent-metrics-exporter"
+      }
+    },
    "root": {
      "inputs": {
        "agenix": "agenix",
@@ -395,10 +547,12 @@
        "home-manager": "home-manager",
        "impermanence": "impermanence",
        "lanzaboote": "lanzaboote",
+        "llamacpp": "llamacpp",
        "nix-minecraft": "nix-minecraft",
        "nixos-hardware": "nixos-hardware",
        "nixpkgs": "nixpkgs",
        "nixpkgs-p2pool-module": "nixpkgs-p2pool-module",
+        "qbittorrent-metrics-exporter": "qbittorrent-metrics-exporter",
        "senior_project-website": "senior_project-website",
        "srvos": "srvos",
        "trackerlist": "trackerlist",
@@ -407,6 +561,23 @@
        "ytbn-graphing-software": "ytbn-graphing-software"
      }
    },
+    "rust-analyzer-src": {
+      "flake": false,
+      "locked": {
+        "lastModified": 1752428706,
+        "narHash": "sha256-EJcdxw3aXfP8Ex1Nm3s0awyH9egQvB2Gu+QEnJn2Sfg=",
+        "owner": "rust-lang",
+        "repo": "rust-analyzer",
+        "rev": "591e3b7624be97e4443ea7b5542c191311aa141d",
+        "type": "github"
+      },
+      "original": {
+        "owner": "rust-lang",
+        "ref": "nightly",
+        "repo": "rust-analyzer",
+        "type": "github"
+      }
+    },
    "rust-overlay": {
      "inputs": {
        "nixpkgs": [
@@ -452,11 +623,11 @@
    "senior_project-website": {
      "flake": false,
      "locked": {
-        "lastModified": 1771869552,
-        "narHash": "sha256-veaVrRWCSy7HYAAjUFLw8HASKcj+3f0W+sCwS3QiaM4=",
+        "lastModified": 1775019649,
+        "narHash": "sha256-zVQy5ydiWKnIixf79pmd2LJTPkwyiv4V5piKZETDdwI=",
        "owner": "Titaniumtown",
        "repo": "senior-project-website",
-        "rev": "28a2b93492dac877dce0b38f078eacf74fce26e7",
+        "rev": "bfd504c77c90524b167158652e1d87a260680120",
        "type": "github"
      },
      "original": {
@@ -472,11 +643,11 @@
        ]
      },
      "locked": {
-        "lastModified": 1774517972,
-        "narHash": "sha256-oPIVzGlMmfWuJlRbr87yU3cnV8NxtwTG92GqpQczlkw=",
+        "lastModified": 1776306894,
+        "narHash": "sha256-l4N3O1cfXiQCHJGspAkg6WlZyOFBTbLXhi8Anf8jB0g=",
        "owner": "nix-community",
        "repo": "srvos",
-        "rev": "0ddba2fbd72bb60f8b35b7de1ad67590f454d402",
+        "rev": "01d98209264c78cb323b636d7ab3fe8e7a8b60c7",
        "type": "github"
      },
      "original": {
@@ -545,14 +716,44 @@
        "type": "github"
      }
    },
+    "systems_5": {
+      "locked": {
+        "lastModified": 1681028828,
+        "narHash": "sha256-Vy1rq5AaRuLzOxct8nz4T6wlgyUR7zLU309k9mBC768=",
+        "owner": "nix-systems",
+        "repo": "default",
+        "rev": "da67096a3b9bf56a91d16901293e51ba5b49a27e",
+        "type": "github"
+      },
+      "original": {
+        "owner": "nix-systems",
+        "repo": "default",
+        "type": "github"
+      }
+    },
+    "systems_6": {
+      "locked": {
+        "lastModified": 1681028828,
+        "narHash": "sha256-Vy1rq5AaRuLzOxct8nz4T6wlgyUR7zLU309k9mBC768=",
+        "owner": "nix-systems",
+        "repo": "default",
+        "rev": "da67096a3b9bf56a91d16901293e51ba5b49a27e",
+        "type": "github"
+      },
+      "original": {
+        "owner": "nix-systems",
+        "repo": "default",
+        "type": "github"
+      }
+    },
    "trackerlist": {
      "flake": false,
      "locked": {
-        "lastModified": 1774822185,
-        "narHash": "sha256-vEVjI/PWBfxLd1dzlo1MUSqeH5J33OLU7a5dfuzOKu4=",
+        "lastModified": 1776290985,
+        "narHash": "sha256-eNWDOLBA0vk1TiKqse71siIAgLycjvBFDw35eAtnUPs=",
        "owner": "ngosang",
        "repo": "trackerslist",
-        "rev": "c77ccbef4be871a37dcbd3465d62bb8b5c7bc025",
+        "rev": "9bb380b3c2a641a3289f92dedef97016f2e47f36",
        "type": "github"
      },
      "original": {
@@ -563,7 +764,7 @@
    },
    "utils": {
      "inputs": {
-        "systems": "systems_2"
+        "systems": "systems_3"
      },
      "locked": {
        "lastModified": 1731533236,
@@ -612,8 +813,8 @@
    },
    "ytbn-graphing-software": {
      "inputs": {
-        "flake-utils": "flake-utils",
-        "nixpkgs": "nixpkgs_2",
+        "flake-utils": "flake-utils_2",
+        "nixpkgs": "nixpkgs_3",
        "rust-overlay": "rust-overlay_2"
      },
      "locked": {
--- a/flake.nix
+++ b/flake.nix
@@ -28,6 +28,11 @@
      inputs.nixpkgs.follows = "nixpkgs";
    };

+    llamacpp = {
+      url = "github:TheTom/llama-cpp-turboquant/feature/turboquant-kv-cache";
+      inputs.nixpkgs.follows = "nixpkgs";
+    };
+
    srvos = {
      url = "github:nix-community/srvos";
      inputs.nixpkgs.follows = "nixpkgs";
@@ -78,6 +83,11 @@
      url = "github:JacoMalan1/nixpkgs/create-p2pool-service";
      flake = false;
    };
+
+    qbittorrent-metrics-exporter = {
+      url = "git+https://codeberg.org/anriha/qbittorrent-metrics-exporter";
+      inputs.nixpkgs.follows = "nixpkgs";
+    };
  };

  outputs =
@@ -113,7 +123,7 @@
        name = "nixpkgs-patched";
        src = nixpkgs;
        patches = [
-          ./patches/0001-firefox-syncserver-add-postgresql-backend-support.patch
+          ./patches/nixpkgs/0001-firefox-syncserver-add-postgresql-backend-support.patch
        ];
      };

--- a/modules/age-secrets.nix
+++ b/modules/age-secrets.nix
@@ -46,6 +46,22 @@
      group = "caddy";
    };

+    # Njalla API token (NJALLA_API_TOKEN=...) for Caddy DNS-01 challenge
+    njalla-api-token-env = {
+      file = ../secrets/njalla-api-token-env.age;
+      mode = "0400";
+      owner = "caddy";
+      group = "caddy";
+    };
+
+    # ddns-updater config.json with Njalla provider credentials
+    ddns-updater-config = {
+      file = ../secrets/ddns-updater-config.age;
+      mode = "0400";
+      owner = "ddns-updater";
+      group = "ddns-updater";
+    };
+
    jellyfin-api-key = {
      file = ../secrets/jellyfin-api-key.age;
      mode = "0400";
@@ -68,19 +84,19 @@
      group = "root";
    };

-    # ntfy-alerts secrets
+    # ntfy-alerts secrets (group-readable for CI runner notifications)
    ntfy-alerts-topic = {
      file = ../secrets/ntfy-alerts-topic.age;
-      mode = "0400";
+      mode = "0440";
      owner = "root";
-      group = "root";
+      group = "gitea-runner";
    };

    ntfy-alerts-token = {
      file = ../secrets/ntfy-alerts-token.age;
-      mode = "0400";
+      mode = "0440";
      owner = "root";
-      group = "root";
+      group = "gitea-runner";
    };

    # Firefox Sync server secrets (SYNC_MASTER_SECRET)
@@ -94,5 +110,94 @@
      file = ../secrets/mollysocket-env.age;
      mode = "0400";
    };
+
+    # Murmur (Mumble) server password
+    murmur-password-env = {
+      file = ../secrets/murmur-password-env.age;
+      mode = "0400";
+      owner = "murmur";
+      group = "murmur";
+    };
+
+    # Coturn static auth secret
+    coturn-auth-secret = {
+      file = ../secrets/coturn-auth-secret.age;
+      mode = "0400";
+      owner = "turnserver";
+      group = "turnserver";
+    };
+
+    # Matrix (continuwuity) registration token
+    matrix-reg-token = {
+      file = ../secrets/matrix-reg-token.age;
+      mode = "0400";
+      owner = "continuwuity";
+      group = "continuwuity";
+    };
+
+    # Matrix (continuwuity) TURN secret — same secret as coturn-auth-secret,
+    # decrypted separately so continuwuity can read it with its own ownership
+    matrix-turn-secret = {
+      file = ../secrets/coturn-auth-secret.age;
+      mode = "0400";
+      owner = "continuwuity";
+      group = "continuwuity";
+    };
+
+    # CI deploy SSH key
+    ci-deploy-key = {
+      file = ../secrets/ci-deploy-key.age;
+      mode = "0400";
+      owner = "gitea-runner";
+      group = "gitea-runner";
+    };
+
+    # Git-crypt symmetric key for dotfiles repo
+    git-crypt-key-dotfiles = {
+      file = ../secrets/git-crypt-key-dotfiles.age;
+      mode = "0400";
+      owner = "gitea-runner";
+      group = "gitea-runner";
+    };
+
+    # Git-crypt symmetric key for server-config repo
+    git-crypt-key-server-config = {
+      file = ../secrets/git-crypt-key-server-config.age;
+      mode = "0400";
+      owner = "gitea-runner";
+      group = "gitea-runner";
+    };
+
+    # Gitea Actions runner registration token
+    gitea-runner-token = {
+      file = ../secrets/gitea-runner-token.age;
+      mode = "0400";
+      owner = "gitea-runner";
+      group = "gitea-runner";
+    };
+
+    # llama-cpp API key for bearer token auth
+    llama-cpp-api-key = {
+      file = ../secrets/llama-cpp-api-key.age;
+      mode = "0400";
+      owner = "root";
+      group = "root";
+    };
+
+    # Harmonia binary cache signing key
+    harmonia-sign-key = {
+      file = ../secrets/harmonia-sign-key.age;
+      mode = "0400";
+      owner = "harmonia";
+      group = "harmonia";
+    };
+
+    # Caddy basic auth for nix binary cache (separate from main caddy_auth)
+    nix-cache-auth = {
+      file = ../secrets/nix-cache-auth.age;
+      mode = "0400";
+      owner = "caddy";
+      group = "caddy";
+    };
  };
 }
--- a/modules/hardware.nix
+++ b/modules/hardware.nix
@@ -5,6 +5,20 @@
  service_configs,
  ...
 }:
+let
+  hddTuneIosched = pkgs.writeShellScript "hdd-tune-iosched" ''
+    # Called by udev with the partition kernel name (e.g. sdb1).
+    # Derives the parent disk and applies mq-deadline iosched params.
+    parent=''${1%%[0-9]*}
+    dev="/sys/block/$parent"
+    [ -d "$dev/queue/iosched" ] || exit 0
+    echo 500 > "$dev/queue/iosched/read_expire"
+    echo 15000 > "$dev/queue/iosched/write_expire"
+    echo 128 > "$dev/queue/iosched/fifo_batch"
+    echo 16 > "$dev/queue/iosched/writes_starved"
+    echo 4096 > "$dev/queue/max_sectors_kb" 2>/dev/null || true
+  '';
+in
 {
  boot.initrd.availableKernelModules = [
    "xhci_pci"
@@ -22,56 +36,27 @@
  hardware.cpu.amd.updateMicrocode = true;
  hardware.enableRedistributableFirmware = true;

-  # HDD I/O tuning for torrent seeding workload (high-concurrency random reads).
+  # HDD I/O tuning for torrent seeding workload (high-concurrency random reads)
+  # sharing the pool with latency-sensitive sequential reads (Jellyfin playback).
  #
  # mq-deadline sorts requests into elevator sweeps, reducing seek distance.
-  # Aggressive deadlines (15s) let the scheduler accumulate more ops before dispatching,
-  # maximizing coalescence — latency is irrelevant since torrent peers tolerate 30-60s.
+  # read_expire=500ms keeps reads bounded so a Jellyfin segment can't queue for
+  # seconds behind a torrent burst; write_expire=15s lets the scheduler batch
+  # writes for coalescence (torrent writes are async and tolerate delay).
+  # The bulk of read coalescence already happens above the scheduler via ZFS
+  # aggregation (zfs_vdev_aggregation_limit=4M, read_gap_limit=128K,
+  # async_read_max=32), so the scheduler deadline only needs to be large enough
+  # to keep the elevator sweep coherent -- 500ms is plenty on rotational disks.
  # fifo_batch=128 keeps sweeps long; writes_starved=16 heavily favors reads.
  # 4 MiB readahead matches libtorrent piece extent affinity for sequential prefetch.
  #
-  # This runs as a systemd oneshot rather than udev rules because the NixOS ZFS module
-  # hardcodes a udev rule that forces scheduler="none" on all ZFS member partitions'
-  # parent disks, overriding any scheduler set via udev on the disk event.
-  systemd.services.hdd-io-tuning = {
-    description = "HDD I/O scheduler and queue tuning";
-    after = [
-      "zfs-import.target"
-      "systemd-udev-settle.service"
-    ];
-    wantedBy = [ "multi-user.target" ];
-    serviceConfig = {
-      Type = "oneshot";
-      RemainAfterExit = true;
-    };
-    path = with pkgs; [
-      coreutils
-      gawk
-      zfs
-    ];
-    script = ''
-      # Only tune disks in the hdds pool — not all rotational disks.
-      # zpool status gives by-id device names; we resolve to /sys/block/<name>.
-      zpool status hdds | awk '/^\t/ && $1 ~ /^(ata-|nvme-|scsi-)/ {print $1}' | while read -r id; do
-        link="/dev/disk/by-id/$id"
-        [ -L "$link" ] || continue
-        name=$(basename "$(readlink -f "$link")")
-        dev="/sys/block/$name"
-        [ -d "$dev" ] || continue
-
-        echo mq-deadline > "$dev/queue/scheduler"
-        echo 4096 > "$dev/queue/read_ahead_kb"
-        echo 512 > "$dev/queue/nr_requests"
-
-        echo 15000 > "$dev/queue/iosched/read_expire"
-        echo 15000 > "$dev/queue/iosched/write_expire"
-        echo 128 > "$dev/queue/iosched/fifo_batch"
-        echo 16 > "$dev/queue/iosched/writes_starved"
-
-        echo 4096 > "$dev/queue/max_sectors_kb" 2>/dev/null || true
-
-        echo "Tuned $id -> $name: mq-deadline, 4M readahead, 15s deadlines"
-      done
-    '';
-  };
+  # The NixOS ZFS module hardcodes a udev rule that forces scheduler="none" on all
+  # ZFS member partitions' parent disks (on both add AND change events). We counter
+  # it with lib.mkAfter so our rule appears after theirs in 99-local.rules — our
+  # rule matches the same partition events and sets mq-deadline back, then a RUN
+  # script applies the iosched params. Only targets rotational, non-removable disks
+  # (i.e. HDDs, not SSDs or USB).
+  services.udev.extraRules = lib.mkAfter ''
+    ACTION=="add|change", KERNEL=="sd[a-z]*[0-9]*", ENV{ID_FS_TYPE}=="zfs_member", ATTR{../queue/rotational}=="1", ATTR{../removable}=="0", ATTR{../queue/scheduler}="mq-deadline", ATTR{../queue/read_ahead_kb}="4096", ATTR{../queue/nr_requests}="512", RUN+="${hddTuneIosched} %k"
+  '';
 }
--- a/modules/impermanence.nix
+++ b/modules/impermanence.nix
@@ -24,6 +24,7 @@
      # ZFS cache directory - persisting the directory instead of the file
      # avoids "device busy" errors when ZFS atomically updates the cache
      "/etc/zfs"
+      "/var/lib/gitea-runner"
    ];

    files = [
--- a/modules/lib.nix
+++ b/modules/lib.nix
@@ -10,20 +10,16 @@ inputs.nixpkgs.lib.extend (
    lib = prev;
  in
  {
-    # stolen from: https://stackoverflow.com/a/42398526
    optimizeWithFlags =
      pkg: flags:
-      lib.overrideDerivation pkg (
-        old:
-        let
-          newflags = lib.foldl' (acc: x: "${acc} ${x}") "" flags;
-          oldflags = if (lib.hasAttr "NIX_CFLAGS_COMPILE" old) then "${old.NIX_CFLAGS_COMPILE}" else "";
-        in
-        {
-          NIX_CFLAGS_COMPILE = "${oldflags} ${newflags}";
-          # stdenv = pkgs.clang19Stdenv;
-        }
-      );
+      pkg.overrideAttrs (old: {
+        env = (old.env or { }) // {
+          NIX_CFLAGS_COMPILE =
+            (old.env.NIX_CFLAGS_COMPILE or old.NIX_CFLAGS_COMPILE or "")
+            + " "
+            + (lib.concatStringsSep " " flags);
+        };
+      });

    optimizePackage =
      pkg:
@@ -63,8 +59,12 @@ inputs.nixpkgs.lib.extend (
      { pkgs, config, ... }:
      {
        systemd.services."${serviceName}-mounts" = {
-          wants = [ "zfs.target" ] ++ lib.optionals (zpool != "") [ "zfs-import-${zpool}.service" ];
-          after = lib.optionals (zpool != "") [ "zfs-import-${zpool}.service" ];
+          wants = [
+            "zfs.target"
+            "zfs-mount.service"
+          ]
+          ++ lib.optionals (zpool != "") [ "zfs-import-${zpool}.service" ];
+          after = [ "zfs-mount.service" ] ++ lib.optionals (zpool != "") [ "zfs-import-${zpool}.service" ];
          before = [ "${serviceName}.service" ];

          serviceConfig = {
@@ -180,5 +180,108 @@ inputs.nixpkgs.lib.extend (
          after = [ "${serviceName}-file-perms.service" ];
        };
      };
+    # Creates a Caddy virtualHost with reverse_proxy to a local or VPN-namespaced port.
+    # Use `subdomain` for "<name>.${domain}" or `domain` for a full custom domain.
+    # Exactly one of `subdomain` or `domain` must be provided.
+    mkCaddyReverseProxy =
+      {
+        subdomain ? null,
+        domain ? null,
+        port,
+        auth ? false,
+        vpn ? false,
+      }:
+      assert (subdomain != null) != (domain != null);
+      { config, ... }:
+      let
+        vhostDomain = if domain != null then domain else "${subdomain}.${service_configs.https.domain}";
+        upstream =
+          if vpn then
+            "${config.vpnNamespaces.wg.namespaceAddress}:${builtins.toString port}"
+          else
+            ":${builtins.toString port}";
+      in
+      {
+        services.caddy.virtualHosts."${vhostDomain}".extraConfig = lib.concatStringsSep "\n" (
+          lib.optional auth "import ${config.age.secrets.caddy_auth.path}" ++ [ "reverse_proxy ${upstream}" ]
+        );
+      };
+
+    # Creates a fail2ban jail with systemd journal backend.
+    # Covers the common pattern: journal-based detection, http/https ports, default thresholds.
+    mkFail2banJail =
+      {
+        name,
+        unitName ? "${name}.service",
+        failregex,
+      }:
+      { ... }:
+      {
+        services.fail2ban.jails.${name} = {
+          enabled = true;
+          settings = {
+            backend = "systemd";
+            port = "http,https";
+            # defaults: maxretry=5, findtime=10m, bantime=10m
+          };
+          filter.Definition = {
+            inherit failregex;
+            ignoreregex = "";
+            journalmatch = "_SYSTEMD_UNIT=${unitName}";
+          };
+        };
+      };
+
+    # Creates a hardened Grafana annotation daemon service.
+    # Provides DynamicUser, sandboxing, state directory, and GRAFANA_URL/STATE_FILE automatically.
+    mkGrafanaAnnotationService =
+      {
+        name,
+        description,
+        script,
+        after ? [ ],
+        environment ? { },
+        loadCredential ? null,
+      }:
+      {
+        systemd.services."${name}-annotations" = {
+          inherit description;
+          after = [
+            "network.target"
+            "grafana.service"
+          ]
+          ++ after;
+          wantedBy = [ "multi-user.target" ];
+          serviceConfig = {
+            ExecStart = "${pkgs.python3}/bin/python3 ${script}";
+            Restart = "always";
+            RestartSec = "10s";
+            DynamicUser = true;
+            StateDirectory = "${name}-annotations";
+            NoNewPrivileges = true;
+            ProtectSystem = "strict";
+            ProtectHome = true;
+            PrivateTmp = true;
+            RestrictAddressFamilies = [
+              "AF_INET"
+              "AF_INET6"
+            ];
+            MemoryDenyWriteExecute = true;
+          }
+          // lib.optionalAttrs (loadCredential != null) {
+            LoadCredential = loadCredential;
+          };
+          environment = {
+            GRAFANA_URL = "http://127.0.0.1:${toString service_configs.ports.private.grafana.port}";
+            STATE_FILE = "/var/lib/${name}-annotations/state.json";
+          }
+          // environment;
+        };
+      };
+
+    # Shell command to extract an API key from an *arr config.xml file.
+    # Returns a string suitable for $() command substitution in shell scripts.
+    extractArrApiKey =
+      configXmlPath: "${lib.getExe pkgs.gnugrep} -oP '(?<=<ApiKey>)[^<]+' ${configXmlPath}";
  }
 )
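A minimal usage sketch for the helpers added above, assuming the extended `lib` reaches modules via `specialArgs` (so it is usable inside `imports`). The service name `example`, port `8080`, and the `failregex` pattern are hypothetical placeholders, not values from this repository.

```nix
# Hypothetical service module wiring up two of the new lib helpers.
{ lib, ... }:
{
  imports = [
    # Exactly one of `subdomain` / `domain` may be set (enforced by the assert).
    (lib.mkCaddyReverseProxy {
      subdomain = "example"; # placeholder: becomes "example.<https.domain>"
      port = 8080; # placeholder upstream port
      auth = true; # prepends the caddy_auth import to the vhost config
    })

    # unitName defaults to "<name>.service".
    (lib.mkFail2banJail {
      name = "example";
      # placeholder pattern; fail2ban requires a <HOST> capture
      failregex = "^.*Authentication failed .* from <HOST>$";
    })
  ];
}
```

Both helpers return NixOS modules, which is why they go in `imports` rather than being merged into the attribute set directly.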
--- a/modules/overlays.nix
+++ b/modules/overlays.nix
@@ -43,4 +43,36 @@ final: prev: {
      }
    );
  };
+
+  jellyfin-exporter = prev.buildGoModule rec {
+    pname = "jellyfin-exporter";
+    version = "unstable-2025-03-27";
+    src = prev.fetchFromGitHub {
+      owner = "rebelcore";
+      repo = "jellyfin_exporter";
+      rev = "8e3970cb1bdf3cb21fac099c13072bb7c1b20cf9";
+      hash = "sha256-wDnhepYj1MyLRZlwKfmwf4xiEEL3mgQY6V+7TnBd0MY=";
+    };
+    vendorHash = "sha256-e08u10e/wNapNZSsD/fGVN9ybMHe3sW0yDIOqI8ZcYs=";
+    # upstream tests require a running Jellyfin instance
+    doCheck = false;
+    meta.mainProgram = "jellyfin_exporter";
+  };
+
+  igpu-exporter = prev.buildGoModule rec {
+    pname = "igpu-exporter";
+    version = "unstable-2025-03-27";
+    src = prev.fetchFromGitHub {
+      owner = "mike1808";
+      repo = "igpu-exporter";
+      rev = "db2dace1a895c2b950f6d3ba1a2e46729251d124";
+      hash = "sha256-xWTiu26UzTZIK/6jeda+x6VePUgoWTS0AekejFdgFWs=";
+    };
+    vendorHash = "sha256-oeCSKwDKVwvYQ1fjXXTwQSXNl/upDE3WAAk680vqh3U=";
+    subPackages = [ "cmd" ];
+    postInstall = ''
+      mv $out/bin/cmd $out/bin/igpu-exporter
+    '';
+    meta.mainProgram = "igpu-exporter";
+  };
 }
--- a/modules/power.nix
+++ b/modules/power.nix
@@ -0,0 +1,41 @@
+{
+  ...
+}:
+{
+  powerManagement = {
+    enable = true;
+    cpuFreqGovernor = "powersave";
+  };
+
+  # Always-on server: disable all sleep targets.
+  systemd.targets = {
+    sleep.enable = false;
+    suspend.enable = false;
+    hibernate.enable = false;
+    hybrid-sleep.enable = false;
+  };
+
+  boot.kernelParams = [
+    # Disable NMI watchdog at boot. Eliminates periodic perf-counter interrupts
+    # across all cores (~1 W). Safe: apcupsd provides hardware hang detection
+    # via UPS, and softlockup watchdog remains active.
+    "nmi_watchdog=0"
+
+    # Route kernel work items to already-busy CPUs rather than waking idle ones.
+    # Reduces C-state exit frequency at the cost of slightly higher latency on
+    # work items -- irrelevant for a server whose latency-sensitive paths are
+    # all in userspace (caddy, jellyfin).
+    "workqueue.power_efficient=1"
+  ];
+
+  boot.kernel.sysctl = {
+    # Belt-and-suspenders: also set via boot param, but sysctl ensures it
+    # stays off if anything re-enables it at runtime.
+    "kernel.nmi_watchdog" = 0;
+  };
+
+  # Server has no audio consumers. Power-gate the HDA codec at module load.
+  boot.extraModprobeConfig = ''
+    options snd_hda_intel power_save=1 power_save_controller=Y
+  '';
+}
--- a/modules/security.nix
+++ b/modules/security.nix
@@ -13,6 +13,89 @@
  # disable coredumps
  systemd.coredump.enable = false;

+  # Needed for Nix sandbox UID/GID mapping inside derivation builds.
+  # See https://github.com/NixOS/nixpkgs/issues/287194
+  security.unprivilegedUsernsClone = true;
+
+  # Disable kexec to prevent replacing the running kernel at runtime.
+  security.protectKernelImage = true;
+
+  # Kernel hardening boot parameters. These recover most of the runtime-
+  # configurable protections that the linux-hardened patchset provided.
+  boot.kernelParams = [
+    # Zero page and slab allocator memory on alloc and on free. Prevents info
+    # leaks and use-after-free from seeing stale data. Modest CPU overhead.
+    "init_on_alloc=1"
+    "init_on_free=1"
+
+    # Prevent SLUB allocator from merging caches with similar size/flags.
+    # Keeps different kernel object types in separate slabs, making heap
+    # exploitation (type confusion, spray, use-after-free) significantly harder.
+    "slab_nomerge"
+
+    # Randomize order of pages returned by the buddy allocator.
+    "page_alloc.shuffle=1"
+
+    # Disable debugfs entirely (exposes kernel internals).
+    "debugfs=off"
+
+    # Disable legacy vsyscall emulation (unused by any modern glibc).
+    "vsyscall=none"
+
+    # Strict IOMMU TLB invalidation (no batching). Prevents DMA-capable
+    # devices from accessing stale mappings after unmap.
+    "iommu.strict=1"
+  ];
+
+  boot.kernel.sysctl = {
+    # Reboot immediately on kernel panic (don't leave a compromised
+    # kernel running). A negative value means reboot without delay.
+    "kernel.panic" = -1;
+
+    # Hide kernel pointers from all processes, including CAP_SYSLOG.
+    # Prevents info leaks used to defeat KASLR.
+    "kernel.kptr_restrict" = 2;
+
+    # Disable bpf() JIT compiler (eliminates JIT spray attack vector).
+    "net.core.bpf_jit_enable" = false;
+
+    # Disable ftrace (kernel function tracer) at runtime.
+    "kernel.ftrace_enabled" = false;
+
+    # Strict reverse-path filtering: drop packets arriving on an interface
+    # where the source address isn't routable back via that interface.
+    "net.ipv4.conf.all.rp_filter" = 1;
+    "net.ipv4.conf.default.rp_filter" = 1;
+    "net.ipv4.conf.all.log_martians" = true;
+    "net.ipv4.conf.default.log_martians" = true;
+
+    # Ignore ICMP redirects (prevents route table poisoning).
+    "net.ipv4.conf.all.accept_redirects" = false;
+    "net.ipv4.conf.all.secure_redirects" = false;
+    "net.ipv4.conf.default.accept_redirects" = false;
+    "net.ipv4.conf.default.secure_redirects" = false;
+    "net.ipv6.conf.all.accept_redirects" = false;
+    "net.ipv6.conf.default.accept_redirects" = false;
+
+    # Don't send ICMP redirects (we are not a router).
+    "net.ipv4.conf.all.send_redirects" = false;
+    "net.ipv4.conf.default.send_redirects" = false;
+
+    # Ignore broadcast ICMP (SMURF amplification mitigation).
+    "net.ipv4.icmp_echo_ignore_broadcasts" = true;
+
+    # Filesystem hardening: prevent hardlink/symlink-based attacks.
+    # protected_hardlinks blocks hardlinking files the user doesn't own; protected_symlinks
+    # restricts following symlinks in sticky world-writable dirs (both curb TOCTOU escalation).
+    # protected_fifos/regular (level 2): restrict opening FIFOs and regular files
+    # in world-writable sticky directories to owner/group match only.
+    # Also required for systemd-tmpfiles to chmod hardlinked files.
+    "fs.protected_hardlinks" = true;
+    "fs.protected_symlinks" = true;
+    "fs.protected_fifos" = 2;
+    "fs.protected_regular" = 2;
+  };
+
  services = {
    dbus.implementation = "broker";
    /*
--- a/modules/zfs.nix
+++ b/modules/zfs.nix
@@ -1,15 +1,39 @@
 {
  config,
+  lib,
  service_configs,
  pkgs,
  ...
 }:
+let
+  # Total RAM in bytes (from /proc/meminfo: 65775836 KiB).
+  totalRamBytes = 65775836 * 1024;
+
+  # Hugepage reservations that the kernel carves out before ZFS can use them.
+  hugepages2mBytes = service_configs.hugepages_2m.total_pages * 2 * 1024 * 1024;
+  hugepages1gBytes = 3 * 1024 * 1024 * 1024; # 3x 1G pages for RandomX (xmrig.nix)
+  totalHugepageBytes = hugepages2mBytes + hugepages1gBytes;
+
+  # ARC max: 60% of RAM remaining after hugepages. Leaves headroom for
+  # application RSS (PostgreSQL, qBittorrent, Jellyfin, Grafana, etc.),
+  # kernel slabs, and page cache.
+  arcMaxBytes = (totalRamBytes - totalHugepageBytes) * 60 / 100;
+in
 {
-  boot.zfs.package = pkgs.zfs;
+  boot.zfs.package = pkgs.zfs_2_4;
  boot.initrd.kernelModules = [ "zfs" ];

  boot.kernelParams = [
-    "zfs.zfs_txg_timeout=120" # longer TXG open time = larger sequential writes
+    # 120s TXG timeout: batch more dirty data per transaction group so the
+    # HDD pool (hdds) writes larger, sequential I/Os instead of many small syncs.
+    # This is a global setting (no per-pool control); the SSD pool (tank) also
+    # syncs less often, which is harmless because SSDs have no seek overhead.
+    "zfs.zfs_txg_timeout=120"
+
+    # Cap ARC to prevent it from claiming memory reserved for hugepages.
+    # Without this, ZFS auto-sizes c_max to ~62 GiB on a 64 GiB system,
+    # ignoring the 11.5 GiB of hugepage reservations.
+    "zfs.zfs_arc_max=${toString arcMaxBytes}"

    # vdev I/O scheduler: feed more concurrent reads to the block scheduler so
    # mq-deadline has a larger pool of requests to sort and merge into elevator sweeps.
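The ARC cap arithmetic from the `let` block above can be checked as a standalone expression. This is a sketch only: `total_pages = 4352` is an assumption, inferred from the 11.5 GiB total quoted in the comment (8.5 GiB of 2M pages plus 3 GiB of 1G pages); the real value lives in `service_configs.hugepages_2m.total_pages`.

```nix
# Evaluate with: nix-instantiate --eval --strict <file>
let
  totalRamBytes = 65775836 * 1024; # from /proc/meminfo, in KiB

  # ASSUMPTION: 4352 x 2 MiB = 8.5 GiB, chosen to match the 11.5 GiB total.
  hugepages2mBytes = 4352 * 2 * 1024 * 1024;
  hugepages1gBytes = 3 * 1024 * 1024 * 1024;
  totalHugepageBytes = hugepages2mBytes + hugepages1gBytes;

  # Nix integer division truncates, so the result is a whole byte count.
  arcMaxBytes = (totalRamBytes - totalHugepageBytes) * 60 / 100;
in
{
  inherit arcMaxBytes; # 33003855052 under the assumption above
  arcMaxGiB = arcMaxBytes / (1024 * 1024 * 1024); # truncated to whole GiB
}
```

Under that assumption the cap lands at roughly 30.7 GiB, well below the ~62 GiB ZFS would otherwise pick.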
--- a/patches/nixpkgs/0001-firefox-syncserver-add-postgresql-backend-support.patch
+++ b/patches/nixpkgs/0001-firefox-syncserver-add-postgresql-backend-support.patch
--- a/patches/nixpkgs/0002-jellyfin-add-declarative-network-xml-options.patch
+++ b/patches/nixpkgs/0002-jellyfin-add-declarative-network-xml-options.patch
@@ -0,0 +1,443 @@
+From f0582558f0a8b0ef543b3251c4a07afab89fde63 Mon Sep 17 00:00:00 2001
+From: Simon Gardling <titaniumtown@proton.me>
+Date: Fri, 17 Apr 2026 19:37:11 -0400
+Subject: [PATCH] nixos/jellyfin: add declarative network.xml options
+
+Adds services.jellyfin.network.* (baseUrl, ports, IPv4/6, LAN subnets,
+known proxies, remote IP filter, etc.) and services.jellyfin.forceNetworkConfig,
+mirroring the existing hardwareAcceleration / forceEncodingConfig pattern.
+
+Motivation: running Jellyfin behind a reverse proxy requires configuring
+KnownProxies (so the real client IP is extracted from X-Forwarded-For)
+and LocalNetworkSubnets (so LAN clients are correctly classified and not
+subject to RemoteClientBitrateLimit). These settings previously had no
+declarative option -- they could only be set via the web dashboard or
+by hand-editing network.xml, with no guarantee they would survive a
+reinstall or be consistent across deployments.
+
+Implementation:
+- Adds a networkXmlText template alongside the existing encodingXmlText.
+- Factors the force-vs-soft install logic out of preStart into a
+  small 'manage_config_xml' shell helper; encoding.xml and network.xml
+  now share the same install/backup semantics.
+- Extends the VM test with a machineWithNetworkConfig node and a
+  subtest that verifies the declared values land in network.xml,
+  Jellyfin parses them at startup, and the backup-on-overwrite path
+  works (same shape as the existing 'Force encoding config' subtest).
+---
+ nixos/modules/services/misc/jellyfin.nix | 303 ++++++++++++++++++++---
+ nixos/tests/jellyfin.nix                 |  50 ++++
+ 2 files changed, 317 insertions(+), 36 deletions(-)
+
+diff --git a/nixos/modules/services/misc/jellyfin.nix b/nixos/modules/services/misc/jellyfin.nix
+index 5c08fc478e45..387da907c652 100644
+--- a/nixos/modules/services/misc/jellyfin.nix
+++ b/nixos/modules/services/misc/jellyfin.nix
+@@ -26,8 +26,10 @@ let
+     bool
+     enum
+     ints
+    listOf
+     nullOr
+     path
+    port
+     str
+     submodule
+     ;
+@@ -68,6 +70,41 @@ let
+     </EncodingOptions>
+   '';
+   encodingXmlFile = pkgs.writeText "encoding.xml" encodingXmlText;
+  stringListToXml =
+    tag: items:
+    if items == [ ] then
+      "<${tag} />"
+    else
+      "<${tag}>\n    ${
+        concatMapStringsSep "\n    " (item: "<string>${escapeXML item}</string>") items
+      }\n  </${tag}>";
+  networkXmlText = ''
+    <?xml version="1.0" encoding="utf-8"?>
+    <NetworkConfiguration xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
+      <BaseUrl>${escapeXML cfg.network.baseUrl}</BaseUrl>
+      <EnableHttps>${boolToString cfg.network.enableHttps}</EnableHttps>
+      <RequireHttps>${boolToString cfg.network.requireHttps}</RequireHttps>
+      <InternalHttpPort>${toString cfg.network.internalHttpPort}</InternalHttpPort>
+      <InternalHttpsPort>${toString cfg.network.internalHttpsPort}</InternalHttpsPort>
+      <PublicHttpPort>${toString cfg.network.publicHttpPort}</PublicHttpPort>
+      <PublicHttpsPort>${toString cfg.network.publicHttpsPort}</PublicHttpsPort>
+      <AutoDiscovery>${boolToString cfg.network.autoDiscovery}</AutoDiscovery>
+      <EnableUPnP>${boolToString cfg.network.enableUPnP}</EnableUPnP>
+      <EnableIPv4>${boolToString cfg.network.enableIPv4}</EnableIPv4>
+      <EnableIPv6>${boolToString cfg.network.enableIPv6}</EnableIPv6>
+      <EnableRemoteAccess>${boolToString cfg.network.enableRemoteAccess}</EnableRemoteAccess>
+      ${stringListToXml "LocalNetworkSubnets" cfg.network.localNetworkSubnets}
+      ${stringListToXml "LocalNetworkAddresses" cfg.network.localNetworkAddresses}
+      ${stringListToXml "KnownProxies" cfg.network.knownProxies}
+      <IgnoreVirtualInterfaces>${boolToString cfg.network.ignoreVirtualInterfaces}</IgnoreVirtualInterfaces>
+      ${stringListToXml "VirtualInterfaceNames" cfg.network.virtualInterfaceNames}
+      <EnablePublishedServerUriByRequest>${boolToString cfg.network.enablePublishedServerUriByRequest}</EnablePublishedServerUriByRequest>
+      ${stringListToXml "PublishedServerUriBySubnet" cfg.network.publishedServerUriBySubnet}
+      ${stringListToXml "RemoteIPFilter" cfg.network.remoteIPFilter}
+      <IsRemoteIPFilterBlacklist>${boolToString cfg.network.isRemoteIPFilterBlacklist}</IsRemoteIPFilterBlacklist>
+    </NetworkConfiguration>
+  '';
+  networkXmlFile = pkgs.writeText "network.xml" networkXmlText;
+   codecListToType =
+     desc: list:
+     submodule {
+@@ -205,6 +242,196 @@ in
+         '';
+       };
+ 
+      network = {
+        baseUrl = mkOption {
+          type = str;
+          default = "";
+          example = "/jellyfin";
+          description = ''
+            Prefix added to Jellyfin's internal URLs when it sits behind a reverse proxy at a sub-path.
+            Leave empty when Jellyfin is served at the root of its host.
+          '';
+        };
+
+        enableHttps = mkOption {
+          type = bool;
+          default = false;
+          description = ''
+            Serve HTTPS directly from Jellyfin. Usually unnecessary when terminating TLS in a reverse proxy.
+          '';
+        };
+
+        requireHttps = mkOption {
+          type = bool;
+          default = false;
+          description = ''
+            Redirect plaintext HTTP requests to HTTPS. Only meaningful when {option}`enableHttps` is true.
+          '';
+        };
+
+        internalHttpPort = mkOption {
+          type = port;
+          default = 8096;
+          description = "TCP port Jellyfin binds for HTTP.";
+        };
+
+        internalHttpsPort = mkOption {
+          type = port;
+          default = 8920;
+          description = "TCP port Jellyfin binds for HTTPS. Only used when {option}`enableHttps` is true.";
+        };
+
+        publicHttpPort = mkOption {
+          type = port;
+          default = 8096;
+          description = "HTTP port Jellyfin advertises in server discovery responses and published URIs.";
+        };
+
+        publicHttpsPort = mkOption {
+          type = port;
+          default = 8920;
+          description = "HTTPS port Jellyfin advertises in server discovery responses and published URIs.";
+        };
+
+        autoDiscovery = mkOption {
+          type = bool;
+          default = true;
+          description = "Respond to LAN client auto-discovery broadcasts (UDP 7359).";
+        };
+
+        enableUPnP = mkOption {
+          type = bool;
+          default = false;
+          description = "Attempt to open the public ports on the router via UPnP.";
+        };
+
+        enableIPv4 = mkOption {
+          type = bool;
+          default = true;
+          description = "Listen on IPv4.";
+        };
+
+        enableIPv6 = mkOption {
+          type = bool;
+          default = true;
+          description = "Listen on IPv6.";
+        };
+
+        enableRemoteAccess = mkOption {
+          type = bool;
+          default = true;
+          description = ''
+            Allow connections from clients outside the subnets listed in {option}`localNetworkSubnets`.
+            When false, Jellyfin rejects non-local requests regardless of reverse proxy configuration.
+          '';
+        };
+
+        localNetworkSubnets = mkOption {
+          type = listOf str;
+          default = [ ];
+          example = [
+            "192.168.1.0/24"
+            "10.0.0.0/8"
+          ];
+          description = ''
+            CIDR ranges (or bare IPs) that Jellyfin classifies as the local network.
+            Clients originating from these ranges -- as seen after {option}`knownProxies` X-Forwarded-For
+            unwrapping -- are not subject to Jellyfin's remote-client bitrate limit (`RemoteClientBitrateLimit`).
+          '';
+        };
+
+        localNetworkAddresses = mkOption {
+          type = listOf str;
+          default = [ ];
+          example = [ "192.168.1.50" ];
+          description = ''
+            Specific interface addresses Jellyfin binds to. Leave empty to bind all interfaces.
+          '';
+        };
+
+        knownProxies = mkOption {
+          type = listOf str;
+          default = [ ];
+          example = [ "127.0.0.1" ];
+          description = ''
+            Addresses of reverse proxies trusted to forward the real client IP via `X-Forwarded-For`.
+            Without this, Jellyfin sees the proxy's address for every request and cannot apply
+            {option}`localNetworkSubnets` classification to the true client.
+          '';
+        };
+
+        ignoreVirtualInterfaces = mkOption {
+          type = bool;
+          default = true;
+          description = "Skip virtual network interfaces (matching {option}`virtualInterfaceNames`) during auto-bind.";
+        };
+
+        virtualInterfaceNames = mkOption {
+          type = listOf str;
+          default = [ "veth" ];
+          description = "Interface name prefixes treated as virtual when {option}`ignoreVirtualInterfaces` is true.";
+        };
+
+        enablePublishedServerUriByRequest = mkOption {
+          type = bool;
+          default = false;
+          description = ''
+            Derive the server's public URI from the incoming request's Host header instead of any
+            configured {option}`publishedServerUriBySubnet` entry.
+          '';
+        };
+
+        publishedServerUriBySubnet = mkOption {
+          type = listOf str;
+          default = [ ];
+          example = [ "192.168.1.0/24=http://jellyfin.lan:8096" ];
+          description = ''
+            Per-subnet overrides for the URI Jellyfin advertises to clients, in `subnet=uri` form.
+          '';
+        };
+
+        remoteIPFilter = mkOption {
+          type = listOf str;
+          default = [ ];
+          example = [ "203.0.113.0/24" ];
+          description = ''
+            IPs or CIDRs used as the allow- or denylist for remote access.
+            Behaviour is controlled by {option}`isRemoteIPFilterBlacklist`.
+          '';
+        };
+
+        isRemoteIPFilterBlacklist = mkOption {
+          type = bool;
+          default = false;
+          description = ''
+            When true, {option}`remoteIPFilter` is a denylist; when false, it is an allowlist
+            (and an empty list allows all remote addresses).
+          '';
+        };
+      };
+
+      forceNetworkConfig = mkOption {
+        type = bool;
+        default = false;
+        description = ''
+          Whether to overwrite Jellyfin's `network.xml` configuration file on each service start.
+
+          When enabled, the network configuration specified in {option}`services.jellyfin.network`
+          is applied on every service restart. A backup of the existing `network.xml` will be
+          created at `network.xml.backup-$timestamp`.
+
+          ::: {.warning}
+          Enabling this option means that any changes made to networking settings through
+          Jellyfin's web dashboard will be lost on the next service restart. The NixOS configuration
+          becomes the single source of truth for network settings.
+          :::
+
+          When disabled (the default), the network configuration is only written if no `network.xml`
+          exists yet. This allows settings to be changed through Jellyfin's web dashboard and persist
+          across restarts, but means the NixOS configuration options will be ignored after the initial setup.
+        '';
+      };
+
+       transcoding = {
+         maxConcurrentStreams = mkOption {
+           type = nullOr ints.positive;
+@@ -384,46 +611,50 @@ in
+         wants = [ "network-online.target" ];
+         wantedBy = [ "multi-user.target" ];
+ 
+-        preStart = mkIf cfg.hardwareAcceleration.enable (
+-          ''
+-            configDir=${escapeShellArg cfg.configDir}
+-            encodingXml="$configDir/encoding.xml"
+-          ''
+-          + (
+-            if cfg.forceEncodingConfig then
+-              ''
+-                if [[ -e $encodingXml ]]; then
+        preStart =
+          let
+            # manage_config_xml <source> <destination> <force> <description>
+            #
+            # Installs a NixOS-declared XML config at <destination>, preserving
+            # any existing file as a timestamped backup when <force> is true.
+            # With <force>=false, leaves existing files untouched and warns if
+            # the on-disk content differs from the declared content.
+            helper = ''
+              manage_config_xml() {
+                local src="$1" dest="$2" force="$3" desc="$4"
+                if [[ -e "$dest" ]]; then
+                   # this intentionally removes trailing newlines
+-                  currentText="$(<"$encodingXml")"
+-                  configuredText="$(<${encodingXmlFile})"
+-                  if [[ $currentText == "$configuredText" ]]; then
+-                    # don't need to do anything
+-                    exit 0
+-                  else
+-                    encodingXmlBackup="$configDir/encoding.xml.backup-$(date -u +"%FT%H_%M_%SZ")"
+-                    mv --update=none-fail -T "$encodingXml" "$encodingXmlBackup"
+                  local currentText configuredText
+                  currentText="$(<"$dest")"
+                  configuredText="$(<"$src")"
+                  if [[ "$currentText" == "$configuredText" ]]; then
+                    return 0
+                   fi
+-                fi
+-                cp --update=none-fail -T ${encodingXmlFile} "$encodingXml"
+-                chmod u+w "$encodingXml"
+-              ''
+-            else
+-              ''
+-                if [[ -e $encodingXml ]]; then
+-                  # this intentionally removes trailing newlines
+-                  currentText="$(<"$encodingXml")"
+-                  configuredText="$(<${encodingXmlFile})"
+-                  if [[ $currentText != "$configuredText" ]]; then
+-                    echo "WARN: $encodingXml already exists and is different from the configured settings. transcoding options NOT applied." >&2
+-                    echo "WARN: Set config.services.jellyfin.forceEncodingConfig = true to override." >&2
+                  if [[ "$force" == true ]]; then
+                    local backup
+                    backup="$dest.backup-$(date -u +"%FT%H_%M_%SZ")"
+                    mv --update=none-fail -T "$dest" "$backup"
+                  else
+                    echo "WARN: $dest already exists and is different from the configured settings. $desc options NOT applied." >&2
+                    echo "WARN: Set the corresponding force*Config option to override." >&2
+                    return 0
+                   fi
+-                else
+-                  cp --update=none-fail -T ${encodingXmlFile} "$encodingXml"
+-                  chmod u+w "$encodingXml"
+                 fi
+-              ''
+-          )
+-        );
+                cp --update=none-fail -T "$src" "$dest"
+                chmod u+w "$dest"
+              }
+              configDir=${escapeShellArg cfg.configDir}
+            '';
+          in
+          (
+            helper
+            + optionalString cfg.hardwareAcceleration.enable ''
+              manage_config_xml ${encodingXmlFile} "$configDir/encoding.xml" ${boolToString cfg.forceEncodingConfig} transcoding
+            ''
+            + ''
+              manage_config_xml ${networkXmlFile} "$configDir/network.xml" ${boolToString cfg.forceNetworkConfig} network
+            ''
+          );
+ 
+         # This is mostly follows: https://github.com/jellyfin/jellyfin/blob/master/fedora/jellyfin.service
+         # Upstream also disable some hardenings when running in LXC, we do the same with the isContainer option
+diff --git a/nixos/tests/jellyfin.nix b/nixos/tests/jellyfin.nix
+index 4896c13d4eca..0c9191960f78 100644
+--- a/nixos/tests/jellyfin.nix
+++ b/nixos/tests/jellyfin.nix
+@@ -63,6 +63,26 @@
+       environment.systemPackages = with pkgs; [ ffmpeg ];
+       virtualisation.diskSize = 3 * 1024;
+     };
+
+    machineWithNetworkConfig = {
+      services.jellyfin = {
+        enable = true;
+        forceNetworkConfig = true;
+        network = {
+          localNetworkSubnets = [
+            "192.168.1.0/24"
+            "10.0.0.0/8"
+          ];
+          knownProxies = [ "127.0.0.1" ];
+          enableUPnP = false;
+          enableIPv6 = false;
+          remoteIPFilter = [ "203.0.113.5" ];
+          isRemoteIPFilterBlacklist = true;
+        };
+      };
+      environment.systemPackages = with pkgs; [ ffmpeg ];
+      virtualisation.diskSize = 3 * 1024;
+    };
+   };
+ 
+   # Documentation of the Jellyfin API: https://api.jellyfin.org/
+@@ -122,6 +142,36 @@
+           # Verify the new encoding.xml does not have the marker (was overwritten)
+           machineWithForceConfig.fail("grep -q 'MARKER' /var/lib/jellyfin/config/encoding.xml")
+ 
+      # Test forceNetworkConfig and network.xml generation
+      with subtest("Force network config writes declared values and backs up on overwrite"):
+          wait_for_jellyfin(machineWithNetworkConfig)
+
+          # Verify network.xml exists and contains the declared values
+          machineWithNetworkConfig.succeed("test -f /var/lib/jellyfin/config/network.xml")
+          machineWithNetworkConfig.succeed("grep -F '<string>192.168.1.0/24</string>' /var/lib/jellyfin/config/network.xml")
+          machineWithNetworkConfig.succeed("grep -F '<string>10.0.0.0/8</string>' /var/lib/jellyfin/config/network.xml")
+          machineWithNetworkConfig.succeed("grep -F '<string>127.0.0.1</string>' /var/lib/jellyfin/config/network.xml")
+          machineWithNetworkConfig.succeed("grep -F '<string>203.0.113.5</string>' /var/lib/jellyfin/config/network.xml")
+          machineWithNetworkConfig.succeed("grep -F '<IsRemoteIPFilterBlacklist>true</IsRemoteIPFilterBlacklist>' /var/lib/jellyfin/config/network.xml")
+          machineWithNetworkConfig.succeed("grep -F '<EnableIPv6>false</EnableIPv6>' /var/lib/jellyfin/config/network.xml")
+          machineWithNetworkConfig.succeed("grep -F '<EnableUPnP>false</EnableUPnP>' /var/lib/jellyfin/config/network.xml")
+
+          # Stop service before modifying config
+          machineWithNetworkConfig.succeed("systemctl stop jellyfin.service")
+
+          # Plant a marker so we can prove the backup-and-overwrite path runs
+          machineWithNetworkConfig.succeed("echo '<!-- NETMARKER -->' > /var/lib/jellyfin/config/network.xml")
+
+          # Restart the service to trigger the backup
+          machineWithNetworkConfig.succeed("systemctl restart jellyfin.service")
+          wait_for_jellyfin(machineWithNetworkConfig)
+
+          # Verify the marked content was preserved as a timestamped backup
+          machineWithNetworkConfig.succeed("grep -q 'NETMARKER' /var/lib/jellyfin/config/network.xml.backup-*")
+
+          # Verify the new network.xml does not have the marker (was overwritten)
+          machineWithNetworkConfig.fail("grep -q 'NETMARKER' /var/lib/jellyfin/config/network.xml")
+
+       auth_header = 'MediaBrowser Client="NixOS Integration Tests", DeviceId="1337", Device="Apple II", Version="20.09"'
+ 
+ 
+-- 
+2.53.0
+
--- a/secrets/ci-deploy-key.age
+++ b/secrets/ci-deploy-key.age
--- a/secrets/coturn-auth-secret.age
+++ b/secrets/coturn-auth-secret.age
--- a/secrets/coturn_static_auth_secret
+++ b/secrets/coturn_static_auth_secret
--- a/secrets/ddns-updater-config.age
+++ b/secrets/ddns-updater-config.age
--- a/secrets/git-crypt-key-dotfiles.age
+++ b/secrets/git-crypt-key-dotfiles.age
--- a/secrets/git-crypt-key-server-config.age
+++ b/secrets/git-crypt-key-server-config.age
--- a/secrets/gitea-runner-token.age
+++ b/secrets/gitea-runner-token.age
--- a/secrets/harmonia-sign-key.age
+++ b/secrets/harmonia-sign-key.age
--- a/secrets/llama-cpp-api-key.age
+++ b/secrets/llama-cpp-api-key.age
--- a/secrets/matrix-reg-token.age
+++ b/secrets/matrix-reg-token.age
--- a/secrets/matrix_reg_token
+++ b/secrets/matrix_reg_token
--- a/secrets/murmur-password-env.age
+++ b/secrets/murmur-password-env.age
--- a/secrets/murmur_password
+++ b/secrets/murmur_password
--- a/secrets/nix-cache-auth.age
+++ b/secrets/nix-cache-auth.age
--- a/secrets/njalla-api-token-env.age
+++ b/secrets/njalla-api-token-env.age
--- a/secrets/ntfy-alerts-topic.age
+++ b/secrets/ntfy-alerts-topic.age
--- a/secrets/slskd_env
+++ b/secrets/slskd_env
--- a/secrets/xmrig-wallet
+++ b/secrets/xmrig-wallet
--- a/service-configs.nix
+++ b/service-configs.nix
@@ -81,6 +81,12 @@ rec {
        port = 6011;
        proto = "tcp";
      };
+      # Webhook receiver for the Jellyfin-qBittorrent monitor — Jellyfin pushes
+      # playback events here so throttling reacts without waiting for the poll.
+      jellyfin_qbittorrent_monitor_webhook = {
+        port = 9898;
+        proto = "tcp";
+      };
      bitmagnet = {
        port = 3333;
        proto = "tcp";
@@ -153,6 +159,50 @@ rec {
        port = 8020;
        proto = "tcp";
      };
+      grafana = {
+        port = 3000;
+        proto = "tcp";
+      };
+      prometheus = {
+        port = 9090;
+        proto = "tcp";
+      };
+      prometheus_node = {
+        port = 9100;
+        proto = "tcp";
+      };
+      prometheus_apcupsd = {
+        port = 9162;
+        proto = "tcp";
+      };
+      llama_cpp = {
+        port = 6688;
+        proto = "tcp";
+      };
+      trilium = {
+        port = 8787;
+        proto = "tcp";
+      };
+      jellyfin_exporter = {
+        port = 9594;
+        proto = "tcp";
+      };
+      qbittorrent_exporter = {
+        port = 9561;
+        proto = "tcp";
+      };
+      igpu_exporter = {
+        port = 9563;
+        proto = "tcp";
+      };
+      prometheus_zfs = {
+        port = 9134;
+        proto = "tcp";
+      };
+      harmonia = {
+        port = 5500;
+        proto = "tcp";
+      };
    };
  };

@@ -189,6 +239,17 @@ rec {
  torrent = {
    SavePath = torrents_path;
    TempPath = torrents_path + "/incomplete";
+    categories = {
+      anime = torrents_path + "/anime";
+      archive = torrents_path + "/archive";
+      audiobooks = torrents_path + "/audiobooks";
+      books = torrents_path + "/books";
+      games = torrents_path + "/games";
+      movies = torrents_path + "/movies";
+      music = torrents_path + "/music";
+      musicals = torrents_path + "/musicals";
+      tvshows = torrents_path + "/tvshows";
+    };
  };

  jellyfin = {
@@ -212,6 +273,7 @@ rec {

  p2pool = {
    dataDir = services_dir + "/p2pool";
+    walletAddress = "49b6NT2k7fQHs8JvF7naUvchYwTQmRpoMMXb1KJTg5UcZVmyPJ7n6jgiH8DrvEsMg5GvMjJqPB1c1PTBAYtUTsbeHe5YMBx";
  };

  matrix = {
@@ -265,6 +327,15 @@ rec {
    domain = "firefox-sync.${https.domain}";
  };

+  grafana = {
+    dir = services_dir + "/grafana";
+    domain = "grafana.${https.domain}";
+  };
+
+  trilium = {
+    dataDir = services_dir + "/trilium";
+  };
+
  media = {
    moviesDir = torrents_path + "/media/movies";
    tvDir = torrents_path + "/media/tv";
--- a/services/arr/arr-search.nix
+++ b/services/arr/arr-search.nix
@@ -1,5 +1,6 @@
 {
  pkgs,
+  lib,
  service_configs,
  ...
 }:
@@ -12,7 +13,6 @@ let

  curl = "${pkgs.curl}/bin/curl";
  jq = "${pkgs.jq}/bin/jq";
-  grep = "${pkgs.gnugrep}/bin/grep";

  # Max items to search per cycle per category (missing + cutoff) per app
  maxPerCycle = 5;
@@ -20,8 +20,8 @@ let
  searchScript = pkgs.writeShellScript "arr-search" ''
    set -euo pipefail

-    RADARR_KEY=$(${grep} -oP '(?<=<ApiKey>)[^<]+' ${radarrConfig})
-    SONARR_KEY=$(${grep} -oP '(?<=<ApiKey>)[^<]+' ${sonarrConfig})
+    RADARR_KEY=$(${lib.extractArrApiKey radarrConfig})
+    SONARR_KEY=$(${lib.extractArrApiKey sonarrConfig})

    search_radarr() {
      local endpoint="$1"
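Note on `lib.extractArrApiKey`: its definition is not part of this hunk. Judging only from the two removed lines, it presumably wraps the same GNU grep lookbehind that used to be inlined here. A minimal sketch under that assumption (the function body and argument name are guesses, not the repository's actual code):

```nix
# Hypothetical sketch; the real lib.extractArrApiKey is defined outside this diff.
{ pkgs, lib }:
{
  # Takes the path of an *arr config.xml and returns a shell command string
  # that prints the text inside its <ApiKey> element.
  extractArrApiKey =
    configXml: "${lib.getExe pkgs.gnugrep} -oP '(?<=<ApiKey>)[^<]+' ${configXml}";
}
```

With a helper of this shape, `RADARR_KEY=$(${lib.extractArrApiKey radarrConfig})` would expand to the same command the removed line spelled out by hand, so the regex lives in one place instead of being repeated per script.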
--- a/services/arr/bazarr.nix
+++ b/services/arr/bazarr.nix
@@ -16,6 +16,11 @@
    (lib.serviceFilePerms "bazarr" [
      "Z ${service_configs.bazarr.dataDir} 0700 ${config.services.bazarr.user} ${config.services.bazarr.group}"
    ])
+    (lib.mkCaddyReverseProxy {
+      subdomain = "bazarr";
+      port = service_configs.ports.private.bazarr.port;
+      auth = true;
+    })
  ];

  services.bazarr = {
@@ -23,11 +28,6 @@
    listenPort = service_configs.ports.private.bazarr.port;
  };

-  services.caddy.virtualHosts."bazarr.${service_configs.https.domain}".extraConfig = ''
-    import ${config.age.secrets.caddy_auth.path}
-    reverse_proxy :${builtins.toString service_configs.ports.private.bazarr.port}
-  '';
-
  users.users.${config.services.bazarr.user}.extraGroups = [
    service_configs.media_group
  ];
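Note on `lib.mkCaddyReverseProxy`: the helper itself is not shown in this diff. The removed `services.caddy.virtualHosts` blocks suggest what it has to generate: an optional `import` of the `caddy_auth` secret, and a `reverse_proxy` that targets either the local port or the wg namespace address. A rough sketch under those assumptions (argument defaults and the function body are hypothetical):

```nix
# Hypothetical sketch; the real lib.mkCaddyReverseProxy is defined outside this diff.
{
  subdomain ? null,
  domain ? null,
  port,
  auth ? false,
  vpn ? false,
}:
# Returns a NixOS module, which is why call sites place it in `imports`.
{
  config,
  lib,
  service_configs,
  ...
}:
let
  # An explicit domain wins; otherwise build one from the subdomain.
  host = if domain != null then domain else "${subdomain}.${service_configs.https.domain}";
  # VPN-confined services are reached at the namespace address, others on the local port.
  upstream = if vpn then config.vpnNamespaces.wg.namespaceAddress else "";
in
{
  services.caddy.virtualHosts.${host}.extraConfig = ''
    ${lib.optionalString auth "import ${config.age.secrets.caddy_auth.path}"}
    reverse_proxy ${upstream}:${toString port}
  '';
}
```

Under this sketch, the bazarr call above (`subdomain = "bazarr"; auth = true;`) would produce the same two Caddy directives that the removed block contained.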
--- a/services/arr/init.nix
+++ b/services/arr/init.nix
@@ -8,13 +8,26 @@
      dataDir = service_configs.prowlarr.dataDir;
      apiVersion = "v1";
      networkNamespacePath = "/run/netns/wg";
+      networkNamespaceService = "wg";
+      # Guarantee critical config.xml elements before startup. Prowlarr has a
+      # history of losing <Port> from config.xml, causing the service to run
+      # without binding any socket. See arr-init's configXml for details.
+      configXml = {
+        Port = service_configs.ports.private.prowlarr.port;
+        BindAddress = "*";
+        EnableSsl = false;
+      };
+      # Prowlarr runs in the wg netns; Sonarr and Radarr run in the host netns.
+      # From the host netns, Prowlarr is reachable at the wg namespace address,
+      # not at localhost (which only reaches the host netns's own loopback).
+      # With prowlarrUrl pointing there, the reverse connection works and health checks can run.
      healthChecks = true;
      syncedApps = [
        {
          name = "Sonarr";
          implementation = "Sonarr";
          configContract = "SonarrSettings";
-          prowlarrUrl = "http://localhost:${builtins.toString service_configs.ports.private.prowlarr.port}";
+          prowlarrUrl = "http://${config.vpnNamespaces.wg.namespaceAddress}:${builtins.toString service_configs.ports.private.prowlarr.port}";
          baseUrl = "http://${config.vpnNamespaces.wg.bridgeAddress}:${builtins.toString service_configs.ports.private.sonarr.port}";
          apiKeyFrom = "${service_configs.sonarr.dataDir}/config.xml";
          serviceName = "sonarr";
@@ -23,7 +36,7 @@
          name = "Radarr";
          implementation = "Radarr";
          configContract = "RadarrSettings";
-          prowlarrUrl = "http://localhost:${builtins.toString service_configs.ports.private.prowlarr.port}";
+          prowlarrUrl = "http://${config.vpnNamespaces.wg.namespaceAddress}:${builtins.toString service_configs.ports.private.prowlarr.port}";
          baseUrl = "http://${config.vpnNamespaces.wg.bridgeAddress}:${builtins.toString service_configs.ports.private.radarr.port}";
          apiKeyFrom = "${service_configs.radarr.dataDir}/config.xml";
          serviceName = "radarr";
@@ -37,6 +50,11 @@
      port = service_configs.ports.private.sonarr.port;
      dataDir = service_configs.sonarr.dataDir;
      healthChecks = true;
+      configXml = {
+        Port = service_configs.ports.private.sonarr.port;
+        BindAddress = "*";
+        EnableSsl = false;
+      };
      rootFolders = [ service_configs.media.tvDir ];
      naming = {
        renameEpisodes = true;
@@ -69,6 +87,11 @@
      port = service_configs.ports.private.radarr.port;
      dataDir = service_configs.radarr.dataDir;
      healthChecks = true;
+      configXml = {
+        Port = service_configs.ports.private.radarr.port;
+        BindAddress = "*";
+        EnableSsl = false;
+      };
      rootFolders = [ service_configs.media.moviesDir ];
      naming = {
        renameMovies = true;
@@ -110,4 +133,21 @@
      serviceName = "radarr";
    };
  };
+
+  services.jellyseerrInit = {
+    enable = true;
+    configDir = service_configs.jellyseerr.configDir;
+    radarr = {
+      profileName = "Remux + WEB 2160p";
+      dataDir = service_configs.radarr.dataDir;
+      port = service_configs.ports.private.radarr.port;
+      serviceName = "radarr";
+    };
+    sonarr = {
+      profileName = "WEB-2160p";
+      dataDir = service_configs.sonarr.dataDir;
+      port = service_configs.ports.private.sonarr.port;
+      serviceName = "sonarr";
+    };
+  };
 }
--- a/services/arr/jellyseerr.nix
+++ b/services/arr/jellyseerr.nix
@@ -13,6 +13,10 @@
    (lib.serviceFilePerms "jellyseerr" [
      "Z ${service_configs.jellyseerr.configDir} 0700 jellyseerr jellyseerr"
    ])
+    (lib.mkCaddyReverseProxy {
+      subdomain = "jellyseerr";
+      port = service_configs.ports.private.jellyseerr.port;
+    })
  ];

  services.jellyseerr = {
@@ -36,8 +40,4 @@

  users.groups.jellyseerr = { };

-  services.caddy.virtualHosts."jellyseerr.${service_configs.https.domain}".extraConfig = ''
-    # import ${config.age.secrets.caddy_auth.path}
-    reverse_proxy :${builtins.toString service_configs.ports.private.jellyseerr.port}
-  '';
 }
--- a/services/arr/prowlarr.nix
+++ b/services/arr/prowlarr.nix
@@ -14,6 +14,12 @@
    (lib.serviceFilePerms "prowlarr" [
      "Z ${service_configs.prowlarr.dataDir} 0700 prowlarr prowlarr"
    ])
+    (lib.mkCaddyReverseProxy {
+      subdomain = "prowlarr";
+      port = service_configs.ports.private.prowlarr.port;
+      auth = true;
+      vpn = true;
+    })
  ];

  services.prowlarr = {
@@ -32,6 +38,17 @@
  };
  users.groups.prowlarr = { };

+  # The upstream prowlarr module hardcodes root:root in tmpfiles for custom dataDirs
+  # (systemd.tmpfiles.settings."10-prowlarr"), which gets applied by
+  # systemd-tmpfiles-setup.service on every boot/deploy, resetting the directory
+  # ownership and making Prowlarr unable to access its SQLite databases.
+  # Override it to use the prowlarr user, since DynamicUser is disabled below.
+  systemd.tmpfiles.settings."10-prowlarr".${service_configs.prowlarr.dataDir}.d = lib.mkForce {
+    user = "prowlarr";
+    group = "prowlarr";
+    mode = "0700";
+  };
+
  systemd.services.prowlarr.serviceConfig = {
    DynamicUser = lib.mkForce false;
    User = "prowlarr";
@@ -40,8 +57,4 @@
    ExecStart = lib.mkForce "${lib.getExe pkgs.prowlarr} -nobrowser -data=${service_configs.prowlarr.dataDir}";
  };

-  services.caddy.virtualHosts."prowlarr.${service_configs.https.domain}".extraConfig = ''
-    import ${config.age.secrets.caddy_auth.path}
-    reverse_proxy ${config.vpnNamespaces.wg.namespaceAddress}:${builtins.toString service_configs.ports.private.prowlarr.port}
-  '';
 }
--- a/services/arr/radarr.nix
+++ b/services/arr/radarr.nix
@@ -16,6 +16,11 @@
    (lib.serviceFilePerms "radarr" [
      "Z ${service_configs.radarr.dataDir} 0700 ${config.services.radarr.user} ${config.services.radarr.group}"
    ])
+    (lib.mkCaddyReverseProxy {
+      subdomain = "radarr";
+      port = service_configs.ports.private.radarr.port;
+      auth = true;
+    })
  ];

  services.radarr = {
@@ -25,11 +30,6 @@
    settings.update.mechanism = "external";
  };

-  services.caddy.virtualHosts."radarr.${service_configs.https.domain}".extraConfig = ''
-    import ${config.age.secrets.caddy_auth.path}
-    reverse_proxy :${builtins.toString service_configs.ports.private.radarr.port}
-  '';
-
  users.users.${config.services.radarr.user}.extraGroups = [
    service_configs.media_group
  ];
--- a/services/arr/recyclarr.nix
+++ b/services/arr/recyclarr.nix
@@ -13,8 +13,8 @@ let
  # Runs as root (via + prefix) after the NixOS module writes config.json.
  # Extracts API keys from radarr/sonarr config.xml and injects them via jq.
  injectApiKeys = pkgs.writeShellScript "recyclarr-inject-api-keys" ''
-    RADARR_KEY=$(${lib.getExe pkgs.gnugrep} -oP '(?<=<ApiKey>)[^<]+' ${radarrConfig})
-    SONARR_KEY=$(${lib.getExe pkgs.gnugrep} -oP '(?<=<ApiKey>)[^<]+' ${sonarrConfig})
+    RADARR_KEY=$(${lib.extractArrApiKey radarrConfig})
+    SONARR_KEY=$(${lib.extractArrApiKey sonarrConfig})
    ${pkgs.jq}/bin/jq \
      --arg rk "$RADARR_KEY" \
      --arg sk "$SONARR_KEY" \
@@ -46,50 +46,69 @@ in
      radarr.movies = {
        base_url = "http://localhost:${builtins.toString service_configs.ports.private.radarr.port}";

+        # Recyclarr is the sole authority for custom formats and scores.
+        # Overwrite any manually-created CFs and delete stale ones.
+        replace_existing_custom_formats = true;
+        delete_old_custom_formats = true;
+
        include = [
          { template = "radarr-quality-definition-movie"; }
          { template = "radarr-quality-profile-remux-web-2160p"; }
          { template = "radarr-custom-formats-remux-web-2160p"; }
        ];

+        # Group WEB 2160p with 1080p in the same quality tier so custom
+        # format scores -- not quality ranking -- decide the winner.
+        # Native 4K with HDR/DV from good release groups scores high and
+        # wins; AI upscales get -10000 from the Upscaled CF and are
+        # blocked by min_format_score. Untagged upscales from unknown
+        # groups (score ~0) lose to well-scored 1080p (Tier 01 = +1750).
        quality_profiles = [
          {
            name = "Remux + WEB 2160p";
            min_format_score = 0;
-            reset_unmatched_scores = {
-              enabled = true;
-            };
+            reset_unmatched_scores.enabled = true;
            upgrade = {
              allowed = true;
              until_quality = "Remux-2160p";
              until_score = 10000;
            };
-            quality_sort = "top";
            qualities = [
              { name = "Remux-2160p"; }
              {
-                name = "WEB 2160p";
+                name = "WEB/Bluray";
                qualities = [
                  "WEBDL-2160p"
                  "WEBRip-2160p"
-                ];
-              }
-              { name = "Remux-1080p"; }
-              { name = "Bluray-1080p"; }
-              {
-                name = "WEB 1080p";
-                qualities = [
+                  "Remux-1080p"
+                  "Bluray-1080p"
                  "WEBDL-1080p"
                  "WEBRip-1080p"
                ];
              }
              { name = "HDTV-1080p"; }
+              { name = "Bluray-720p"; }
+              {
+                name = "WEB 720p";
+                qualities = [
+                  "WEBDL-720p"
+                  "WEBRip-720p"
+                ];
+              }
+              { name = "HDTV-720p"; }
            ];
          }
        ];

        custom_formats = [
-          # Upscaled
+          # DV (w/o HDR fallback) - block releases with DV that lack HDR10 fallback
+          {
+            trash_ids = [ "923b6abef9b17f937fab56cfcf89e1f1" ];
+            assign_scores_to = [
+              { name = "Remux + WEB 2160p"; }
+            ];
+          }
+          # Upscaled - block AI upscales and other upscaled-to-2160p releases
          {
            trash_ids = [ "bfd8eb01832d646a0a89c4deb46f8564" ];
            assign_scores_to = [
@@ -99,75 +118,74 @@ in
              }
            ];
          }
-          # x265 (HD) - override template -10000 penalty
-          {
-            trash_ids = [ "dc98083864ea246d05a42df0d05f81cc" ];
-            assign_scores_to = [
-              {
-                name = "Remux + WEB 2160p";
-                score = 0;
-              }
-            ];
-          }
-          # x265 (no HDR/DV) - override template -10000 penalty
-          {
-            trash_ids = [ "839bea857ed2c0a8e084f3cbdbd65ecb" ];
-            assign_scores_to = [
-              {
-                name = "Remux + WEB 2160p";
-                score = 0;
-              }
-            ];
-          }
        ];
      };

      sonarr.series = {
        base_url = "http://localhost:${builtins.toString service_configs.ports.private.sonarr.port}";

+        # Recyclarr is the sole authority for custom formats and scores.
+        # Overwrite any manually-created CFs and delete stale ones.
+        replace_existing_custom_formats = true;
+        delete_old_custom_formats = true;
+
        include = [
          { template = "sonarr-quality-definition-series"; }
          { template = "sonarr-v4-quality-profile-web-2160p"; }
          { template = "sonarr-v4-custom-formats-web-2160p"; }
        ];

+        # Group WEB 2160p with 1080p in the same quality tier so custom
+        # format scores -- not quality ranking -- decide the winner.
+        # Native 4K with HDR/DV from good release groups scores high and
+        # wins; AI upscales get -10000 from the Upscaled CF and are
+        # blocked by min_format_score. Untagged upscales from unknown
+        # groups (score ~0) lose to well-scored 1080p (Tier 01 = +1750).
        quality_profiles = [
          {
            name = "WEB-2160p";
            min_format_score = 0;
-            reset_unmatched_scores = {
-              enabled = true;
-            };
+            reset_unmatched_scores.enabled = true;
            upgrade = {
              allowed = true;
-              until_quality = "WEB 2160p";
+              until_quality = "WEB/Bluray";
              until_score = 10000;
            };
-            quality_sort = "top";
            qualities = [
              {
-                name = "WEB 2160p";
+                name = "WEB/Bluray";
                qualities = [
                  "WEBDL-2160p"
                  "WEBRip-2160p"
-                ];
-              }
-              { name = "Bluray-1080p Remux"; }
-              { name = "Bluray-1080p"; }
-              {
-                name = "WEB 1080p";
-                qualities = [
+                  "Bluray-1080p Remux"
+                  "Bluray-1080p"
                  "WEBDL-1080p"
                  "WEBRip-1080p"
                ];
              }
              { name = "HDTV-1080p"; }
+              { name = "Bluray-720p"; }
+              {
+                name = "WEB 720p";
+                qualities = [
+                  "WEBDL-720p"
+                  "WEBRip-720p"
+                ];
+              }
+              { name = "HDTV-720p"; }
            ];
          }
        ];

        custom_formats = [
-          # Upscaled
+          # DV (w/o HDR fallback) - block releases with DV that lack HDR10 fallback
+          {
+            trash_ids = [ "9b27ab6498ec0f31a3353992e19434ca" ];
+            assign_scores_to = [
+              { name = "WEB-2160p"; }
+            ];
+          }
+          # Upscaled - block AI upscales and other upscaled-to-2160p releases
          {
            trash_ids = [ "23297a736ca77c0fc8e70f8edd7ee56c" ];
            assign_scores_to = [
@@ -177,32 +195,23 @@ in
              }
            ];
          }
-          # x265 (HD) - override template -10000 penalty
-          {
-            trash_ids = [ "47435ece6b99a0b477caf360e79ba0bb" ];
-            assign_scores_to = [
-              {
-                name = "WEB-2160p";
-                score = 0;
-              }
-            ];
-          }
-          # x265 (no HDR/DV) - override template -10000 penalty
-          {
-            trash_ids = [ "9b64dff695c2115facf1b6ea59c9bd07" ];
-            assign_scores_to = [
-              {
-                name = "WEB-2160p";
-                score = 0;
-              }
-            ];
-          }
        ];
      };
    };
  };

-  # Add secrets generation before recyclarr runs
+  # Trigger immediate sync on deploy when recyclarr config changes.
+  # restartTriggers on the oneshot service are unreliable (systemd may
+  # no-op a restart of an inactive oneshot). Instead, embed a config
+  # hash in the timer unit -- NixOS restarts changed timers reliably,
+  # and OnActiveSec fires the sync within seconds.
+  systemd.timers.recyclarr = {
+    timerConfig.OnActiveSec = "5s";
+    unitConfig.X-ConfigHash = builtins.hashString "sha256" (
+      builtins.toJSON config.services.recyclarr.configuration
+    );
+  };
+
  systemd.services.recyclarr = {
    after = [
      "network-online.target"
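Note on the `X-ConfigHash` trick: it relies on the hash of the serialized configuration changing whenever any setting changes, which in turn changes the generated timer unit. A small standalone illustration using only builtins (the two attribute sets are hypothetical stand-ins for `config.services.recyclarr.configuration`):

```nix
# Evaluate with: nix-instantiate --eval --strict <this file>
let
  # Hypothetical stand-ins for two revisions of the recyclarr configuration.
  before = { radarr.movies.delete_old_custom_formats = false; };
  after = { radarr.movies.delete_old_custom_formats = true; };

  # Same expression the timer uses: SHA-256 over the JSON serialization.
  hashOf = cfg: builtins.hashString "sha256" (builtins.toJSON cfg);
in
{
  # Any change in the configuration yields a different hash.
  changed = hashOf before != hashOf after; # true
  # An unchanged configuration yields the same hash on every evaluation.
  stable = hashOf after == hashOf after; # true
}
```

systemd ignores unit settings prefixed with `X-`, so the hash should have no runtime effect beyond making the unit file differ between deployments.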
--- a/services/arr/sonarr.nix
+++ b/services/arr/sonarr.nix
@@ -16,6 +16,11 @@
    (lib.serviceFilePerms "sonarr" [
      "Z ${service_configs.sonarr.dataDir} 0700 ${config.services.sonarr.user} ${config.services.sonarr.group}"
    ])
+    (lib.mkCaddyReverseProxy {
+      subdomain = "sonarr";
+      port = service_configs.ports.private.sonarr.port;
+      auth = true;
+    })
  ];

  systemd.tmpfiles.rules = [
@@ -31,11 +36,6 @@
    settings.update.mechanism = "external";
  };

-  services.caddy.virtualHosts."sonarr.${service_configs.https.domain}".extraConfig = ''
-    import ${config.age.secrets.caddy_auth.path}
-    reverse_proxy :${builtins.toString service_configs.ports.private.sonarr.port}
-  '';
-
  users.users.${config.services.sonarr.user}.extraGroups = [
    service_configs.media_group
  ];
--- a/services/arr/torrent-audit.nix
+++ b/services/arr/torrent-audit.nix
@@ -2,6 +2,7 @@
  pkgs,
  config,
  service_configs,
+  lib,
  ...
 }:
 {
@@ -34,7 +35,7 @@
      RADARR_CONFIG = "${service_configs.radarr.dataDir}/config.xml";
      SONARR_URL = "http://localhost:${builtins.toString service_configs.ports.private.sonarr.port}";
      SONARR_CONFIG = "${service_configs.sonarr.dataDir}/config.xml";
-      CATEGORIES = "tvshows,movies,anime";
+      CATEGORIES = lib.concatStringsSep "," (builtins.attrNames service_configs.torrent.categories);
      TAG_TORRENTS = "true";
    };
  };
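Note on the new `CATEGORIES` value: `builtins.attrNames` returns attribute names in sorted order, so the audit now receives every category declared in `service_configs.torrent.categories` rather than the previous hard-coded three. A standalone sketch of the resulting string (`torrents_path` here is a hypothetical placeholder for the value defined in service-configs.nix):

```nix
# Evaluate with: nix-instantiate --eval <this file>
let
  # Hypothetical placeholder; the real path comes from service-configs.nix.
  torrents_path = "/torrents";

  # Same keys as the categories attrset added in service-configs.nix.
  categories = {
    anime = torrents_path + "/anime";
    archive = torrents_path + "/archive";
    audiobooks = torrents_path + "/audiobooks";
    books = torrents_path + "/books";
    games = torrents_path + "/games";
    movies = torrents_path + "/movies";
    music = torrents_path + "/music";
    musicals = torrents_path + "/musicals";
    tvshows = torrents_path + "/tvshows";
  };
in
# → "anime,archive,audiobooks,books,games,movies,music,musicals,tvshows"
builtins.concatStringsSep "," (builtins.attrNames categories)
```

`lib.concatStringsSep` in nixpkgs is the same function as `builtins.concatStringsSep`, so the sketch avoids importing nixpkgs.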
--- a/services/bitmagnet.nix
+++ b/services/bitmagnet.nix
@@ -5,9 +5,66 @@
  lib,
  ...
 }:
+let
+  prowlarrPort = toString service_configs.ports.private.prowlarr.port;
+  sonarrPort = toString service_configs.ports.private.sonarr.port;
+  radarrPort = toString service_configs.ports.private.radarr.port;
+  bitmagnetPort = toString service_configs.ports.private.bitmagnet.port;
+  bridgeAddr = config.vpnNamespaces.wg.bridgeAddress;
+
+  prowlarrConfigXml = "${service_configs.prowlarr.dataDir}/config.xml";
+  sonarrConfigXml = "${service_configs.sonarr.dataDir}/config.xml";
+  radarrConfigXml = "${service_configs.radarr.dataDir}/config.xml";
+
+  curl = "${pkgs.curl}/bin/curl";
+  jq = "${pkgs.jq}/bin/jq";
+
+  # Clears the escalating failure backoff for the Bitmagnet indexer across
+  # Prowlarr, Sonarr, and Radarr so searches resume immediately after
+  # Bitmagnet restarts instead of waiting hours for disable timers to expire.
+  recoveryScript = pkgs.writeShellScript "prowlarr-bitmagnet-recovery" ''
+    set -euo pipefail
+
+    wait_for() {
+      for _ in $(seq 1 "$2"); do
+        ${curl} -sf --max-time 5 "$1" > /dev/null && return 0
+        sleep 5
+      done
+      echo "$1 not reachable, aborting" >&2; exit 1
+    }
+
+    # Test a Bitmagnet-named indexer to clear its failure status.
+    # A successful test triggers RecordSuccess(), which resets the backoff.
+    clear_status() {
+      local key indexer
+      key=$(${lib.extractArrApiKey ''"$3"''}) || return 0
+      indexer=$(${curl} -sf --max-time 10 \
+        -H "X-Api-Key: $key" "$2/api/$1/indexer" | \
+        ${jq} 'first(.[] | select(.name | test("Bitmagnet"; "i")))') || return 0
+      [ -n "$indexer" ] && [ "$indexer" != "null" ] || return 0
+      ${curl} -sf --max-time 30 \
+        -H "X-Api-Key: $key" -H "Content-Type: application/json" \
+        -X POST "$2/api/$1/indexer/test" -d "$indexer" > /dev/null
+    }
+
+    wait_for "http://localhost:${bitmagnetPort}" 12
+    wait_for "http://localhost:${prowlarrPort}/ping" 6
+
+    # Prowlarr first — downstream apps route searches through it.
+    clear_status v1 "http://localhost:${prowlarrPort}" "${prowlarrConfigXml}" || true
+    clear_status v3 "http://${bridgeAddr}:${sonarrPort}" "${sonarrConfigXml}" || true
+    clear_status v3 "http://${bridgeAddr}:${radarrPort}" "${radarrConfigXml}" || true
+  '';
+in
 {
  imports = [
    (lib.vpnNamespaceOpenPort service_configs.ports.private.bitmagnet.port "bitmagnet")
+    (lib.mkCaddyReverseProxy {
+      subdomain = "bitmagnet";
+      port = service_configs.ports.private.bitmagnet.port;
+      auth = true;
+      vpn = true;
+    })
  ];

  services.bitmagnet = {
@@ -19,13 +76,38 @@
      };
      http_server = {
        # TODO! make issue about this being a string and not a `port` type
-        port = ":" + (builtins.toString service_configs.ports.private.bitmagnet.port);
+        port = ":" + (toString service_configs.ports.private.bitmagnet.port);
      };
    };
  };

-  services.caddy.virtualHosts."bitmagnet.${service_configs.https.domain}".extraConfig = ''
-    import ${config.age.secrets.caddy_auth.path}
-    reverse_proxy ${config.vpnNamespaces.wg.namespaceAddress}:${builtins.toString service_configs.ports.private.bitmagnet.port}
-  '';
+  # The upstream default (Restart=on-failure) leaves Bitmagnet dead after
+  # clean exits (e.g. systemd stop during deploy). Always restart it.
+  systemd.services.bitmagnet.serviceConfig = {
+    Restart = lib.mkForce "always";
+    RestartSec = 10;
+  };
+
+  # After Bitmagnet restarts, clear the escalating failure backoff across
+  # Prowlarr, Sonarr, and Radarr so searches resume immediately instead of
+  # waiting hours for the disable timers to expire.
+  systemd.services.prowlarr-bitmagnet-recovery = {
+    description = "Clear Prowlarr/Sonarr/Radarr failure status for Bitmagnet indexer";
+    after = [
+      "bitmagnet.service"
+      "prowlarr.service"
+      "sonarr.service"
+      "radarr.service"
+    ];
+    bindsTo = [ "bitmagnet.service" ];
+    wantedBy = [ "bitmagnet.service" ];
+
+    serviceConfig = {
+      Type = "oneshot";
+      RemainAfterExit = true;
+      ExecStart = recoveryScript;
+      # Same VPN namespace as Bitmagnet and Prowlarr.
+      NetworkNamespacePath = "/run/netns/wg";
+    };
+  };
 }
--- a/services/bitwarden.nix
+++ b/services/bitwarden.nix
@@ -13,6 +13,10 @@
    (lib.serviceFilePerms "vaultwarden" [
      "Z ${service_configs.vaultwarden.path} 0700 vaultwarden vaultwarden"
    ])
+    (lib.mkFail2banJail {
+      name = "vaultwarden";
+      failregex = ''^.*Username or password is incorrect\. Try again\. IP: <HOST>\..*$'';
+    })
  ];

  services.vaultwarden = {
@@ -38,18 +42,4 @@
    }
  '';

-  # Protect Vaultwarden login from brute force attacks
-  services.fail2ban.jails.vaultwarden = {
-    enabled = true;
-    settings = {
-      backend = "systemd";
-      port = "http,https";
-      # defaults: maxretry=5, findtime=10m, bantime=10m
-    };
-    filter.Definition = {
-      failregex = ''^.*Username or password is incorrect\. Try again\. IP: <HOST>\..*$'';
-      ignoreregex = "";
-      journalmatch = "_SYSTEMD_UNIT=vaultwarden.service";
-    };
-  };
 }
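Note on `lib.mkFail2banJail`: the helper is defined outside this diff. The removed jail shows the settings it would need to reproduce, with only the jail name and `failregex` varying per service. A minimal sketch under that assumption (the function body is hypothetical, and deriving the unit name from `name` is a guess):

```nix
# Hypothetical sketch; the real lib.mkFail2banJail is defined outside this diff.
{ name, failregex }:
# Returns a NixOS module, which is why call sites place it in `imports`.
{ ... }:
{
  services.fail2ban.jails.${name} = {
    enabled = true;
    settings = {
      backend = "systemd";
      port = "http,https";
      # fail2ban defaults apply: maxretry=5, findtime=10m, bantime=10m
    };
    filter.Definition = {
      inherit failregex;
      ignoreregex = "";
      # Assumes the systemd unit is named after the jail.
      journalmatch = "_SYSTEMD_UNIT=${name}.service";
    };
  };
}
```

Called with `name = "vaultwarden"` and the regex from the import above, this would evaluate to the same jail that the removed block declared inline.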
--- a/services/caddy/caddy.nix
+++ b/services/caddy/caddy.nix
@@ -56,9 +56,19 @@ in
    enable = true;
    email = "titaniumtown@proton.me";

-    # Enable on-demand TLS for old domain redirects
-    # Certs are issued dynamically when subdomains are accessed
+    # Build with Njalla DNS provider for DNS-01 ACME challenges (wildcard certs)
+    package = pkgs.caddy.withPlugins {
+      plugins = [ "github.com/caddy-dns/njalla@v0.0.0-20250823094507-f709141f1fe6" ];
+      hash = "sha256-rrOAR6noTDpV/I/hZXxhz0OXVJKu0mFQRq87RUrpmzw=";
+    };
+
    globalConfig = ''
+      # Wildcard cert for *.${newDomain} via DNS-01 challenge
+      acme_dns njalla {
+        api_token {env.NJALLA_API_TOKEN}
+      }
+
+      # On-demand TLS for old domain redirects
      on_demand_tls {
        ask http://localhost:9123/check
      }
@@ -106,6 +116,9 @@ in
    };
  };

+  # Inject Njalla API token for DNS-01 challenge
+  systemd.services.caddy.serviceConfig.EnvironmentFile = config.age.secrets.njalla-api-token-env.path;
+
  systemd.tmpfiles.rules = [
    "d ${config.services.caddy.dataDir} 700 ${config.services.caddy.user} ${config.services.caddy.group}"
  ];
--- a/services/caddy/caddy_senior_project.nix
+++ b/services/caddy/caddy_senior_project.nix
--- a/services/caddy/default.nix
+++ b/services/caddy/default.nix
@@ -0,0 +1,7 @@
+{
+  imports = [
+    ./caddy.nix
+    # KEEP UNTIL 2028
+    ./caddy_senior_project.nix
+  ];
+}
--- a/services/ddns-updater.nix
+++ b/services/ddns-updater.nix
@@ -0,0 +1,27 @@
+{
+  config,
+  lib,
+  ...
+}:
+{
+  services.ddns-updater = {
+    enable = true;
+    environment = {
+      PERIOD = "5m";
+      # ddns-updater reads config from this path at runtime
+      CONFIG_FILEPATH = config.age.secrets.ddns-updater-config.path;
+    };
+  };
+
+  users.users.ddns-updater = {
+    isSystemUser = true;
+    group = "ddns-updater";
+  };
+  users.groups.ddns-updater = { };
+
+  systemd.services.ddns-updater.serviceConfig = {
+    DynamicUser = lib.mkForce false;
+    User = "ddns-updater";
+    Group = "ddns-updater";
+  };
+}
--- a/services/firefox-syncserver.nix
+++ b/services/firefox-syncserver.nix
@@ -6,6 +6,13 @@
  ...
 }:
 {
+  imports = [
+    (lib.mkCaddyReverseProxy {
+      domain = service_configs.firefox_syncserver.domain;
+      port = service_configs.ports.private.firefox_syncserver.port;
+    })
+  ];
+
  services.firefox-syncserver = {
    enable = true;
    database = {
@@ -33,7 +40,4 @@
    ];
  };

-  services.caddy.virtualHosts."${service_configs.firefox_syncserver.domain}".extraConfig = ''
-    reverse_proxy :${builtins.toString service_configs.ports.private.firefox_syncserver.port}
-  '';
 }
--- a/services/gitea-actions-runner.nix
+++ b/services/gitea-actions-runner.nix
@@ -0,0 +1,50 @@
+{
+  config,
+  lib,
+  pkgs,
+  service_configs,
+  ...
+}:
+{
+  services.gitea-actions-runner.instances.muffin = {
+    enable = true;
+    name = "muffin";
+    url = config.services.gitea.settings.server.ROOT_URL;
+    tokenFile = config.age.secrets.gitea-runner-token.path;
+    labels = [ "nix:host" ];
+    hostPackages = with pkgs; [
+      bash
+      coreutils
+      curl
+      gawk
+      git
+      git-crypt
+      gnugrep
+      gnused
+      jq
+      nix
+      nodejs
+      openssh
+    ];
+    settings = {
+      runner = {
+        capacity = 1;
+        timeout = "6h";
+      };
+    };
+  };
+
+  # Override DynamicUser to use our static gitea-runner user, and ensure
+  # the runner doesn't start before the co-located gitea instance is ready
+  # (upstream can't assume locality, so this dependency is ours to add).
+  systemd.services."gitea-runner-muffin" = {
+    requires = [ "gitea.service" ];
+    after = [ "gitea.service" ];
+    serviceConfig = {
+      DynamicUser = lib.mkForce false;
+      User = "gitea-runner";
+      Group = "gitea-runner";
+    };
+    environment.GIT_SSH_COMMAND = "ssh -i /run/agenix/ci-deploy-key -o StrictHostKeyChecking=yes -o UserKnownHostsFile=/etc/ci-known-hosts";
+  };
+}
--- a/services/gitea.nix
+++ b/services/gitea.nix
@@ -11,6 +11,14 @@
    (lib.serviceFilePerms "gitea" [
      "Z ${config.services.gitea.stateDir} 0700 ${config.services.gitea.user} ${config.services.gitea.group}"
    ])
+    (lib.mkCaddyReverseProxy {
+      domain = service_configs.gitea.domain;
+      port = service_configs.ports.private.gitea.port;
+    })
+    (lib.mkFail2banJail {
+      name = "gitea";
+      failregex = "^.*Failed authentication attempt for .* from <HOST>:.*$";
+    })
  ];

  services.gitea = {
@@ -37,13 +45,10 @@
      };
      # only I shall use gitea
      service.DISABLE_REGISTRATION = true;
+      actions.ENABLED = true;
    };
  };

-  services.caddy.virtualHosts."${service_configs.gitea.domain}".extraConfig = ''
-    reverse_proxy :${builtins.toString config.services.gitea.settings.server.HTTP_PORT}
-  '';
-
  services.postgresql = {
    ensureDatabases = [ config.services.gitea.user ];
    ensureUsers = [
@@ -57,18 +62,4 @@

  services.openssh.settings.AllowUsers = [ config.services.gitea.user ];

-  # Protect Gitea login from brute force attacks
-  services.fail2ban.jails.gitea = {
-    enabled = true;
-    settings = {
-      backend = "systemd";
-      port = "http,https";
-      # defaults: maxretry=5, findtime=10m, bantime=10m
-    };
-    filter.Definition = {
-      failregex = "^.*Failed authentication attempt for .* from <HOST>:.*$";
-      ignoreregex = "";
-      journalmatch = "_SYSTEMD_UNIT=gitea.service";
-    };
-  };
 }
--- a/services/grafana/dashboard.nix
+++ b/services/grafana/dashboard.nix
@@ -0,0 +1,698 @@
+{
+  ...
+}:
+let
+  promDs = {
+    type = "prometheus";
+    uid = "prometheus";
+  };
+
+  dashboard = {
+    editable = true;
+    graphTooltip = 1;
+    schemaVersion = 39;
+    tags = [
+      "system"
+      "monitoring"
+    ];
+    time = {
+      from = "now-6h";
+      to = "now";
+    };
+    timezone = "browser";
+    title = "System Overview";
+    uid = "system-overview";
+
+    annotations.list = [
+      {
+        name = "Jellyfin Streams";
+        datasource = {
+          type = "grafana";
+          uid = "-- Grafana --";
+        };
+        enable = true;
+        iconColor = "green";
+        showIn = 0;
+        type = "tags";
+        tags = [ "jellyfin" ];
+      }
+      {
+        name = "ZFS Scrubs";
+        datasource = {
+          type = "grafana";
+          uid = "-- Grafana --";
+        };
+        enable = true;
+        iconColor = "orange";
+        showIn = 0;
+        type = "tags";
+        tags = [ "zfs-scrub" ];
+      }
+      {
+        name = "LLM Requests";
+        datasource = promDs;
+        enable = true;
+        iconColor = "purple";
+        target = {
+          datasource = promDs;
+          expr = "llamacpp:requests_processing > 0";
+          instant = false;
+          range = true;
+          refId = "A";
+        };
+        titleFormat = "LLM inference";
+      }
+    ];
+
+    panels = [
+      # -- Row 1: UPS --
+      {
+        id = 1;
+        type = "timeseries";
+        title = "UPS Power Draw";
+        gridPos = {
+          h = 8;
+          w = 8;
+          x = 0;
+          y = 0;
+        };
+        datasource = promDs;
+        targets = [
+          {
+            datasource = promDs;
+            expr = "apcupsd_ups_load_percent / 100 * apcupsd_nominal_power_watts + 4.5";
+            legendFormat = "Power (W)";
+            refId = "A";
+          }
+          {
+            datasource = promDs;
+            expr = "avg_over_time((apcupsd_ups_load_percent / 100 * apcupsd_nominal_power_watts + 4.5)[5m:])";
+            legendFormat = "5m average (W)";
+            refId = "B";
+          }
+        ];
+        fieldConfig = {
+          defaults = {
+            unit = "watt";
+            color.mode = "palette-classic";
+            custom = {
+              lineWidth = 2;
+              fillOpacity = 20;
+              spanNulls = true;
+            };
+          };
+          overrides = [
+            {
+              matcher = {
+                id = "byFrameRefID";
+                options = "A";
+              };
+              properties = [
+                {
+                  id = "custom.lineStyle";
+                  value = {
+                    fill = "dot";
+                  };
+                }
+                {
+                  id = "custom.fillOpacity";
+                  value = 10;
+                }
+                {
+                  id = "custom.lineWidth";
+                  value = 1;
+                }
+                {
+                  id = "custom.pointSize";
+                  value = 1;
+                }
+              ];
+            }
+            {
+              matcher = {
+                id = "byFrameRefID";
+                options = "B";
+              };
+              properties = [
+                {
+                  id = "custom.lineWidth";
+                  value = 4;
+                }
+                {
+                  id = "custom.fillOpacity";
+                  value = 0;
+                }
+              ];
+            }
+          ];
+        };
+      }
+      {
+        id = 7;
+        type = "stat";
+        title = "Energy Usage (24h)";
+        gridPos = {
+          h = 8;
+          w = 4;
+          x = 8;
+          y = 0;
+        };
+        datasource = promDs;
+        targets = [
+          {
+            datasource = promDs;
+            expr = "avg_over_time((apcupsd_ups_load_percent / 100 * apcupsd_nominal_power_watts + 4.5)[24h:]) * 24 / 1000";
+            legendFormat = "";
+            refId = "A";
+          }
+        ];
+        fieldConfig = {
+          defaults = {
+            unit = "kwatth";
+            decimals = 2;
+            thresholds = {
+              mode = "absolute";
+              steps = [
+                {
+                  color = "green";
+                  value = null;
+                }
+                {
+                  color = "yellow";
+                  value = 5;
+                }
+                {
+                  color = "red";
+                  value = 10;
+                }
+              ];
+            };
+          };
+          overrides = [ ];
+        };
+        options = {
+          reduceOptions = {
+            calcs = [ "lastNotNull" ];
+            fields = "";
+            values = false;
+          };
+          colorMode = "value";
+          graphMode = "none";
+        };
+      }
+      {
+        id = 2;
+        type = "gauge";
+        title = "UPS Load";
+        gridPos = {
+          h = 8;
+          w = 6;
+          x = 12;
+          y = 0;
+        };
+        datasource = promDs;
+        targets = [
+          {
+            datasource = promDs;
+            expr = "apcupsd_ups_load_percent";
+            refId = "A";
+          }
+        ];
+        fieldConfig = {
+          defaults = {
+            unit = "percent";
+            min = 0;
+            max = 100;
+            thresholds = {
+              mode = "absolute";
+              steps = [
+                {
+                  color = "green";
+                  value = null;
+                }
+                {
+                  color = "yellow";
+                  value = 70;
+                }
+                {
+                  color = "red";
+                  value = 90;
+                }
+              ];
+            };
+          };
+          overrides = [ ];
+        };
+        options.reduceOptions = {
+          calcs = [ "lastNotNull" ];
+          fields = "";
+          values = false;
+        };
+      }
+      {
+        id = 3;
+        type = "gauge";
+        title = "UPS Battery";
+        gridPos = {
+          h = 8;
+          w = 6;
+          x = 18;
+          y = 0;
+        };
+        datasource = promDs;
+        targets = [
+          {
+            datasource = promDs;
+            expr = "apcupsd_battery_charge_percent";
+            refId = "A";
+          }
+        ];
+        fieldConfig = {
+          defaults = {
+            unit = "percent";
+            min = 0;
+            max = 100;
+            thresholds = {
+              mode = "absolute";
+              steps = [
+                {
+                  color = "red";
+                  value = null;
+                }
+                {
+                  color = "yellow";
+                  value = 20;
+                }
+                {
+                  color = "green";
+                  value = 50;
+                }
+              ];
+            };
+          };
+          overrides = [ ];
+        };
+        options.reduceOptions = {
+          calcs = [ "lastNotNull" ];
+          fields = "";
+          values = false;
+        };
+      }
+
+      # -- Row 2: System --
+      {
+        id = 4;
+        type = "timeseries";
+        title = "CPU Temperature";
+        gridPos = {
+          h = 8;
+          w = 12;
+          x = 0;
+          y = 8;
+        };
+        datasource = promDs;
+        targets = [
+          {
+            datasource = promDs;
+            expr = ''node_hwmon_temp_celsius{chip=~"pci.*"}'';
+            legendFormat = "CPU {{sensor}}";
+            refId = "A";
+          }
+        ];
+        fieldConfig = {
+          defaults = {
+            unit = "celsius";
+            color.mode = "palette-classic";
+            custom = {
+              lineWidth = 2;
+              fillOpacity = 10;
+              spanNulls = true;
+            };
+          };
+          overrides = [ ];
+        };
+      }
+      {
+        id = 5;
+        type = "stat";
+        title = "System Uptime";
+        gridPos = {
+          h = 8;
+          w = 6;
+          x = 12;
+          y = 8;
+        };
+        datasource = promDs;
+        targets = [
+          {
+            datasource = promDs;
+            expr = "time() - node_boot_time_seconds";
+            refId = "A";
+          }
+        ];
+        fieldConfig = {
+          defaults = {
+            unit = "s";
+            thresholds = {
+              mode = "absolute";
+              steps = [
+                {
+                  color = "green";
+                  value = null;
+                }
+              ];
+            };
+          };
+          overrides = [ ];
+        };
+        options = {
+          reduceOptions = {
+            calcs = [ "lastNotNull" ];
+            fields = "";
+            values = false;
+          };
+          colorMode = "value";
+          graphMode = "none";
+        };
+      }
+      {
+        id = 6;
+        type = "stat";
+        title = "Jellyfin Active Streams";
+        gridPos = {
+          h = 8;
+          w = 6;
+          x = 18;
+          y = 8;
+        };
+        datasource = promDs;
+        targets = [
+          {
+            datasource = promDs;
+            expr = "count(jellyfin_now_playing_state) or vector(0)";
+            refId = "A";
+          }
+        ];
+        fieldConfig = {
+          defaults = {
+            thresholds = {
+              mode = "absolute";
+              steps = [
+                {
+                  color = "green";
+                  value = null;
+                }
+                {
+                  color = "yellow";
+                  value = 3;
+                }
+                {
+                  color = "red";
+                  value = 6;
+                }
+              ];
+            };
+          };
+          overrides = [ ];
+        };
+        options = {
+          reduceOptions = {
+            calcs = [ "lastNotNull" ];
+            fields = "";
+            values = false;
+          };
+          colorMode = "value";
+          graphMode = "area";
+        };
+      }
+
+      # -- Row 3: qBittorrent --
+      {
+        id = 11;
+        type = "timeseries";
+        title = "qBittorrent Speed";
+        gridPos = {
+          h = 8;
+          w = 24;
+          x = 0;
+          y = 16;
+        };
+        datasource = promDs;
+        targets = [
+          {
+            datasource = promDs;
+            expr = "sum(qbit_dlspeed) or vector(0)";
+            legendFormat = "Download";
+            refId = "A";
+          }
+          {
+            datasource = promDs;
+            expr = "sum(qbit_upspeed) or vector(0)";
+            legendFormat = "Upload";
+            refId = "B";
+          }
+          {
+            datasource = promDs;
+            expr = "avg_over_time((sum(qbit_dlspeed) or vector(0))[10m:])";
+            legendFormat = "Download (10m avg)";
+            refId = "C";
+          }
+          {
+            datasource = promDs;
+            expr = "avg_over_time((sum(qbit_upspeed) or vector(0))[10m:])";
+            legendFormat = "Upload (10m avg)";
+            refId = "D";
+          }
+        ];
+        fieldConfig = {
+          defaults = {
+            unit = "binBps";
+            min = 0;
+            color.mode = "palette-classic";
+            custom = {
+              lineWidth = 1;
+              fillOpacity = 10;
+              spanNulls = true;
+            };
+          };
+          overrides = [
+            {
+              matcher = {
+                id = "byFrameRefID";
+                options = "A";
+              };
+              properties = [
+                {
+                  id = "color";
+                  value = {
+                    fixedColor = "green";
+                    mode = "fixed";
+                  };
+                }
+                {
+                  id = "custom.fillOpacity";
+                  value = 5;
+                }
+              ];
+            }
+            {
+              matcher = {
+                id = "byFrameRefID";
+                options = "B";
+              };
+              properties = [
+                {
+                  id = "color";
+                  value = {
+                    fixedColor = "blue";
+                    mode = "fixed";
+                  };
+                }
+                {
+                  id = "custom.fillOpacity";
+                  value = 5;
+                }
+              ];
+            }
+            {
+              matcher = {
+                id = "byFrameRefID";
+                options = "C";
+              };
+              properties = [
+                {
+                  id = "color";
+                  value = {
+                    fixedColor = "green";
+                    mode = "fixed";
+                  };
+                }
+                {
+                  id = "custom.lineWidth";
+                  value = 3;
+                }
+                {
+                  id = "custom.fillOpacity";
+                  value = 0;
+                }
+              ];
+            }
+            {
+              matcher = {
+                id = "byFrameRefID";
+                options = "D";
+              };
+              properties = [
+                {
+                  id = "color";
+                  value = {
+                    fixedColor = "blue";
+                    mode = "fixed";
+                  };
+                }
+                {
+                  id = "custom.lineWidth";
+                  value = 3;
+                }
+                {
+                  id = "custom.fillOpacity";
+                  value = 0;
+                }
+              ];
+            }
+          ];
+        };
+      }
+
+      # -- Row 4: Intel GPU --
+      {
+        id = 8;
+        type = "timeseries";
+        title = "Intel GPU Utilization";
+        gridPos = {
+          h = 8;
+          w = 24;
+          x = 0;
+          y = 24;
+        };
+        datasource = promDs;
+        targets = [
+          {
+            datasource = promDs;
+            expr = "igpu_engines_busy_percent";
+            legendFormat = "{{engine}}";
+            refId = "A";
+          }
+        ];
+        fieldConfig = {
+          defaults = {
+            unit = "percent";
+            min = 0;
+            max = 100;
+            color.mode = "palette-classic";
+            custom = {
+              lineWidth = 2;
+              fillOpacity = 10;
+              spanNulls = true;
+            };
+          };
+          overrides = [ ];
+        };
+      }
+
+      # -- Row 5: Storage --
+      {
+        id = 12;
+        type = "timeseries";
+        title = "ZFS Pool Utilization";
+        gridPos = {
+          h = 8;
+          w = 12;
+          x = 0;
+          y = 32;
+        };
+        datasource = promDs;
+        targets = [
+          {
+            datasource = promDs;
+            expr = "zfs_pool_allocated_bytes{pool=\"tank\"} / zfs_pool_size_bytes{pool=\"tank\"} * 100";
+            legendFormat = "tank";
+            refId = "A";
+          }
+          {
+            datasource = promDs;
+            expr = "zfs_pool_allocated_bytes{pool=\"hdds\"} / zfs_pool_size_bytes{pool=\"hdds\"} * 100";
+            legendFormat = "hdds";
+            refId = "B";
+          }
+        ];
+        fieldConfig = {
+          defaults = {
+            unit = "percent";
+            min = 0;
+            max = 100;
+            color.mode = "palette-classic";
+            custom = {
+              lineWidth = 2;
+              fillOpacity = 20;
+              spanNulls = true;
+            };
+          };
+          overrides = [ ];
+        };
+      }
+      {
+        id = 13;
+        type = "timeseries";
+        title = "Boot Drive Partitions";
+        gridPos = {
+          h = 8;
+          w = 12;
+          x = 12;
+          y = 32;
+        };
+        datasource = promDs;
+        targets = [
+          {
+            datasource = promDs;
+            expr = "(node_filesystem_size_bytes{mountpoint=\"/boot\"} - node_filesystem_avail_bytes{mountpoint=\"/boot\"}) / node_filesystem_size_bytes{mountpoint=\"/boot\"} * 100";
+            legendFormat = "/boot";
+            refId = "A";
+          }
+          {
+            datasource = promDs;
+            expr = "(node_filesystem_size_bytes{mountpoint=\"/persistent\"} - node_filesystem_avail_bytes{mountpoint=\"/persistent\"}) / node_filesystem_size_bytes{mountpoint=\"/persistent\"} * 100";
+            legendFormat = "/persistent";
+            refId = "B";
+          }
+          {
+            datasource = promDs;
+            expr = "(node_filesystem_size_bytes{mountpoint=\"/nix\"} - node_filesystem_avail_bytes{mountpoint=\"/nix\"}) / node_filesystem_size_bytes{mountpoint=\"/nix\"} * 100";
+            legendFormat = "/nix";
+            refId = "C";
+          }
+        ];
+        fieldConfig = {
+          defaults = {
+            unit = "percent";
+            min = 0;
+            max = 100;
+            color.mode = "palette-classic";
+            custom = {
+              lineWidth = 2;
+              fillOpacity = 20;
+              spanNulls = true;
+            };
+          };
+          overrides = [ ];
+        };
+      }
+    ];
+  };
+in
+{
+  environment.etc."grafana-dashboards/system-overview.json" = {
+    text = builtins.toJSON dashboard;
+    mode = "0444";
+  };
+}
--- a/services/grafana/default.nix
+++ b/services/grafana/default.nix
@@ -0,0 +1,10 @@
+{
+  imports = [
+    ./grafana.nix
+    ./prometheus.nix
+    ./dashboard.nix
+    ./exporters.nix
+    ./jellyfin-annotations.nix
+    ./zfs-scrub-annotations.nix
+  ];
+}
--- a/services/grafana/exporters.nix
+++ b/services/grafana/exporters.nix
@@ -0,0 +1,112 @@
+{
+  config,
+  pkgs,
+  inputs,
+  service_configs,
+  lib,
+  ...
+}:
+let
+  jellyfinExporterPort = service_configs.ports.private.jellyfin_exporter.port;
+  qbitExporterPort = service_configs.ports.private.qbittorrent_exporter.port;
+  igpuExporterPort = service_configs.ports.private.igpu_exporter.port;
+in
+{
+  # -- Jellyfin Prometheus Exporter --
+  # Replaces custom jellyfin-collector.nix textfile timer.
+  # Exposes per-session metrics (jellyfin_now_playing_state) and library stats.
+  systemd.services.jellyfin-exporter =
+    lib.mkIf (config.services.grafana.enable && config.services.jellyfin.enable)
+      {
+        description = "Prometheus exporter for Jellyfin";
+        after = [
+          "network.target"
+          "jellyfin.service"
+        ];
+        wantedBy = [ "multi-user.target" ];
+        serviceConfig = {
+          ExecStart = lib.getExe (
+            pkgs.writeShellApplication {
+              name = "jellyfin-exporter-wrapper";
+              runtimeInputs = [ pkgs.jellyfin-exporter ];
+              text = ''
+                exec jellyfin_exporter \
+                  --jellyfin.address=http://127.0.0.1:${toString service_configs.ports.private.jellyfin.port} \
+                  --jellyfin.token="$(cat "$CREDENTIALS_DIRECTORY/jellyfin-api-key")" \
+                  --web.listen-address=127.0.0.1:${toString jellyfinExporterPort}
+              '';
+            }
+          );
+          Restart = "on-failure";
+          RestartSec = "10s";
+          DynamicUser = true;
+          NoNewPrivileges = true;
+          ProtectSystem = "strict";
+          ProtectHome = true;
+          PrivateTmp = true;
+          MemoryDenyWriteExecute = true;
+          LoadCredential = "jellyfin-api-key:${config.age.secrets.jellyfin-api-key.path}";
+        };
+      };
+
+  # -- qBittorrent Prometheus Exporter --
+  # Replaces custom qbittorrent-collector.nix textfile timer.
+  # Exposes per-torrent metrics (qbit_dlspeed, qbit_upspeed) and aggregate stats.
+  # qBittorrent runs in a VPN namespace; the exporter reaches it via namespace address.
+  systemd.services.qbittorrent-exporter =
+    lib.mkIf (config.services.grafana.enable && config.services.qbittorrent.enable)
+      {
+        description = "Prometheus exporter for qBittorrent";
+        after = [
+          "network.target"
+          "qbittorrent.service"
+        ];
+        wantedBy = [ "multi-user.target" ];
+        serviceConfig = {
+          ExecStart =
+            lib.getExe' inputs.qbittorrent-metrics-exporter.packages.${pkgs.stdenv.hostPlatform.system}.default
+              "qbittorrent-metrics-exporter";
+          Restart = "on-failure";
+          RestartSec = "10s";
+          DynamicUser = true;
+          NoNewPrivileges = true;
+          ProtectSystem = "strict";
+          ProtectHome = true;
+          PrivateTmp = true;
+        };
+        environment = {
+          HOST = "127.0.0.1";
+          PORT = toString qbitExporterPort;
+          SCRAPE_INTERVAL = "15";
+          BACKEND = "in-memory";
+          # qBittorrent has AuthSubnetWhitelist=0.0.0.0/0, so no real password needed.
+          # The exporter still expects the env var to be set.
+          QBITTORRENT_PASSWORD = "unused";
+          QBITTORRENT_USERNAME = "admin";
+          TORRENT_HOSTS = "qbit:main=http://${config.vpnNamespaces.wg.namespaceAddress}:${toString config.services.qbittorrent.webuiPort}|http://${config.vpnNamespaces.wg.namespaceAddress}:${toString config.services.qbittorrent.webuiPort}";
+          RUST_LOG = "warn";
+        };
+      };
+
+  # -- Intel GPU Prometheus Exporter --
+  # Replaces custom intel-gpu-collector.nix + intel-gpu-collector.py textfile timer.
+  # Exposes engine busy%, frequency, and RC6 metrics via /metrics.
+  # Requires privileged access to GPU debug interfaces (intel_gpu_top).
+  systemd.services.igpu-exporter = lib.mkIf config.services.grafana.enable {
+    description = "Prometheus exporter for Intel integrated GPU";
+    wantedBy = [ "multi-user.target" ];
+    path = [ pkgs.intel-gpu-tools ];
+    serviceConfig = {
+      ExecStart = lib.getExe pkgs.igpu-exporter;
+      Restart = "on-failure";
+      RestartSec = "10s";
+      # Runs as root (no DynamicUser): intel_gpu_top needs privileged access to GPU debug interfaces
+      ProtectHome = true;
+      PrivateTmp = true;
+    };
+    environment = {
+      PORT = toString igpuExporterPort;
+      REFRESH_PERIOD_MS = "30000";
+    };
+  };
+}
--- a/services/grafana/grafana.nix
+++ b/services/grafana/grafana.nix
@@ -0,0 +1,103 @@
+{
+  config,
+  service_configs,
+  lib,
+  ...
+}:
+{
+  imports = [
+    (lib.serviceMountWithZpool "grafana" service_configs.zpool_ssds [
+      service_configs.grafana.dir
+    ])
+    (lib.serviceFilePerms "grafana" [
+      "Z ${service_configs.grafana.dir} 0700 grafana grafana"
+    ])
+    (lib.mkCaddyReverseProxy {
+      domain = service_configs.grafana.domain;
+      port = service_configs.ports.private.grafana.port;
+      auth = true;
+    })
+  ];
+
+  services.grafana = {
+    enable = true;
+    dataDir = service_configs.grafana.dir;
+
+    settings = {
+      server = {
+        http_addr = "127.0.0.1";
+        http_port = service_configs.ports.private.grafana.port;
+        domain = service_configs.grafana.domain;
+        root_url = "https://${service_configs.grafana.domain}";
+      };
+
+      database = {
+        type = "postgres";
+        host = service_configs.postgres.socket;
+        user = "grafana";
+      };
+
+      "auth.anonymous" = {
+        enabled = true;
+        org_role = "Admin";
+      };
+      "auth.basic".enabled = false;
+      "auth".disable_login_form = true;
+
+      analytics.reporting_enabled = false;
+
+      feature_toggles.dataConnectionsConsole = false;
+
+      users.default_theme = "dark";
+
+      # Disable unused built-in integrations
+      alerting.enabled = false;
+      "unified_alerting".enabled = false;
+      explore.enabled = false;
+      news.news_feed_enabled = false;
+
+      plugins = {
+        enable_alpha = false;
+        plugin_admin_enabled = false;
+      };
+    };
+
+    provision = {
+      datasources.settings = {
+        apiVersion = 1;
+        datasources = [
+          {
+            name = "Prometheus";
+            type = "prometheus";
+            url = "http://127.0.0.1:${toString service_configs.ports.private.prometheus.port}";
+            access = "proxy";
+            isDefault = true;
+            editable = false;
+            uid = "prometheus";
+          }
+        ];
+      };
+
+      dashboards.settings.providers = [
+        {
+          name = "system";
+          type = "file";
+          options.path = "/etc/grafana-dashboards";
+          disableDeletion = true;
+          updateIntervalSeconds = 60;
+        }
+      ];
+    };
+  };
+
+  services.postgresql = {
+    ensureDatabases = [ "grafana" ];
+    ensureUsers = [
+      {
+        name = "grafana";
+        ensureDBOwnership = true;
+        ensureClauses.login = true;
+      }
+    ];
+  };
+}
--- a/services/grafana/jellyfin-annotations.nix
+++ b/services/grafana/jellyfin-annotations.nix
@@ -0,0 +1,18 @@
+{
+  config,
+  service_configs,
+  lib,
+  ...
+}:
+lib.mkIf (config.services.grafana.enable && config.services.jellyfin.enable) (
+  lib.mkGrafanaAnnotationService {
+    name = "jellyfin";
+    description = "Jellyfin stream annotation service for Grafana";
+    script = ./jellyfin-annotations.py;
+    environment = {
+      JELLYFIN_URL = "http://127.0.0.1:${toString service_configs.ports.private.jellyfin.port}";
+      POLL_INTERVAL = "30";
+    };
+    loadCredential = "jellyfin-api-key:${config.age.secrets.jellyfin-api-key.path}";
+  }
+)
--- a/services/grafana/jellyfin-annotations.py
+++ b/services/grafana/jellyfin-annotations.py
@@ -0,0 +1,233 @@
+#!/usr/bin/env python3
+import json
+import os
+import sys
+import time
+import urllib.request
+from pathlib import Path
+
+JELLYFIN_URL = os.environ.get("JELLYFIN_URL", "http://127.0.0.1:8096")
+GRAFANA_URL = os.environ.get("GRAFANA_URL", "http://127.0.0.1:3000")
+STATE_FILE = os.environ.get("STATE_FILE", "/var/lib/jellyfin-annotations/state.json")
+POLL_INTERVAL = int(os.environ.get("POLL_INTERVAL", "30"))
+
+
+def get_api_key():
+    cred_dir = os.environ.get("CREDENTIALS_DIRECTORY")
+    if cred_dir:
+        return Path(cred_dir, "jellyfin-api-key").read_text().strip()
+    for p in ["/run/agenix/jellyfin-api-key"]:
+        if Path(p).exists():
+            return Path(p).read_text().strip()
+    sys.exit("ERROR: Cannot find jellyfin-api-key")
+
+
+def http_json(method, url, body=None):
+    data = json.dumps(body).encode() if body is not None else None
+    req = urllib.request.Request(
+        url,
+        data=data,
+        headers={"Content-Type": "application/json", "Accept": "application/json"},
+        method=method,
+    )
+    with urllib.request.urlopen(req, timeout=5) as resp:
+        return json.loads(resp.read())
+
+
+def get_active_sessions(api_key):
+    try:
+        req = urllib.request.Request(
+            f"{JELLYFIN_URL}/Sessions?api_key={api_key}",
+            headers={"Accept": "application/json"},
+        )
+        with urllib.request.urlopen(req, timeout=5) as resp:
+            sessions = json.loads(resp.read())
+        return [s for s in sessions if s.get("NowPlayingItem")]
+    except Exception as e:
+        print(f"Error fetching sessions: {e}", file=sys.stderr)
+        return None
+
+
+def _codec(name):
+    if not name:
+        return ""
+    aliases = {"h264": "H.264", "h265": "H.265", "hevc": "H.265", "av1": "AV1",
+               "vp9": "VP9", "vp8": "VP8", "mpeg4": "MPEG-4", "mpeg2video": "MPEG-2",
+               "aac": "AAC", "ac3": "AC3", "eac3": "EAC3", "dts": "DTS",
+               "truehd": "TrueHD", "mp3": "MP3", "opus": "Opus", "flac": "FLAC",
+               "vorbis": "Vorbis"}
+    return aliases.get(name.lower(), name.upper())
+
+
+def _res(width, height):
+    if not height:
+        return ""
+    common = {2160: "4K", 1440: "1440p", 1080: "1080p", 720: "720p",
+              480: "480p", 360: "360p"}
+    return common.get(height, f"{height}p")
+
+
+def _channels(n):
+    labels = {1: "Mono", 2: "Stereo", 6: "5.1", 7: "6.1", 8: "7.1"}
+    return labels.get(n, f"{n}ch") if n else ""
+
+
+def format_label(session):
+    user = session.get("UserName", "Unknown")
+    item = session.get("NowPlayingItem", {}) or {}
+    transcode = session.get("TranscodingInfo") or {}
+    play_state = session.get("PlayState") or {}
+    client = session.get("Client", "")
+    device = session.get("DeviceName", "")
+
+    name = item.get("Name", "Unknown")
+    series = item.get("SeriesName", "")
+    season = item.get("ParentIndexNumber")
+    episode = item.get("IndexNumber")
+    media_type = item.get("Type", "")
+
+    if series and season is not None and episode is not None:
+        title = f"{series} S{season:02d}E{episode:02d} \u2013 {name}"
+    elif series:
+        title = f"{series} \u2013 {name}"
+    elif media_type == "Movie":
+        title = f"{name} (movie)"
+    else:
+        title = name
+
+    play_method = play_state.get("PlayMethod", "")
+    if play_method == "DirectPlay":
+        method = "Direct Play"
+    elif play_method == "DirectStream":
+        method = "Direct Stream"
+    elif play_method == "Transcode" or transcode:
+        method = "Transcode"
+    else:
+        method = "Direct Play"
+
+    media_streams = item.get("MediaStreams") or []
+    video_streams = [s for s in media_streams if s.get("Type") == "Video"]
+    audio_streams = [s for s in media_streams if s.get("Type") == "Audio"]
+    default_audio = next((s for s in audio_streams if s.get("IsDefault")), None)
+    audio_stream = default_audio or (audio_streams[0] if audio_streams else {})
+    video_stream = video_streams[0] if video_streams else {}
+
+    src_vcodec = _codec(video_stream.get("Codec", ""))
+    src_res = _res(video_stream.get("Width") or item.get("Width"),
+                   video_stream.get("Height") or item.get("Height"))
+    src_acodec = _codec(audio_stream.get("Codec", ""))
+    src_channels = _channels(audio_stream.get("Channels"))
+
+    is_video_direct = transcode.get("IsVideoDirect", True)
+    is_audio_direct = transcode.get("IsAudioDirect", True)
+
+    if transcode and not is_video_direct:
+        dst_vcodec = _codec(transcode.get("VideoCodec", ""))
+        dst_res = _res(transcode.get("Width"), transcode.get("Height")) or src_res
+        if src_vcodec and dst_vcodec and src_vcodec != dst_vcodec:
+            video_part = f"{src_vcodec}\u2192{dst_vcodec} {dst_res}".strip()
+        else:
+            video_part = f"{dst_vcodec or src_vcodec} {dst_res}".strip()
+    else:
+        video_part = f"{src_vcodec} {src_res}".strip()
+
+    if transcode and not is_audio_direct:
+        dst_acodec = _codec(transcode.get("AudioCodec", ""))
+        dst_channels = _channels(transcode.get("AudioChannels")) or src_channels
+        if src_acodec and dst_acodec and src_acodec != dst_acodec:
+            audio_part = f"{src_acodec}\u2192{dst_acodec} {dst_channels}".strip()
+        else:
+            audio_part = f"{dst_acodec or src_acodec} {dst_channels}".strip()
+    else:
+        audio_part = f"{src_acodec} {src_channels}".strip()
+
+    bitrate = transcode.get("Bitrate") or item.get("Bitrate")
+    bitrate_part = f"{bitrate / 1_000_000:.1f} Mbps" if bitrate else ""
+
+    reasons = transcode.get("TranscodeReasons") or []
+    reason_part = f"[{', '.join(reasons)}]" if reasons else ""
+
+    stream_parts = [p for p in [method, video_part, audio_part, bitrate_part, reason_part] if p]
+    client_str = " \u00b7 ".join(filter(None, [client, device]))
+
+    lines = [f"{user}: {title}", " | ".join(stream_parts)]
+    if client_str:
+        lines.append(client_str)
+
+    return "\n".join(lines)
+
+
+def load_state():
+    try:
+        with open(STATE_FILE) as f:
+            return json.load(f)
+    except (FileNotFoundError, json.JSONDecodeError):
+        return {}
+
+
+def save_state(state):
+    os.makedirs(os.path.dirname(STATE_FILE), exist_ok=True)
+    tmp = STATE_FILE + ".tmp"
+    with open(tmp, "w") as f:
+        json.dump(state, f)
+    os.replace(tmp, STATE_FILE)
+
+
+def grafana_post(label, start_ms):
+    try:
+        result = http_json(
+            "POST",
+            f"{GRAFANA_URL}/api/annotations",
+            {"time": start_ms, "text": label, "tags": ["jellyfin"]},
+        )
+        return result.get("id")
+    except Exception as e:
+        print(f"Error posting annotation: {e}", file=sys.stderr)
+        return None
+
+
+def grafana_close(grafana_id, end_ms):
+    try:
+        http_json(
+            "PATCH",
+            f"{GRAFANA_URL}/api/annotations/{grafana_id}",
+            {"timeEnd": end_ms},
+        )
+    except Exception as e:
+        print(f"Error closing annotation {grafana_id}: {e}", file=sys.stderr)
+
+
+def main():
+    api_key = get_api_key()
+    state = load_state()
+
+    while True:
+        now_ms = int(time.time() * 1000)
+        sessions = get_active_sessions(api_key)
+
+        if sessions is not None:
+            current_ids = {s["Id"] for s in sessions}
+
+            for s in sessions:
+                sid = s["Id"]
+                if sid not in state:
+                    label = format_label(s)
+                    grafana_id = grafana_post(label, now_ms)
+                    if grafana_id is not None:
+                        state[sid] = {
+                            "grafana_id": grafana_id,
+                            "label": label,
+                            "start_ms": now_ms,
+                        }
+                        save_state(state)
+
+            for sid in [k for k in state if k not in current_ids]:
+                info = state.pop(sid)
+                grafana_close(info["grafana_id"], now_ms)
+                save_state(state)
+
+        time.sleep(POLL_INTERVAL)
+
+
+if __name__ == "__main__":
+    main()
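The polling loop in main() is a set difference between the session IDs Jellyfin reports and the IDs already recorded in the state file. Below is a minimal sketch of that reconciliation step, pulled out so it can run without a server. `reconcile` and its two callback parameters are hypothetical names used only for illustration; the Grafana calls are replaced by plain callables.

```python
def reconcile(state, sessions, now_ms, open_annotation, close_annotation):
    """One polling pass: open annotations for new sessions, close vanished ones.

    `state` maps session id -> {"grafana_id", "start_ms"} and is mutated in place.
    `open_annotation(session, now_ms)` returns an annotation id, or None on failure.
    `close_annotation(grafana_id, now_ms)` returns nothing.
    Both callbacks are hypothetical stand-ins for the script's Grafana helpers.
    """
    current_ids = {s["Id"] for s in sessions}

    # New sessions: only record them once the annotation was actually created,
    # so a failed POST is retried on the next pass.
    for s in sessions:
        sid = s["Id"]
        if sid in state:
            continue
        grafana_id = open_annotation(s, now_ms)
        if grafana_id is not None:
            state[sid] = {"grafana_id": grafana_id, "start_ms": now_ms}

    # Sessions that disappeared since the last pass: close their annotations.
    for sid in [k for k in state if k not in current_ids]:
        info = state.pop(sid)
        close_annotation(info["grafana_id"], now_ms)

    return state
```

Because a session is stored only after its annotation exists, a Grafana outage at playback start delays the annotation rather than losing it.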
--- a/services/grafana/prometheus.nix
+++ b/services/grafana/prometheus.nix
@@ -0,0 +1,110 @@
+{
+  service_configs,
+  lib,
+  ...
+}:
+let
+  textfileDir = "/var/lib/prometheus-node-exporter-textfiles";
+in
+{
+  imports = [
+    (lib.serviceMountWithZpool "prometheus" service_configs.zpool_ssds [
+      "/var/lib/prometheus"
+    ])
+    (lib.serviceFilePerms "prometheus" [
+      "Z /var/lib/prometheus 0700 prometheus prometheus"
+    ])
+  ];
+
+  services.prometheus = {
+    enable = true;
+    port = service_configs.ports.private.prometheus.port;
+    listenAddress = "127.0.0.1";
+    stateDir = "prometheus";
+    retentionTime = "100y"; # effectively forever; "0d" with no size limit falls back to Prometheus' 15d default
+
+    exporters = {
+      node = {
+        enable = true;
+        port = service_configs.ports.private.prometheus_node.port;
+        listenAddress = "127.0.0.1";
+        enabledCollectors = [
+          "hwmon"
+          "systemd"
+          "textfile"
+        ];
+        extraFlags = [
+          "--collector.textfile.directory=${textfileDir}"
+        ];
+      };
+
+      apcupsd = {
+        enable = true;
+        port = service_configs.ports.private.prometheus_apcupsd.port;
+        listenAddress = "127.0.0.1";
+        apcupsdAddress = "127.0.0.1:3551";
+      };
+
+      zfs = {
+        enable = true;
+        port = service_configs.ports.private.prometheus_zfs.port;
+        listenAddress = "127.0.0.1";
+      };
+    };
+
+    scrapeConfigs = [
+      {
+        job_name = "prometheus";
+        static_configs = [
+          { targets = [ "127.0.0.1:${toString service_configs.ports.private.prometheus.port}" ]; }
+        ];
+      }
+      {
+        job_name = "node";
+        static_configs = [
+          { targets = [ "127.0.0.1:${toString service_configs.ports.private.prometheus_node.port}" ]; }
+        ];
+      }
+      {
+        job_name = "apcupsd";
+        static_configs = [
+          { targets = [ "127.0.0.1:${toString service_configs.ports.private.prometheus_apcupsd.port}" ]; }
+        ];
+      }
+      {
+        job_name = "llama-cpp";
+        static_configs = [
+          { targets = [ "127.0.0.1:${toString service_configs.ports.private.llama_cpp.port}" ]; }
+        ];
+      }
+      {
+        job_name = "jellyfin";
+        static_configs = [
+          { targets = [ "127.0.0.1:${toString service_configs.ports.private.jellyfin_exporter.port}" ]; }
+        ];
+      }
+      {
+        job_name = "qbittorrent";
+        static_configs = [
+          { targets = [ "127.0.0.1:${toString service_configs.ports.private.qbittorrent_exporter.port}" ]; }
+        ];
+      }
+      {
+        job_name = "igpu";
+        static_configs = [
+          { targets = [ "127.0.0.1:${toString service_configs.ports.private.igpu_exporter.port}" ]; }
+        ];
+      }
+      {
+        job_name = "zfs";
+        static_configs = [
+          { targets = [ "127.0.0.1:${toString service_configs.ports.private.prometheus_zfs.port}" ]; }
+        ];
+      }
+    ];
+  };
+
+  systemd.tmpfiles.rules = [
+    "d ${textfileDir} 0755 root root -"
+  ];
+}
--- a/services/grafana/zfs-scrub-annotations.nix
+++ b/services/grafana/zfs-scrub-annotations.nix
@@ -0,0 +1,36 @@
+{
+  config,
+  pkgs,
+  service_configs,
+  lib,
+  ...
+}:
+let
+  grafanaUrl = "http://127.0.0.1:${toString service_configs.ports.private.grafana.port}";
+
+  script = pkgs.writeShellApplication {
+    name = "zfs-scrub-annotations";
+    runtimeInputs = with pkgs; [
+      curl
+      jq
+      coreutils
+      gnugrep
+      gnused
+      config.boot.zfs.package
+    ];
+    text = builtins.readFile ./zfs-scrub-annotations.sh;
+  };
+in
+lib.mkIf (config.services.grafana.enable && config.services.zfs.autoScrub.enable) {
+  systemd.services.zfs-scrub = {
+    environment = {
+      GRAFANA_URL = grafanaUrl;
+      STATE_DIR = "/run/zfs-scrub-annotations";
+    };
+    serviceConfig = {
+      RuntimeDirectory = "zfs-scrub-annotations";
+      ExecStartPre = [ "-${lib.getExe script} start" ];
+      ExecStopPost = [ "${lib.getExe script} stop" ];
+    };
+  };
+}
--- a/services/grafana/zfs-scrub-annotations.sh
+++ b/services/grafana/zfs-scrub-annotations.sh
@@ -0,0 +1,55 @@
+#!/usr/bin/env bash
+# ZFS scrub annotation script for Grafana
+# Usage: zfs-scrub-annotations.sh {start|stop}
+# Required env: GRAFANA_URL, STATE_DIR
+# Required on PATH: zpool, curl, jq, paste, date, grep, sed
+
+set -euo pipefail
+
+ACTION="${1:-}"
+GRAFANA_URL="${GRAFANA_URL:?GRAFANA_URL required}"
+STATE_DIR="${STATE_DIR:?STATE_DIR required}"
+
+case "$ACTION" in
+  start)
+    POOLS=$(zpool list -H -o name | paste -sd ',' | sed 's/,/, /g')
+    NOW_MS=$(date +%s%3N)
+
+    RESPONSE=$(curl -sf --max-time 5 \
+      -X POST "$GRAFANA_URL/api/annotations" \
+      -H "Content-Type: application/json" \
+      -d "$(jq -n --arg text "ZFS scrub: $POOLS" --argjson time "$NOW_MS" \
+        '{time: $time, text: $text, tags: ["zfs-scrub"]}')" \
+    ) || exit 0
+
+    echo "$RESPONSE" | jq -r '.id' > "$STATE_DIR/annotation-id"
+    ;;
+
+  stop)
+    ANN_ID=$(cat "$STATE_DIR/annotation-id" 2>/dev/null) || exit 0
+    [ -z "$ANN_ID" ] && exit 0
+
+    NOW_MS=$(date +%s%3N)
+
+    RESULTS=""
+    while IFS= read -r pool; do
+      scan_line=$(zpool status "$pool" | grep "scan:" | sed 's/^[[:space:]]*//' || true)
+      RESULTS="${RESULTS}${pool}: ${scan_line}"$'\n'
+    done < <(zpool list -H -o name)
+
+    TEXT=$(printf "ZFS scrub completed\n%s" "$RESULTS")
+
+    curl -sf --max-time 5 \
+      -X PATCH "$GRAFANA_URL/api/annotations/$ANN_ID" \
+      -H "Content-Type: application/json" \
+      -d "$(jq -n --arg text "$TEXT" --argjson timeEnd "$NOW_MS" \
+        '{timeEnd: $timeEnd, text: $text}')" || true
+
+    rm -f "$STATE_DIR/annotation-id"
+    ;;
+
+  *)
+    echo "Usage: $0 {start|stop}" >&2
+    exit 1
+    ;;
+esac
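The `start` branch joins pool names with `paste -s`. One detail matters when editing that line: `-d` takes a list of single-character delimiters that paste cycles through, not one multi-character separator. The Python sketch below imitates that cycling. `paste_serial` is a hypothetical helper and the pool names are made up; backslash escapes in the delimiter list are not modelled.

```python
from itertools import cycle


def paste_serial(lines, delimiters):
    """Join lines roughly the way `paste -s -d LIST` does.

    Each character of `delimiters` is used in turn, wrapping around,
    to replace the newline between consecutive input lines.
    """
    if not delimiters:
        raise ValueError("this sketch assumes a non-empty delimiter list")
    if not lines:
        return ""
    delims = cycle(delimiters)
    out = [lines[0]]
    for line in lines[1:]:
        out.append(next(delims))
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    pools = ["tank", "ssd", "backup"]  # hypothetical pool names
    print(paste_serial(pools, ", "))   # two delimiters, used alternately
    print(paste_serial(pools, ","))    # one delimiter, used every time
```

With the two-character list the separators alternate between a comma and a space, so a uniform ", " separator needs a single-character delimiter plus a later substitution.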
--- a/services/harmonia.nix
+++ b/services/harmonia.nix
@@ -0,0 +1,38 @@
+{
+  config,
+  lib,
+  service_configs,
+  ...
+}:
+{
+  imports = [
+    (lib.serviceFilePerms "harmonia" [
+      "Z /run/agenix/harmonia-sign-key 0400 harmonia harmonia"
+    ])
+  ];
+
+  services.harmonia = {
+    enable = true;
+    signKeyPaths = [ config.age.secrets.harmonia-sign-key.path ];
+    settings.bind = "127.0.0.1:${toString service_configs.ports.private.harmonia.port}";
+  };
+
+  # serve latest deploy store paths (unauthenticated — just a path string)
+  # CI writes to /var/lib/dotfiles-deploy/<hostname> after building
+  services.caddy.virtualHosts."nix-cache.${service_configs.https.domain}".extraConfig = ''
+    handle_path /deploy/* {
+        root * /var/lib/dotfiles-deploy
+        file_server
+    }
+
+    handle {
+        import ${config.age.secrets.nix-cache-auth.path}
+        reverse_proxy :${toString service_configs.ports.private.harmonia.port}
+    }
+  '';
+
+  # directory for CI to record latest deploy store paths
+  systemd.tmpfiles.rules = [
+    "d /var/lib/dotfiles-deploy 0755 gitea-runner gitea-runner"
+  ];
+}
--- a/services/immich.nix
+++ b/services/immich.nix
@@ -16,6 +16,15 @@
    (lib.serviceFilePerms "immich-server" [
      "Z ${config.services.immich.mediaLocation} 0770 ${config.services.immich.user} ${config.services.immich.group}"
    ])
+    (lib.mkCaddyReverseProxy {
+      subdomain = "immich";
+      port = service_configs.ports.private.immich.port;
+    })
+    (lib.mkFail2banJail {
+      name = "immich";
+      unitName = "immich-server.service";
+      failregex = "^.*Failed login attempt for user .* from ip address <HOST>.*$";
+    })
  ];

  services.immich = {
@@ -29,10 +38,6 @@
    };
  };

-  services.caddy.virtualHosts."immich.${service_configs.https.domain}".extraConfig = ''
-    reverse_proxy :${builtins.toString config.services.immich.port}
-  '';
-
  environment.systemPackages = with pkgs; [
    immich-go
  ];
@@ -42,18 +47,4 @@
    "render"
  ];

-  # Protect Immich login from brute force attacks
-  services.fail2ban.jails.immich = {
-    enabled = true;
-    settings = {
-      backend = "systemd";
-      port = "http,https";
-      # defaults: maxretry=5, findtime=10m, bantime=10m
-    };
-    filter.Definition = {
-      failregex = "^.*Failed login attempt for user .* from ip address <HOST>.*$";
-      ignoreregex = "";
-      journalmatch = "_SYSTEMD_UNIT=immich-server.service";
-    };
-  };
 }
--- a/services/jellyfin-qbittorrent-monitor.nix
+++ b/services/jellyfin-qbittorrent-monitor.nix
@@ -1,57 +0,0 @@
-{
-  pkgs,
-  service_configs,
-  config,
-  ...
-}:
-{
-  systemd.services."jellyfin-qbittorrent-monitor" = {
-    description = "Monitor Jellyfin streaming and control qBittorrent rate limits";
-    after = [
-      "network.target"
-      "jellyfin.service"
-      "qbittorrent.service"
-    ];
-    wantedBy = [ "multi-user.target" ];
-
-    serviceConfig = {
-      Type = "simple";
-      ExecStart = pkgs.writeShellScript "jellyfin-monitor-start" ''
-        export JELLYFIN_API_KEY=$(cat $CREDENTIALS_DIRECTORY/jellyfin-api-key)
-        exec ${
-          pkgs.python3.withPackages (ps: with ps; [ requests ])
-        }/bin/python ${./jellyfin-qbittorrent-monitor.py}
-      '';
-      Restart = "always";
-      RestartSec = "10s";
-
-      # Security hardening
-      DynamicUser = true;
-      NoNewPrivileges = true;
-      ProtectSystem = "strict";
-      ProtectHome = true;
-      ProtectKernelTunables = true;
-      ProtectKernelModules = true;
-      ProtectControlGroups = true;
-      MemoryDenyWriteExecute = true;
-      RestrictRealtime = true;
-      RestrictSUIDSGID = true;
-      RemoveIPC = true;
-
-      # Load credentials from agenix secrets
-      LoadCredential = "jellyfin-api-key:${config.age.secrets.jellyfin-api-key.path}";
-    };
-
-    environment = {
-      JELLYFIN_URL = "http://localhost:${builtins.toString service_configs.ports.private.jellyfin.port}";
-      QBITTORRENT_URL = "http://${config.vpnNamespaces.wg.namespaceAddress}:${builtins.toString service_configs.ports.private.torrent.port}";
-      CHECK_INTERVAL = "30";
-      # Bandwidth budget configuration
-      TOTAL_BANDWIDTH_BUDGET = "30000000"; # 30 Mbps in bits per second
-      SERVICE_BUFFER = "5000000"; # 5 Mbps reserved for other services (bps)
-      DEFAULT_STREAM_BITRATE = "10000000"; # 10 Mbps fallback when bitrate unknown (bps)
-      MIN_TORRENT_SPEED = "100"; # KB/s - below this, pause torrents instead
-      STREAM_BITRATE_HEADROOM = "1.1"; # multiplier per stream for bitrate fluctuations
-    };
-  };
-}
--- a/services/jellyfin/default.nix
+++ b/services/jellyfin/default.nix
@@ -0,0 +1,6 @@
+{
+  imports = [
+    ./jellyfin.nix
+    ./jellyfin-qbittorrent-monitor.nix
+  ];
+}
--- a/services/jellyfin/jellyfin-qbittorrent-monitor.nix
+++ b/services/jellyfin/jellyfin-qbittorrent-monitor.nix
@@ -0,0 +1,127 @@
+{
+  pkgs,
+  service_configs,
+  config,
+  lib,
+  ...
+}:
+let
+  webhookPlugin = import ./jellyfin-webhook-plugin.nix { inherit pkgs lib; };
+  jellyfinPort = service_configs.ports.private.jellyfin.port;
+  webhookPort = service_configs.ports.private.jellyfin_qbittorrent_monitor_webhook.port;
+in
+lib.mkIf config.services.jellyfin.enable {
+  # Materialise the Jellyfin Webhook plugin into Jellyfin's plugins dir before
+  # Jellyfin starts. Jellyfin rewrites meta.json at runtime, so a read-only
+  # nix-store symlink would EACCES -- we copy instead.
+  #
+  # `wantedBy = [ "jellyfin.service" ]` alone is insufficient on initial rollout:
+  # if jellyfin is already running at activation time, systemd won't start the
+  # oneshot until the next jellyfin restart. `restartTriggers` on jellyfin pinned
+  # to the plugin package + install script forces that restart whenever either
+  # changes, which invokes this unit via the `before`/`wantedBy` chain.
+  systemd.services.jellyfin-webhook-install = {
+    before = [ "jellyfin.service" ];
+    wantedBy = [ "jellyfin.service" ];
+    serviceConfig = {
+      Type = "oneshot";
+      RemainAfterExit = true;
+      User = config.services.jellyfin.user;
+      Group = config.services.jellyfin.group;
+      ExecStart = webhookPlugin.mkInstallScript {
+        pluginsDir = "${config.services.jellyfin.dataDir}/plugins";
+      };
+    };
+  };
+
+  systemd.services.jellyfin.restartTriggers = [
+    webhookPlugin.package
+    (webhookPlugin.mkInstallScript {
+      pluginsDir = "${config.services.jellyfin.dataDir}/plugins";
+    })
+  ];
+
+  # After Jellyfin starts, POST the plugin configuration so the webhook
+  # targets the monitor's receiver. Idempotent; runs on every boot.
+  systemd.services.jellyfin-webhook-configure = {
+    after = [ "jellyfin.service" ];
+    wants = [ "jellyfin.service" ];
+    before = [ "jellyfin-qbittorrent-monitor.service" ];
+    wantedBy = [ "multi-user.target" ];
+    serviceConfig = {
+      Type = "oneshot";
+      RemainAfterExit = true;
+      DynamicUser = true;
+      LoadCredential = "jellyfin-api-key:${config.age.secrets.jellyfin-api-key.path}";
+      ExecStart = webhookPlugin.mkConfigureScript {
+        jellyfinUrl = "http://127.0.0.1:${toString jellyfinPort}";
+        webhooks = [
+          {
+            name = "qBittorrent Monitor";
+            uri = "http://127.0.0.1:${toString webhookPort}/";
+            notificationTypes = [
+              "PlaybackStart"
+              "PlaybackProgress"
+              "PlaybackStop"
+            ];
+          }
+        ];
+      };
+    };
+  };
+
+  systemd.services."jellyfin-qbittorrent-monitor" = {
+    description = "Monitor Jellyfin streaming and control qBittorrent rate limits";
+    after = [
+      "network.target"
+      "jellyfin.service"
+      "qbittorrent.service"
+      "jellyfin-webhook-configure.service"
+    ];
+    wants = [ "jellyfin-webhook-configure.service" ];
+    wantedBy = [ "multi-user.target" ];
+
+    serviceConfig = {
+      Type = "simple";
+      ExecStart = pkgs.writeShellScript "jellyfin-monitor-start" ''
+        export JELLYFIN_API_KEY=$(cat $CREDENTIALS_DIRECTORY/jellyfin-api-key)
+        exec ${
+          pkgs.python3.withPackages (ps: with ps; [ requests ])
+        }/bin/python ${./jellyfin-qbittorrent-monitor.py}
+      '';
+      Restart = "always";
+      RestartSec = "10s";
+
+      # Security hardening
+      DynamicUser = true;
+      NoNewPrivileges = true;
+      ProtectSystem = "strict";
+      ProtectHome = true;
+      ProtectKernelTunables = true;
+      ProtectKernelModules = true;
+      ProtectControlGroups = true;
+      MemoryDenyWriteExecute = true;
+      RestrictRealtime = true;
+      RestrictSUIDSGID = true;
+      RemoveIPC = true;
+
+      # Load credentials from agenix secrets
+      LoadCredential = "jellyfin-api-key:${config.age.secrets.jellyfin-api-key.path}";
+    };
+
+    environment = {
+      JELLYFIN_URL = "http://localhost:${builtins.toString jellyfinPort}";
+      QBITTORRENT_URL = "http://${config.vpnNamespaces.wg.namespaceAddress}:${builtins.toString service_configs.ports.private.torrent.port}";
+      CHECK_INTERVAL = "30";
+      # Bandwidth budget configuration
+      TOTAL_BANDWIDTH_BUDGET = "30000000"; # 30 Mbps in bits per second
+      SERVICE_BUFFER = "5000000"; # 5 Mbps reserved for other services (bps)
+      DEFAULT_STREAM_BITRATE = "10000000"; # 10 Mbps fallback when bitrate unknown (bps)
+      MIN_TORRENT_SPEED = "100"; # KB/s - below this, pause torrents instead
+      STREAM_BITRATE_HEADROOM = "1.1"; # multiplier per stream for bitrate fluctuations
+      # Webhook receiver: Jellyfin Webhook plugin POSTs events here to throttle immediately.
+      WEBHOOK_BIND = "127.0.0.1";
+      WEBHOOK_PORT = toString webhookPort;
+    };
+  };
+}
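The bandwidth variables in `environment` suggest a simple budget calculation. The sketch below is an assumption about how they could combine, inferred only from the comments next to each value. It is not taken from jellyfin-qbittorrent-monitor.py, and `torrent_budget_bps` is a hypothetical name.

```python
def torrent_budget_bps(total_bps, buffer_bps, stream_bitrates_bps,
                       headroom=1.1, default_bps=10_000_000):
    """Bits per second left for torrents after reserving room for streams.

    Assumed model (not confirmed by the monitor's source):
      budget = total - buffer - sum(bitrate * headroom for each stream)
    A stream whose bitrate is unknown (None or 0) counts as `default_bps`.
    The result is clamped at zero.
    """
    reserved = sum((b or default_bps) * headroom for b in stream_bitrates_bps)
    return max(0.0, total_bps - buffer_bps - reserved)


if __name__ == "__main__":
    # Values mirror the environment block: 30 Mbps total, 5 Mbps buffer.
    for streams in ([], [None], [8_000_000, 12_000_000]):
        left = torrent_budget_bps(30_000_000, 5_000_000, streams)
        print(f"{len(streams)} stream(s): {left / 1_000_000:.1f} Mbps for torrents")
```

Under this model a third 10 Mbps stream would exhaust the budget, which is presumably the case `MIN_TORRENT_SPEED` covers by pausing torrents instead of throttling them.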
--- a/services/jellyfin/jellyfin-qbittorrent-monitor.py
+++ b/services/jellyfin/jellyfin-qbittorrent-monitor.py
@@ -7,6 +7,8 @@ import sys
 import signal
 import json
 import ipaddress
+import threading
+from http.server import HTTPServer, BaseHTTPRequestHandler

 logging.basicConfig(
    level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
@@ -34,6 +36,8 @@ class JellyfinQBittorrentMonitor:
        default_stream_bitrate=10000000,
        min_torrent_speed=100,
        stream_bitrate_headroom=1.1,
+        webhook_port=0,
+        webhook_bind="127.0.0.1",
    ):
        self.jellyfin_url = jellyfin_url
        self.qbittorrent_url = qbittorrent_url
@@ -57,6 +61,12 @@ class JellyfinQBittorrentMonitor:
        self.streaming_stop_delay = streaming_stop_delay
        self.last_state_change = 0

+        # Webhook receiver: allows Jellyfin to push events instead of waiting for the poll
+        self.webhook_port = webhook_port
+        self.webhook_bind = webhook_bind
+        self.wake_event = threading.Event()
+        self.webhook_server = None
+
        # Local network ranges (RFC 1918 private networks + localhost)
        self.local_networks = [
            ipaddress.ip_network("10.0.0.0/8"),
@@ -79,9 +89,56 @@ class JellyfinQBittorrentMonitor:
    def signal_handler(self, signum, frame):
        logger.info("Received shutdown signal, cleaning up...")
        self.running = False
+        if self.webhook_server is not None:
+            # shutdown() blocks until serve_forever returns; run from a thread so we don't deadlock
+            threading.Thread(target=self.webhook_server.shutdown, daemon=True).start()
        self.restore_normal_limits()
        sys.exit(0)

+    def wake(self) -> None:
+        """Signal the main loop to re-evaluate state immediately."""
+        self.wake_event.set()
+
+    def sleep_or_wake(self, seconds: float) -> None:
+        """Wait up to `seconds`, returning early if a webhook wakes the loop."""
+        self.wake_event.wait(seconds)
+        self.wake_event.clear()
+
+    def start_webhook_server(self) -> None:
+        """Start a background HTTP server that wakes the monitor on any POST."""
+        if not self.webhook_port:
+            return
+
+        monitor = self
+
+        class WebhookHandler(BaseHTTPRequestHandler):
+            def do_POST(self):  # noqa: N802
+                length = int(self.headers.get("Content-Length", "0") or "0")
+                body = self.rfile.read(min(length, 65536)) if length else b""
+                event = "unknown"
+                try:
+                    if body:
+                        event = json.loads(body).get("NotificationType", "unknown")
+                except (ValueError, AttributeError):  # bad JSON, or JSON that is not an object
+                    pass
+                logger.info(f"Webhook received: {event}")
+                self.send_response(204)
+                self.end_headers()
+                monitor.wake()
+
+            def log_message(self, format, *args):
+                return  # suppress default access log
+
+        self.webhook_server = HTTPServer(
+            (self.webhook_bind, self.webhook_port), WebhookHandler
+        )
+        threading.Thread(
+            target=self.webhook_server.serve_forever, daemon=True, name="webhook-server"
+        ).start()
+        logger.info(
+            f"Webhook receiver listening on http://{self.webhook_bind}:{self.webhook_port}"
+        )
+
    def check_jellyfin_sessions(self) -> list[dict]:
        headers = (
            {"X-Emby-Token": self.jellyfin_api_key} if self.jellyfin_api_key else {}
@@ -297,10 +354,14 @@ class JellyfinQBittorrentMonitor:
        logger.info(f"Default stream bitrate: {self.default_stream_bitrate} bps")
        logger.info(f"Minimum torrent speed: {self.min_torrent_speed} KB/s")
        logger.info(f"Stream bitrate headroom: {self.stream_bitrate_headroom}x")
+        if self.webhook_port:
+            logger.info(f"Webhook receiver: {self.webhook_bind}:{self.webhook_port}")

        signal.signal(signal.SIGINT, self.signal_handler)
        signal.signal(signal.SIGTERM, self.signal_handler)

+        self.start_webhook_server()
+
        while self.running:
            try:
                self.sync_qbittorrent_state()
@@ -309,7 +370,7 @@ class JellyfinQBittorrentMonitor:
                    active_streams = self.check_jellyfin_sessions()
                except ServiceUnavailable:
                    logger.warning("Jellyfin unavailable, maintaining current state")
-                    time.sleep(self.check_interval)
+                    self.sleep_or_wake(self.check_interval)
                    continue

                streaming_active = len(active_streams) > 0
@@ -394,13 +455,13 @@ class JellyfinQBittorrentMonitor:

                self.current_state = desired_state
                self.last_active_streams = active_streams
-                time.sleep(self.check_interval)
+                self.sleep_or_wake(self.check_interval)

            except KeyboardInterrupt:
                break
            except Exception as e:
                logger.error(f"Unexpected error in monitoring loop: {e}")
-                time.sleep(self.check_interval)
+                self.sleep_or_wake(self.check_interval)

        self.restore_normal_limits()
        logger.info("Monitor stopped")
@@ -421,6 +482,8 @@ if __name__ == "__main__":
    default_stream_bitrate = int(os.getenv("DEFAULT_STREAM_BITRATE", "10000000"))
    min_torrent_speed = int(os.getenv("MIN_TORRENT_SPEED", "100"))
    stream_bitrate_headroom = float(os.getenv("STREAM_BITRATE_HEADROOM", "1.1"))
+    webhook_port = int(os.getenv("WEBHOOK_PORT", "0"))
+    webhook_bind = os.getenv("WEBHOOK_BIND", "127.0.0.1")

    monitor = JellyfinQBittorrentMonitor(
        jellyfin_url=jellyfin_url,
@@ -434,6 +497,8 @@ if __name__ == "__main__":
        default_stream_bitrate=default_stream_bitrate,
        min_torrent_speed=min_torrent_speed,
        stream_bitrate_headroom=stream_bitrate_headroom,
+        webhook_port=webhook_port,
+        webhook_bind=webhook_bind,
    )

    monitor.run()
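The change above replaces `time.sleep` with a wait on a `threading.Event`, so a webhook POST can cut the poll interval short. The sketch below isolates that pattern. The `Waker` class is a hypothetical stand-in for the two methods added to the monitor, and a `threading.Timer` plays the role of the webhook handler thread.

```python
import threading
import time


class Waker:
    """Interruptible sleep built on threading.Event."""

    def __init__(self):
        self._event = threading.Event()

    def wake(self):
        # Called from another thread (for example an HTTP handler).
        self._event.set()

    def sleep_or_wake(self, seconds):
        """Wait up to `seconds`; return True if woken early, False on timeout."""
        woken = self._event.wait(seconds)
        # Clear so the next call blocks again until the next wake().
        self._event.clear()
        return woken


if __name__ == "__main__":
    waker = Waker()
    threading.Timer(0.1, waker.wake).start()  # simulated webhook after 100 ms
    start = time.monotonic()
    woken = waker.sleep_or_wake(30)
    print(f"woken={woken} after {time.monotonic() - start:.2f}s")
```

Several wake-ups that arrive during one loop iteration collapse into a single early return. That suits this use, since each pass re-reads the full session list anyway.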
--- a/services/jellyfin/jellyfin-webhook-plugin.nix
+++ b/services/jellyfin/jellyfin-webhook-plugin.nix
@@ -0,0 +1,105 @@
+{ pkgs, lib }:
+let
+  pluginVersion = "18.0.0.0";
+  # GUID from the plugin's meta.json; addresses it on /Plugins/<guid>/Configuration.
+  pluginGuid = "71552a5a-5c5c-4350-a2ae-ebe451a30173";
+
+  package = pkgs.stdenvNoCC.mkDerivation {
+    pname = "jellyfin-plugin-webhook";
+    version = pluginVersion;
+    src = pkgs.fetchurl {
+      url = "https://repo.jellyfin.org/files/plugin/webhook/webhook_${pluginVersion}.zip";
+      hash = "sha256-LFFojiPnBGl9KJ0xVyPBnCmatcaeVbllRwRkz5Z3dqI=";
+    };
+    nativeBuildInputs = [ pkgs.unzip ];
+    unpackPhase = ''unzip "$src"'';
+    installPhase = ''
+      mkdir -p "$out"
+      cp *.dll meta.json "$out/"
+    '';
+    dontFixup = true; # managed .NET assemblies must not be patched
+  };
+
+  # Minimal Handlebars template, base64 encoded. The monitor only needs the POST;
+  # NotificationType is parsed for the debug log line.
+  # Decoded: {"NotificationType":"{{NotificationType}}"}
+  templateB64 = "eyJOb3RpZmljYXRpb25UeXBlIjoie3tOb3RpZmljYXRpb25UeXBlfX0ifQ==";
+
+  # Build a PluginConfiguration payload accepted by Jellyfin's JSON deserializer.
+  # Each webhook is `{ name, uri, notificationTypes }`.
+  mkConfigJson =
+    webhooks:
+    builtins.toJSON {
+      ServerUrl = "";
+      GenericOptions = map (w: {
+        NotificationTypes = w.notificationTypes;
+        WebhookName = w.name;
+        WebhookUri = w.uri;
+        EnableMovies = true;
+        EnableEpisodes = true;
+        EnableVideos = true;
+        EnableWebhook = true;
+        Template = templateB64;
+        Headers = [
+          {
+            Key = "Content-Type";
+            Value = "application/json";
+          }
+        ];
+      }) webhooks;
+    };
+
+  # Oneshot that POSTs the plugin configuration. Retries past the window
+  # between Jellyfin API health and plugin registration.
+  mkConfigureScript =
+    { jellyfinUrl, webhooks }:
+    pkgs.writeShellScript "jellyfin-webhook-configure" ''
+      set -euo pipefail
+      export PATH=${
+        lib.makeBinPath [
+          pkgs.coreutils
+          pkgs.curl
+        ]
+      }
+
+      URL=${lib.escapeShellArg jellyfinUrl}
+      AUTH="Authorization: MediaBrowser Token=\"$(cat "$CREDENTIALS_DIRECTORY/jellyfin-api-key")\""
+      CONFIG=${lib.escapeShellArg (mkConfigJson webhooks)}
+
+      for _ in $(seq 1 120); do curl -sf -o /dev/null "$URL/health" && break; sleep 1; done
+      curl -sf -o /dev/null "$URL/health"
+
+      for _ in $(seq 1 60); do
+        if printf '%s' "$CONFIG" | curl -sf -X POST \
+          -H "$AUTH" -H "Content-Type: application/json" --data-binary @- \
+          "$URL/Plugins/${pluginGuid}/Configuration"; then
+          echo "Jellyfin webhook plugin configured"; exit 0
+        fi
+        sleep 1
+      done
+      echo "Failed to configure webhook plugin" >&2; exit 1
+    '';
+
+  # Materialise a writable copy of the plugin. Jellyfin rewrites meta.json at
+  # runtime, so a read-only nix-store symlink would EACCES.
+  mkInstallScript =
+    { pluginsDir }:
+    pkgs.writeShellScript "jellyfin-webhook-install" ''
+      set -euo pipefail
+      export PATH=${lib.makeBinPath [ pkgs.coreutils ]}
+      dst=${lib.escapeShellArg "${pluginsDir}/Webhook_${pluginVersion}"}
+      mkdir -p ${lib.escapeShellArg pluginsDir}
+      rm -rf "$dst" && mkdir -p "$dst"
+      cp ${package}/*.dll ${package}/meta.json "$dst/"
+      chmod u+rw "$dst"/*
+    '';
+in
+{
+  inherit
+    package
+    pluginVersion
+    pluginGuid
+    mkConfigureScript
+    mkInstallScript
+    ;
+}
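`templateB64` can be checked by decoding it and substituting the placeholder. This sketch is for verification only. `decode_template` and `render` are hypothetical helpers, and `render` is a plain string replacement standing in for the plugin's Handlebars engine, not a reimplementation of it.

```python
import base64
import json

# Same value as templateB64 in jellyfin-webhook-plugin.nix.
TEMPLATE_B64 = "eyJOb3RpZmljYXRpb25UeXBlIjoie3tOb3RpZmljYXRpb25UeXBlfX0ifQ=="


def decode_template(b64):
    """Return the Handlebars template text stored in the plugin config."""
    return base64.b64decode(b64).decode("utf-8")


def render(template, notification_type):
    """Substitute the single placeholder; a stand-in for Handlebars rendering."""
    return template.replace("{{NotificationType}}", notification_type)


if __name__ == "__main__":
    template = decode_template(TEMPLATE_B64)
    print(template)
    body = render(template, "PlaybackStart")
    # This is the field the monitor's webhook handler reads for its log line.
    print(json.loads(body)["NotificationType"])
```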
--- a/services/jellyfin/jellyfin.nix
+++ b/services/jellyfin/jellyfin.nix
@@ -26,6 +26,14 @@

  services.caddy.virtualHosts."jellyfin.${service_configs.https.domain}".extraConfig = ''
    reverse_proxy :${builtins.toString service_configs.ports.private.jellyfin.port} {
+      # Disable response buffering for streaming. Caddy's default partial
+      # buffering delays fMP4-HLS segments and direct-play responses where
+      # Content-Length is known (so auto-flush doesn't trigger).
+      flush_interval -1
+      transport http {
+        # Localhost: compression wastes CPU re-encoding already-compressed media.
+        compression off
+      }
      header_up X-Real-IP {remote_host}
      header_up X-Forwarded-For {remote_host}
      header_up X-Forwarded-Proto {scheme}
--- a/services/llama-cpp.nix
+++ b/services/llama-cpp.nix
@@ -0,0 +1,103 @@
+{
+  pkgs,
+  service_configs,
+  config,
+  inputs,
+  lib,
+  utils,
+  ...
+}:
+let
+  cfg = config.services.llama-cpp;
+  modelUrl = "https://huggingface.co/bartowski/google_gemma-4-E2B-it-GGUF/resolve/main/google_gemma-4-E2B-it-IQ2_M.gguf";
+  modelAlias = lib.removeSuffix ".gguf" (baseNameOf modelUrl);
+in
+{
+  imports = [
+    (lib.mkCaddyReverseProxy {
+      subdomain = "llm";
+      port = service_configs.ports.private.llama_cpp.port;
+    })
+  ];
+
+  services.llama-cpp = {
+    enable = true;
+    model = toString (
+      pkgs.fetchurl {
+        url = modelUrl;
+        sha256 = "17e869ac54d0e59faa884d5319fc55ad84cd866f50f0b3073fbb25accc875a23";
+      }
+    );
+    port = service_configs.ports.private.llama_cpp.port;
+    host = "0.0.0.0";
+    package = lib.optimizePackage (
+      inputs.llamacpp.packages.${pkgs.system}.vulkan.overrideAttrs (old: {
+        patches = (old.patches or [ ]) ++ [
+        ];
+      })
+    );
+    extraFlags = [
+      "-ngl"
+      "999"
+      "-c"
+      "65536"
+      "-ctk"
+      "turbo3"
+      "-ctv"
+      "turbo3"
+      "-fa"
+      "on"
+      "--api-key-file"
+      config.age.secrets.llama-cpp-api-key.path
+      "--metrics"
+      "--alias"
+      modelAlias
+      "-b"
+      "4096"
+      "-ub"
+      "4096"
+      "--parallel"
+      "2"
+    ];
+  };
+
+  # DynamicUser must be disabled for Vulkan to work
+  systemd.services.llama-cpp.serviceConfig.DynamicUser = lib.mkForce false;
+
+  # ANV driver's turbo3 shader compilation exceeds the default 8 MB thread stack.
+  systemd.services.llama-cpp.serviceConfig.LimitSTACK = lib.mkForce "67108864"; # 64 MB soft+hard
+
+  # llama-server tries to create ~/.cache; ProtectSystem=strict + impermanent
+  # root make /root read-only. Give it a writable cache dir and point HOME there.
+  systemd.services.llama-cpp.serviceConfig.CacheDirectory = "llama-cpp";
+  systemd.services.llama-cpp.environment.HOME = "/var/cache/llama-cpp";
+
+  # turbo3 KV cache quantization runs a 14-barrier WHT butterfly per 128-element
+  # workgroup in SET_ROWS. With multiple concurrent slots (--parallel) and batch=4096, the combined
+  # GPU dispatch can exceed the default i915 CCS engine preempt timeout (7.5s),
+  # causing GPU HANG -> ErrorDeviceLost. Increase compute engine timeouts.
+  # Note: batch<4096 is not viable -- GDN chunked mode needs a larger compute
+  # buffer at smaller batch sizes, exceeding the A380's 6 GB VRAM.
+  # '+' prefix runs as root regardless of service User=.
+  systemd.services.llama-cpp.serviceConfig.ExecStartPre = [
+    "+${pkgs.writeShellScript "set-gpu-compute-timeout" ''
+      for f in /sys/class/drm/card*/engine/ccs*/preempt_timeout_ms; do
+        [ -w "$f" ] && echo 30000 > "$f"
+      done
+      for f in /sys/class/drm/card*/engine/ccs*/heartbeat_interval_ms; do
+        [ -w "$f" ] && echo 10000 > "$f"
+      done
+    ''}"
+  ];
+
+  # upstream module hardcodes --log-disable; override ExecStart to keep logs
+  # so we can see prompt processing progress via journalctl
+  systemd.services.llama-cpp.serviceConfig.ExecStart = lib.mkForce (
+    "${cfg.package}/bin/llama-server"
+    + " --host ${cfg.host}"
+    + " --port ${toString cfg.port}"
+    + " -m ${cfg.model}"
+    + " ${utils.escapeSystemdExecArgs cfg.extraFlags}"
+  );
+
+}
--- a/services/matrix/coturn.nix
+++ b/services/matrix/coturn.nix
@@ -9,7 +9,7 @@
    enable = true;
    realm = service_configs.https.domain;
    use-auth-secret = true;
-    static-auth-secret = lib.strings.trim (builtins.readFile ../secrets/coturn_static_auth_secret);
+    static-auth-secret-file = config.age.secrets.coturn-auth-secret.path;
    listening-port = service_configs.ports.public.coturn.port;
    tls-listening-port = service_configs.ports.public.coturn_tls.port;
    no-cli = true;
--- a/services/matrix/default.nix
+++ b/services/matrix/default.nix
@@ -0,0 +1,7 @@
+{
+  imports = [
+    ./matrix.nix
+    ./coturn.nix
+    ./livekit.nix
+  ];
+}
--- a/services/matrix/livekit.nix
+++ b/services/matrix/livekit.nix
@@ -3,7 +3,7 @@
  ...
 }:
 let
-  keyFile = ../secrets/livekit_keys;
+  keyFile = ../../secrets/livekit_keys;
 in
 {
  services.livekit = {
--- a/services/matrix/matrix.nix
+++ b/services/matrix/matrix.nix
@@ -12,6 +12,10 @@
    (lib.serviceFilePerms "continuwuity" [
      "Z /var/lib/private/continuwuity 0770 ${config.services.matrix-continuwuity.user} ${config.services.matrix-continuwuity.group}"
    ])
+    (lib.mkCaddyReverseProxy {
+      domain = service_configs.matrix.domain;
+      port = service_configs.ports.private.matrix.port;
+    })
  ];

  services.matrix-continuwuity = {
@@ -21,7 +25,7 @@
      port = [ service_configs.ports.private.matrix.port ];
      server_name = service_configs.https.domain;
      allow_registration = true;
-      registration_token = lib.strings.trim (builtins.readFile ../secrets/matrix_reg_token);
+      registration_token_file = config.age.secrets.matrix-reg-token.path;

      new_user_displayname_suffix = "";

@@ -37,7 +41,7 @@
      ];

      # TURN server config (coturn)
-      turn_secret = config.services.coturn.static-auth-secret;
+      turn_secret_file = config.age.secrets.matrix-turn-secret.path;
      turn_uris = [
        "turn:${service_configs.https.domain}?transport=udp"
        "turn:${service_configs.https.domain}?transport=tcp"
@@ -53,10 +57,6 @@
    respond /.well-known/matrix/client `{"m.server":{"base_url":"https://${service_configs.matrix.domain}"},"m.homeserver":{"base_url":"https://${service_configs.matrix.domain}"},"org.matrix.msc3575.proxy":{"base_url":"https://${config.services.matrix-continuwuity.settings.global.server_name}"},"org.matrix.msc4143.rtc_foci":[{"type":"livekit","livekit_service_url":"https://${service_configs.livekit.domain}"}]}`
  '';

-  services.caddy.virtualHosts."${service_configs.matrix.domain}".extraConfig = ''
-    reverse_proxy :${builtins.toString service_configs.ports.private.matrix.port}
-  '';
-
  # Exact duplicate for federation port
  services.caddy.virtualHosts."${service_configs.matrix.domain}:${builtins.toString service_configs.ports.public.matrix_federation.port}".extraConfig =
    config.services.caddy.virtualHosts."${service_configs.matrix.domain}".extraConfig;
--- a/services/minecraft.nix
+++ b/services/minecraft.nix
@@ -37,15 +37,21 @@

    servers.${service_configs.minecraft.server_name} = {
      enable = true;
-      package = pkgs.fabricServers.fabric-1_21_11;
+      package = pkgs.fabricServers.fabric-26_1_2.override { jre_headless = pkgs.openjdk25_headless; };

      jvmOpts = lib.concatStringsSep " " [
        # Memory
        "-Xmx${builtins.toString service_configs.minecraft.memory.heap_size_m}M"
        "-Xms${builtins.toString service_configs.minecraft.memory.heap_size_m}M"
+
        # GC
        "-XX:+UseZGC"
        "-XX:+ZGenerational"
+
+        # added with the Minecraft 26.1.2 upgrade
+        "-XX:+UseCompactObjectHeaders"
+        "-XX:+UseStringDeduplication"
+
        # Base JVM optimizations (brucethemoose/Minecraft-Performance-Flags-Benchmarks)
        "-XX:+UnlockExperimentalVMOptions"
        "-XX:+UnlockDiagnosticVMOptions"
@@ -67,6 +73,7 @@
        "-XX:NonProfiledCodeHeapSize=194M"
        "-XX:NmethodSweepActivity=1"
        "-XX:+UseVectorCmov"
+
        # Large pages (requires vm.nr_hugepages sysctl)
        "-XX:+UseLargePages"
        "-XX:LargePageSizeInBytes=${builtins.toString service_configs.minecraft.memory.large_page_size_m}M"
@@ -92,71 +99,68 @@
          with pkgs;
          builtins.attrValues {
            FabricApi = fetchurl {
-              url = "https://cdn.modrinth.com/data/P7dR8mSH/versions/i5tSkVBH/fabric-api-0.141.3%2B1.21.11.jar";
-              sha512 = "c20c017e23d6d2774690d0dd774cec84c16bfac5461da2d9345a1cd95eee495b1954333c421e3d1c66186284d24a433f6b0cced8021f62e0bfa617d2384d0471";
+              url = "https://cdn.modrinth.com/data/P7dR8mSH/versions/fm7UYECV/fabric-api-0.145.4%2B26.1.2.jar";
+              sha512 = "ffd5ef62a745f76cd2e5481252cb7bc67006c809b4f436827d05ea22c01d19279e94a3b24df3d57e127af1cd08440b5de6a92a4ea8f39b2dcbbe1681275564c3";
            };

-            FerriteCore = fetchurl {
-              url = "https://cdn.modrinth.com/data/uXXizFIs/versions/Ii0gP3D8/ferritecore-8.2.0-fabric.jar";
-              sha512 = "3210926a82eb32efd9bcebabe2f6c053daf5c4337eebc6d5bacba96d283510afbde646e7e195751de795ec70a2ea44fef77cb54bf22c8e57bb832d6217418869";
-            };
+            # No 26.1.2 version available
+            # FerriteCore = fetchurl {
+            #   url = "https://cdn.modrinth.com/data/uXXizFIs/versions/d5ddUdiB/ferritecore-9.0.0-fabric.jar";
+            #   sha512 = "d81fa97e11784c19d42f89c2f433831d007603dd7193cee45fa177e4a6a9c52b384b198586e04a0f7f63cd996fed713322578bde9a8db57e1188854ae5cbe584";
+            # };

            Lithium = fetchurl {
-              url = "https://cdn.modrinth.com/data/gvQqBUqZ/versions/Ow7wA0kG/lithium-fabric-0.21.4%2Bmc1.21.11.jar";
-              sha512 = "f14a5c3d2fad786347ca25083f902139694f618b7c103947f2fd067a7c5ee88a63e1ef8926f7d693ea79ed7d00f57317bae77ef9c2d630bf5ed01ac97a752b94";
+              url = "https://cdn.modrinth.com/data/gvQqBUqZ/versions/v2xoRvRP/lithium-fabric-0.24.1%2Bmc26.1.2.jar";
+              sha512 = "8711bc8c6f39be4c8511becb7a68e573ced56777bd691639f2fc62299b35bb4ccd2efe4a39bd9c308084b523be86a5f5c4bf921ab85f7a22bf075d8ea2359621";
            };

            NoChatReports = fetchurl {
-              url = "https://cdn.modrinth.com/data/qQyHxfxd/versions/rhykGstm/NoChatReports-FABRIC-1.21.11-v2.18.0.jar";
-              sha512 = "d2c35cc8d624616f441665aff67c0e366e4101dba243bad25ed3518170942c1a3c1a477b28805cd1a36c44513693b1c55e76bea627d3fced13927a3d67022ccc";
+              url = "https://cdn.modrinth.com/data/qQyHxfxd/versions/2yrLNE3S/NoChatReports-FABRIC-26.1-v2.19.0.jar";
+              sha512 = "94d58a1a4cde4e3b1750bdf724e65c5f4ff3436c2532f36a465d497d26bf59f5ac996cddbff8ecdfed770c319aa2f2dcc9c7b2d19a35651c2a7735c5b2124dad";
            };

            squaremap = fetchurl {
-              url = "https://cdn.modrinth.com/data/PFb7ZqK6/versions/BW8lMXBi/squaremap-fabric-mc1.21.11-1.3.12.jar";
-              sha512 = "f62eb791a3f5812eb174565d318f2e6925353f846ef8ac56b4e595f481494e0c281f26b9e9fcfdefa855093c96b735b12f67ee17c07c2477aa7a3439238670d9";
+              url = "https://cdn.modrinth.com/data/PFb7ZqK6/versions/UBN6MFvH/squaremap-fabric-mc26.1.2-1.3.13.jar";
+              sha512 = "97bc130184b5d0ddc4ff98a15acef6203459d982e0e2afbd49a2976d546c55a86ef22b841378b51dd782be9b2cfbe4cfa197717f2b7f6800fd8b4ff4df6e564f";
            };

            scalablelux = fetchurl {
-              url = "https://cdn.modrinth.com/data/Ps1zyz6x/versions/PV9KcrYQ/ScalableLux-0.1.6%2Bfabric.c25518a-all.jar";
-              sha512 = "729515c1e75cf8d9cd704f12b3487ddb9664cf9928e7b85b12289c8fbbc7ed82d0211e1851375cbd5b385820b4fedbc3f617038fff5e30b302047b0937042ae7";
+              url = "https://cdn.modrinth.com/data/Ps1zyz6x/versions/gYbHVCz8/ScalableLux-0.2.0%2Bfabric.2b63825-all.jar";
+              sha512 = "48565a4d8a1cbd623f0044086d971f2c0cf1c40e1d0b6636a61d41512f4c1c1ddff35879d9dba24b088a670ee254e2d5842d13a30b6d76df23706fa94ea4a58b";
            };

            c2me = fetchurl {
-              url = "https://cdn.modrinth.com/data/VSNURh3q/versions/QdLiMUjx/c2me-fabric-mc1.21.11-0.3.7%2Balpha.0.7.jar";
-              sha512 = "f9543febe2d649a82acd6d5b66189b6a3d820cf24aa503ba493fdb3bbd4e52e30912c4c763fe50006f9a46947ae8cd737d420838c61b93429542573ed67f958e";
+              url = "https://cdn.modrinth.com/data/VSNURh3q/versions/yrNQQ1AQ/c2me-fabric-mc26.1.2-0.3.7%2Balpha.0.65.jar";
+              sha512 = "6666ebaa3bfa403e386776590fc845b7c306107d37ebc7b1be3b057893fbf9f933abb2314c171d7fe19c177cf8823cb47fdc32040d34a9704f5ab656dd5d93f8";
            };

-            krypton = fetchurl {
-              url = "https://cdn.modrinth.com/data/fQEb0iXm/versions/O9LmWYR7/krypton-0.2.10.jar";
-              sha512 = "4dcd7228d1890ddfc78c99ff284b45f9cf40aae77ef6359308e26d06fa0d938365255696af4cc12d524c46c4886cdcd19268c165a2bf0a2835202fe857da5cab";
-            };
+            # No 26.1 version available
+            # krypton = fetchurl {
+            #   url = "https://cdn.modrinth.com/data/fQEb0iXm/versions/O9LmWYR7/krypton-0.2.10.jar";
+            #   sha512 = "4dcd7228d1890ddfc78c99ff284b45f9cf40aae77ef6359308e26d06fa0d938365255696af4cc12d524c46c4886cdcd19268c165a2bf0a2835202fe857da5cab";
+            # };

-            better-fabric-console = fetchurl {
-              url = "https://cdn.modrinth.com/data/Y8o1j1Sf/versions/6aIKl5wy/better-fabric-console-mc1.21.11-1.2.9.jar";
-              sha512 = "427247dafd99df202ee10b4bf60ffcbbecbabfadb01c167097ffb5b85670edb811f4d061c2551be816295cbbc6b8ec5ec464c14a6ff41912ef1f6c57b038d320";
-            };
-
-            disconnect-packet-fix = fetchurl {
-              url = "https://cdn.modrinth.com/data/rd9rKuJT/versions/Gv74xveQ/disconnect-packet-fix-fabric-2.0.0.jar";
-              sha512 = "1fd6f09a41ce36284e1a8e9def53f3f6834d7201e69e54e24933be56445ba569fbc26278f28300d36926ba92db6f4f9c0ae245d23576aaa790530345587316db";
-            };
+            # No 26.1.2 version available
+            # disconnect-packet-fix = fetchurl {
+            #   url = "https://cdn.modrinth.com/data/rd9rKuJT/versions/x9gVeaTU/disconnect-packet-fix-fabric-2.1.0.jar";
+            #   sha512 = "bf84d02bdcd737706df123e452dd31ef535580fa4ced6af1e4ceea022fef94e4764775253e970b8caa1292e2fa00eb470557f70b290fafdb444479fa801b07a1";
+            # };

            packet-fixer = fetchurl {
-              url = "https://cdn.modrinth.com/data/c7m1mi73/versions/CUh1DWeO/packetfixer-fabric-3.3.4-1.21.11.jar";
-              sha512 = "33331b16cb40c5e6fbaade3cacc26f3a0e8fa5805a7186f94d7366a0e14dbeee9de2d2e8c76fa71f5e9dd24eb1c261667c35447e32570ea965ca0f154fdfba0a";
+              url = "https://cdn.modrinth.com/data/c7m1mi73/versions/M8PqPQr4/packetfixer-fabric-3.3.4-26.1.2.jar";
+              sha512 = "698020edba2a1fd80bb282bfd4832a00d6447b08eaafbc2e16a8f3bf89e187fc9a622c92dfe94ae140dd485fc0220a86890f12158ec08054e473fef8337829bc";
            };

-            # fork of Modernfix for 1.21.11 (upstream will support 26.1)
+            # mVUS fork: upstream ModernFix no longer ships Fabric builds
            modernfix = fetchurl {
-              url = "https://cdn.modrinth.com/data/TjSm1wrD/versions/JwSO8JCN/modernfix-5.25.2-build.4.jar";
-              sha512 = "0d65c05ac0475408c58ef54215714e6301113101bf98bfe4bb2ba949fbfddd98225ac4e2093a5f9206a9e01ba80a931424b237bdfa3b6e178c741ca6f7f8c6a3";
+              url = "https://cdn.modrinth.com/data/TjSm1wrD/versions/dqQ7mabN/modernfix-5.26.2-build.1.jar";
+              sha512 = "fbef93c2dabf7bcd0ccd670226dfc4958f7ebe5d8c2b1158e88a65e6954a40f595efd58401d2a3dbb224660dca5952199cf64df29100e7bd39b1b1941290b57b";
            };

            debugify = fetchurl {
-              url = "https://cdn.modrinth.com/data/QwxR6Gcd/versions/8Q49lnaU/debugify-1.21.11%2B1.0.jar";
-              sha512 = "04d82dd33f44ced37045f1f9a54ad4eacd70861ff74a8800f2d2df358579e6cb0ea86a34b0086b3e87026b1a0691dd6594b4fdc49f89106466eea840518beb03";
+              url = "https://cdn.modrinth.com/data/QwxR6Gcd/versions/mfTTfiKn/debugify-26.1.2%2B1.0.jar";
+              sha512 = "63db82f2163b9f7fc27ebea999ffcd7a961054435b3ed7d8bf32d905b5f60ce81715916b7fd4e9509dd23703d5492059f3ce7e5f176402f8ed4f985a415553f4";
            };
-
          }
        );
      };
--- a/services/monero/default.nix
+++ b/services/monero/default.nix
@@ -0,0 +1,8 @@
+{
+  imports = [
+    ./monero.nix
+    ./p2pool.nix
+    ./xmrig.nix
+    ./xmrig-auto-pause.nix
+  ];
+}
--- a/services/monero/monero.nix
+++ b/services/monero/monero.nix
--- a/services/monero/p2pool.nix
+++ b/services/monero/p2pool.nix
@@ -4,9 +4,6 @@
  lib,
  ...
 }:
-let
-  walletAddress = lib.strings.trim (builtins.readFile ../secrets/xmrig-wallet);
-in
 {
  imports = [
    (lib.serviceMountWithZpool "p2pool" service_configs.zpool_ssds [
@@ -20,7 +17,7 @@ in
  services.p2pool = {
    enable = true;
    dataDir = service_configs.p2pool.dataDir;
-    walletAddress = walletAddress;
+    walletAddress = service_configs.p2pool.walletAddress;
    sidechain = "nano";
    host = "127.0.0.1";
    rpcPort = service_configs.ports.public.monero_rpc.port;
@@ -36,12 +33,6 @@ in
    wants = [ "monero.service" ];
  };

-  # Stop p2pool on UPS battery to conserve power
-  services.apcupsd.hooks = lib.mkIf config.services.apcupsd.enable {
-    onbattery = "systemctl stop p2pool";
-    offbattery = "systemctl start p2pool";
-  };
-
  networking.firewall.allowedTCPPorts = [
    service_configs.ports.public.p2pool_p2p.port
  ];
--- a/services/monero/xmrig-auto-pause.nix
+++ b/services/monero/xmrig-auto-pause.nix
@@ -0,0 +1,39 @@
+{
+  config,
+  lib,
+  pkgs,
+  ...
+}:
+lib.mkIf config.services.xmrig.enable {
+  systemd.services.xmrig-auto-pause = {
+    description = "Auto-pause xmrig when other services need CPU";
+    after = [ "xmrig.service" ];
+    wantedBy = [ "multi-user.target" ];
+    serviceConfig = {
+      ExecStart = "${pkgs.python3}/bin/python3 ${./xmrig-auto-pause.py}";
+      Restart = "always";
+      RestartSec = "10s";
+      NoNewPrivileges = true;
+      ProtectHome = true;
+      ProtectSystem = "strict";
+      PrivateTmp = true;
+      RestrictAddressFamilies = [
+        "AF_UNIX" # systemctl talks to systemd over D-Bus unix socket
+      ];
+      MemoryDenyWriteExecute = true;
+      StateDirectory = "xmrig-auto-pause";
+    };
+    environment = {
+      POLL_INTERVAL = "3";
+      GRACE_PERIOD = "15";
+      # Background services (qbittorrent, bitmagnet, postgresql, etc.) produce
+      # 15-25% non-nice CPU during normal operation. The stop threshold must
+      # sit above transient spikes; the resume threshold must be below the
+      # steady-state floor to avoid restarting xmrig while services are active.
+      CPU_STOP_THRESHOLD = "40";
+      CPU_RESUME_THRESHOLD = "10";
+      STARTUP_COOLDOWN = "10";
+      STATE_DIR = "/var/lib/xmrig-auto-pause";
+    };
+  };
+}
--- a/services/monero/xmrig-auto-pause.py
+++ b/services/monero/xmrig-auto-pause.py
@@ -0,0 +1,210 @@
+#!/usr/bin/env python3
+"""
+Auto-pause xmrig when other services need CPU.
+
+Monitors non-nice CPU usage from /proc/stat. Since xmrig runs at Nice=19,
+its CPU time lands in the 'nice' column and is excluded from the metric.
+When real workload (user + system + irq + softirq) exceeds the stop
+threshold, stops xmrig. When it drops below the resume threshold for
+GRACE_PERIOD seconds, restarts xmrig.
+
+This replaces per-service pause scripts with a single general-purpose
+monitor that handles any CPU-intensive workload (gitea workers, llama-cpp
+inference, etc.) without needing to know about specific processes.
+
+Why scheduler priority alone isn't enough:
+  Nice=19 / SCHED_IDLE only affects which thread gets the next time slice.
+  RandomX's 2MB-per-thread scratchpad (24MB across 12 threads) pollutes
+  the shared 32MB L3 cache, and its memory access pattern saturates DRAM
+  bandwidth. Other services run slower even though they aren't denied CPU
+  time. The only fix is to stop xmrig entirely when real work is happening.
+
+Hysteresis:
+  The stop threshold is set higher than the resume threshold to prevent
+  oscillation. When xmrig runs, its L3 cache pressure makes other processes
+  appear ~3-8% busier. A single threshold trips on this indirect effect,
+  causing stop/start thrashing. Separate thresholds break the cycle: the
+  resume threshold confirms the system is truly idle, while the stop
+  threshold requires genuine workload above xmrig's indirect pressure.
+"""
+
+import os
+import subprocess
+import sys
+import time
+
+POLL_INTERVAL = int(os.environ.get("POLL_INTERVAL", "3"))
+GRACE_PERIOD = float(os.environ.get("GRACE_PERIOD", "15"))
+# Percentage of total CPU ticks that non-nice processes must use to trigger
+# a pause. On a 12-thread system, one fully loaded thread ≈ 8.3% of total.
+# Default 15% requires roughly two busy threads, which avoids false positives
+# from xmrig's L3 cache pressure inflating other processes' apparent CPU.
+CPU_STOP_THRESHOLD = float(os.environ.get("CPU_STOP_THRESHOLD", "15"))
+# Percentage below which the system is considered idle enough to resume
+# mining. Lower than the stop threshold to provide hysteresis.
+CPU_RESUME_THRESHOLD = float(os.environ.get("CPU_RESUME_THRESHOLD", "5"))
+# After starting xmrig, ignore CPU spikes for this many seconds to let
+# RandomX dataset initialization complete (~4s on the target hardware)
+# without retriggering a stop.
+STARTUP_COOLDOWN = float(os.environ.get("STARTUP_COOLDOWN", "10"))
+# Directory for persisting pause state across script restarts.  Without
+# this, a restart while xmrig is paused loses the paused_by_us flag and
+# xmrig stays stopped permanently.
+STATE_DIR = os.environ.get("STATE_DIR", "")
+_PAUSE_FILE = os.path.join(STATE_DIR, "paused") if STATE_DIR else ""
+
+
+def log(msg):
+    print(f"[xmrig-auto-pause] {msg}", file=sys.stderr, flush=True)
+
+
+def read_cpu_ticks():
+    """Read CPU tick counters from /proc/stat.
+
+    Returns (total_ticks, real_work_ticks) where real_work excludes the
+    'nice' column (xmrig) as well as idle, iowait, and steal.
+    """
+    with open("/proc/stat") as f:
+        parts = f.readline().split()
+    # cpu  user nice system idle iowait irq softirq steal
+    user, nice, system, idle, iowait, irq, softirq, steal = (
+        int(x) for x in parts[1:9]
+    )
+    total = user + nice + system + idle + iowait + irq + softirq + steal
+    real_work = user + system + irq + softirq
+    return total, real_work
+
+
+def is_active(unit):
+    """Check if a systemd unit is currently active."""
+    result = subprocess.run(
+        ["systemctl", "is-active", "--quiet", unit],
+        capture_output=True,
+    )
+    return result.returncode == 0
+
+
+def systemctl(action, unit):
+    result = subprocess.run(
+        ["systemctl", action, unit],
+        capture_output=True,
+        text=True,
+    )
+    if result.returncode != 0:
+        log(f"systemctl {action} {unit} failed (rc={result.returncode}): {result.stderr.strip()}")
+    return result.returncode == 0
+
+
+def _save_paused(paused):
+    """Persist pause flag so a script restart can resume where we left off."""
+    if not _PAUSE_FILE:
+        return
+    try:
+        if paused:
+            open(_PAUSE_FILE, "w").close()
+        else:
+            os.remove(_PAUSE_FILE)
+    except OSError:
+        pass
+
+
+def _load_paused():
+    """Check if a previous instance left xmrig paused."""
+    if not _PAUSE_FILE:
+        return False
+    return os.path.isfile(_PAUSE_FILE)
+
+
+def main():
+    paused_by_us = _load_paused()
+    idle_since = None
+    started_at = None  # monotonic time when we last started xmrig
+    prev_total = None
+    prev_work = None
+
+    if paused_by_us:
+        log("Recovered pause state from previous instance")
+
+    log(
+        f"Starting: poll={POLL_INTERVAL}s grace={GRACE_PERIOD}s "
+        f"stop={CPU_STOP_THRESHOLD}% resume={CPU_RESUME_THRESHOLD}% "
+        f"cooldown={STARTUP_COOLDOWN}s"
+    )
+
+    while True:
+        total, work = read_cpu_ticks()
+
+        if prev_total is None:
+            prev_total = total
+            prev_work = work
+            time.sleep(POLL_INTERVAL)
+            continue
+
+        dt = total - prev_total
+        if dt <= 0:
+            prev_total = total
+            prev_work = work
+            time.sleep(POLL_INTERVAL)
+            continue
+
+        real_work_pct = ((work - prev_work) / dt) * 100
+        prev_total = total
+        prev_work = work
+
+        # Don't act during startup cooldown — RandomX dataset init causes
+        # a transient CPU spike that would immediately retrigger a stop.
+        if started_at is not None:
+            if time.monotonic() - started_at < STARTUP_COOLDOWN:
+                time.sleep(POLL_INTERVAL)
+                continue
+            # Cooldown expired — verify xmrig survived startup.  If it
+            # crashed during init (hugepage failure, pool unreachable, etc.),
+            # re-enter the pause/retry cycle rather than silently leaving
+            # xmrig dead.
+            if not is_active("xmrig.service"):
+                log("xmrig died during startup cooldown — will retry")
+                paused_by_us = True
+                _save_paused(True)
+            started_at = None
+
+        above_stop = real_work_pct > CPU_STOP_THRESHOLD
+        below_resume = real_work_pct <= CPU_RESUME_THRESHOLD
+
+        if above_stop:
+            idle_since = None
+            if paused_by_us and is_active("xmrig.service"):
+                # Something else restarted xmrig (deploy, manual start, etc.)
+                # while we thought it was stopped. Reset ownership so we can
+                # manage it again.
+                log("xmrig was restarted externally while paused — reclaiming")
+                paused_by_us = False
+                _save_paused(False)
+            if not paused_by_us:
+                # Only claim ownership if xmrig is actually running.
+                # If something else stopped it (e.g. UPS battery hook),
+                # don't interfere — we'd wrongly restart it later.
+                if is_active("xmrig.service"):
+                    log(f"Real workload detected ({real_work_pct:.1f}% CPU) — stopping xmrig")
+                    if systemctl("stop", "xmrig.service"):
+                        paused_by_us = True
+                        _save_paused(True)
+        elif paused_by_us:
+            if below_resume:
+                if idle_since is None:
+                    idle_since = time.monotonic()
+                elif time.monotonic() - idle_since >= GRACE_PERIOD:
+                    log(f"Workload ended ({real_work_pct:.1f}% CPU) past grace period — starting xmrig")
+                    if systemctl("start", "xmrig.service"):
+                        paused_by_us = False
+                        _save_paused(False)
+                        started_at = time.monotonic()
+                    idle_since = None
+            else:
+                # Between thresholds — not idle enough to resume.
+                idle_since = None
+
+        time.sleep(POLL_INTERVAL)
+
+
+if __name__ == "__main__":
+    main()
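Reviewer note: the tick-delta arithmetic and the two-threshold hysteresis in xmrig-auto-pause.py can be checked in isolation. The following is a minimal sketch, not part of the change set; the helper names `cpu_percent` and `classify` are hypothetical, and the default thresholds are assumed from the 40/10 values set in xmrig-auto-pause.nix.

```python
def cpu_percent(prev, curr):
    """Percent of elapsed ticks spent on non-nice work between two samples.

    Each sample is a (total_ticks, real_work_ticks) pair, as returned by
    read_cpu_ticks() in the script. Returns None when no ticks elapsed.
    """
    dt = curr[0] - prev[0]
    if dt <= 0:
        return None
    return (curr[1] - prev[1]) / dt * 100


def classify(pct, stop_threshold=40.0, resume_threshold=10.0):
    """Map a CPU percentage onto the three hysteresis bands.

    Hypothetical helper; thresholds assumed from the Nix environment block.
    """
    if pct > stop_threshold:
        return "stop"             # genuine workload: stop xmrig
    if pct <= resume_threshold:
        return "resume-eligible"  # idle: start counting the grace period
    return "hold"                 # between thresholds: reset the idle timer


if __name__ == "__main__":
    pct = cpu_percent((1000, 100), (2000, 600))
    print(pct, classify(pct))
```

The "hold" band is what prevents oscillation: a reading between the two thresholds neither stops a running xmrig nor lets a paused one accumulate idle time.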
--- a/services/monero/xmrig.nix
+++ b/services/monero/xmrig.nix
@@ -11,7 +11,7 @@ in
 {
  services.xmrig = {
    enable = true;
-    package = pkgs.xmrig;
+    package = lib.optimizePackage pkgs.xmrig;

    settings = {
      autosave = true;
--- a/services/ntfy/default.nix
+++ b/services/ntfy/default.nix
@@ -0,0 +1,6 @@
+{
+  imports = [
+    ./ntfy.nix
+    ./ntfy-alerts.nix
+  ];
+}
--- a/services/ntfy/ntfy-alerts.nix
+++ b/services/ntfy/ntfy-alerts.nix
@@ -1,5 +1,10 @@
-{ config, service_configs, ... }:
 {
+  config,
+  lib,
+  service_configs,
+  ...
+}:
+lib.mkIf config.services.ntfy-sh.enable {
  services.ntfyAlerts = {
    enable = true;
    serverUrl = "https://${service_configs.ntfy.domain}";
--- a/services/ntfy/ntfy.nix
+++ b/services/ntfy/ntfy.nix
@@ -12,6 +12,10 @@
    (lib.serviceFilePerms "ntfy-sh" [
      "Z /var/lib/private/ntfy-sh 0700 ${config.services.ntfy-sh.user} ${config.services.ntfy-sh.group}"
    ])
+    (lib.mkCaddyReverseProxy {
+      domain = service_configs.ntfy.domain;
+      port = service_configs.ports.private.ntfy.port;
+    })
  ];

  services.ntfy-sh = {
@@ -27,8 +31,4 @@
    };
  };

-  services.caddy.virtualHosts."${service_configs.ntfy.domain}".extraConfig = ''
-    reverse_proxy :${builtins.toString service_configs.ports.private.ntfy.port}
-  '';
-
 }
--- a/services/qbittorrent.nix
+++ b/services/qbittorrent.nix
@@ -6,6 +6,11 @@
  inputs,
  ...
 }:
+let
+  categoriesFile = pkgs.writeText "categories.json" (
+    builtins.toJSON (lib.mapAttrs (_: path: { save_path = path; }) service_configs.torrent.categories)
+  );
+in
 {
  imports = [
    (lib.serviceMountWithZpool "qbittorrent" service_configs.zpool_hdds [
@@ -18,10 +23,18 @@
    (lib.serviceFilePerms "qbittorrent" [
      # 0770: group (media) needs write to delete files during upgrades —
      # Radarr/Sonarr must unlink the old file before placing the new one.
-      "Z ${config.services.qbittorrent.serverConfig.Preferences.Downloads.SavePath} 0770 ${config.services.qbittorrent.user} ${service_configs.media_group}"
+      # Non-recursive (z not Z): UMask=0007 ensures new files get correct perms.
+      # A recursive Z rule would walk millions of files on the HDD pool at every boot.
+      "z ${config.services.qbittorrent.serverConfig.Preferences.Downloads.SavePath} 0770 ${config.services.qbittorrent.user} ${service_configs.media_group}"
      "z ${config.services.qbittorrent.serverConfig.Preferences.Downloads.TempPath} 0700 ${config.services.qbittorrent.user} ${config.services.qbittorrent.group}"
      "Z ${config.services.qbittorrent.profileDir} 0700 ${config.services.qbittorrent.user} ${config.services.qbittorrent.group}"
    ])
+    (lib.mkCaddyReverseProxy {
+      subdomain = "torrent";
+      port = service_configs.ports.private.torrent.port;
+      auth = true;
+      vpn = true;
+    })
  ];

  services.qbittorrent = {
@@ -135,10 +148,50 @@
    UMask = lib.mkForce "0007";
  };

-  services.caddy.virtualHosts."torrent.${service_configs.https.domain}".extraConfig = ''
-    import ${config.age.secrets.caddy_auth.path}
-    reverse_proxy ${config.vpnNamespaces.wg.namespaceAddress}:${builtins.toString config.services.qbittorrent.webuiPort}
-  '';
+  # Pre-define qBittorrent categories with explicit save paths so every
+  # torrent routes to its category directory instead of the SavePath root.
+  systemd.tmpfiles.settings.qbittorrent-categories = {
+    "${config.services.qbittorrent.profileDir}/qBittorrent/config/categories.json"."L+" = {
+      argument = "${categoriesFile}";
+      user = config.services.qbittorrent.user;
+      group = config.services.qbittorrent.group;
+      mode = "1400";
+    };
+  };
+
+  # Ensure category directories exist with correct ownership before first use.
+  systemd.tmpfiles.rules = lib.mapAttrsToList (
+    _: path: "d ${path} 0770 ${config.services.qbittorrent.user} ${service_configs.media_group} -"
+  ) service_configs.torrent.categories;
+
+  # Periodically checkpoint qBittorrent's SQLite WAL (Write-Ahead Log).
+  # qBittorrent holds a read transaction open for its entire lifetime,
+  # preventing SQLite's auto-checkpoint from running. The WAL grows
+  # unbounded (observed: 405 MB) and must be replayed on next startup,
+  # causing 10+ minute "internal preparations" hangs.
+  # A second sqlite3 connection can checkpoint concurrently and safely.
+  # See: https://github.com/qbittorrent/qBittorrent/issues/20433
+  systemd.services.qbittorrent-wal-checkpoint = {
+    description = "Checkpoint qBittorrent SQLite WAL";
+    after = [ "qbittorrent.service" ];
+    requires = [ "qbittorrent.service" ];
+    serviceConfig = {
+      Type = "oneshot";
+      ExecStart = "${pkgs.sqlite}/bin/sqlite3 ${config.services.qbittorrent.profileDir}/qBittorrent/data/torrents.db 'PRAGMA wal_checkpoint(TRUNCATE);'";
+      User = config.services.qbittorrent.user;
+      Group = config.services.qbittorrent.group;
+    };
+  };
+
+  systemd.timers.qbittorrent-wal-checkpoint = {
+    description = "Periodically checkpoint qBittorrent SQLite WAL";
+    wantedBy = [ "timers.target" ];
+    timerConfig = {
+      OnUnitActiveSec = "4h";
+      OnBootSec = "30min";
+      RandomizedDelaySec = "10min";
+    };
+  };

  users.users.${config.services.qbittorrent.user}.extraGroups = [
    service_configs.media_group
--- a/services/soulseek.nix
+++ b/services/soulseek.nix
@@ -19,6 +19,10 @@
      "Z ${service_configs.slskd.downloads} 0750 ${config.services.slskd.user} music"
      "Z ${service_configs.slskd.incomplete} 0750 ${config.services.slskd.user} music"
    ])
+    (lib.mkCaddyReverseProxy {
+      subdomain = "soulseek";
+      port = service_configs.ports.private.soulseek_web.port;
+    })
  ];

  users.groups."music" = { };
@@ -58,11 +62,6 @@
  users.users.${config.services.jellyfin.user}.extraGroups = [ "music" ];
  users.users.${username}.extraGroups = [ "music" ];

-  # doesn't work with auth????
-  services.caddy.virtualHosts."soulseek.${service_configs.https.domain}".extraConfig = ''
-    reverse_proxy :${builtins.toString config.services.slskd.settings.web.port}
-  '';
-
  networking.firewall.allowedTCPPorts = [
    service_configs.ports.public.soulseek_listen.port
  ];
--- a/services/ssh.nix
+++ b/services/ssh.nix
@@ -31,5 +31,8 @@

  # used for deploying configs to server
  users.users.root.openssh.authorizedKeys.keys =
-    config.users.users.${username}.openssh.authorizedKeys.keys;
+    config.users.users.${username}.openssh.authorizedKeys.keys
+    ++ [
+      "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIC5ZYN6idL/w/mUIfPOH1i+Q/SQXuzAMQUEuWpipx1Pc ci-deploy@muffin"
+    ];
 }
--- a/services/syncthing.nix
+++ b/services/syncthing.nix
@@ -17,6 +17,11 @@
      "Z ${service_configs.syncthing.signalBackupDir} 0750 ${config.services.syncthing.user} ${config.services.syncthing.group}"
      "Z ${service_configs.syncthing.grayjayBackupDir} 0750 ${config.services.syncthing.user} ${config.services.syncthing.group}"
    ])
+    (lib.mkCaddyReverseProxy {
+      subdomain = "syncthing";
+      port = service_configs.ports.private.syncthing_gui.port;
+      auth = true;
+    })
  ];

  services.syncthing = {
@@ -49,9 +54,4 @@
    ];
  };

-  services.caddy.virtualHosts."syncthing.${service_configs.https.domain}".extraConfig = ''
-    import ${config.age.secrets.caddy_auth.path}
-    reverse_proxy :${toString service_configs.ports.private.syncthing_gui.port}
-  '';
-
 }
--- a/services/trilium.nix
+++ b/services/trilium.nix
@@ -0,0 +1,27 @@
+{
+  config,
+  pkgs,
+  service_configs,
+  lib,
+  ...
+}:
+{
+  imports = [
+    (lib.serviceMountWithZpool "trilium-server" service_configs.zpool_ssds [
+      (service_configs.services_dir + "/trilium")
+    ])
+    (lib.mkCaddyReverseProxy {
+      subdomain = "notes";
+      port = service_configs.ports.private.trilium.port;
+      auth = true;
+    })
+  ];
+
+  services.trilium-server = {
+    enable = true;
+    port = service_configs.ports.private.trilium.port;
+    host = "127.0.0.1";
+    dataDir = service_configs.trilium.dataDir;
+  };
+
+}
--- a/tests/fail2ban-jellyfin.nix
+++ b/tests/fail2ban-jellyfin.nix
@@ -30,7 +30,7 @@ let
    { config, pkgs, ... }:
    {
      imports = [
-        (import ../services/jellyfin.nix {
+        (import ../services/jellyfin/jellyfin.nix {
          inherit config pkgs;
          lib = testLib;
          service_configs = testServiceConfigs;
@@ -107,7 +107,7 @@ pkgs.testers.runNixOSTest {
    server.wait_for_unit("jellyfin.service")
    server.wait_for_unit("fail2ban.service")
    server.wait_for_open_port(8096)
-    server.wait_until_succeeds("curl -sf http://localhost:8096/health | grep -q Healthy", timeout=60)
+    server.wait_until_succeeds("curl -sf http://localhost:8096/health | grep -q Healthy", timeout=120)
    time.sleep(2)

    # Wait for Jellyfin to create real log files and reload fail2ban
--- a/tests/gitea-runner.nix
+++ b/tests/gitea-runner.nix
@@ -0,0 +1,60 @@
+{
+  config,
+  lib,
+  pkgs,
+  ...
+}:
+pkgs.testers.runNixOSTest {
+  name = "gitea-runner";
+  nodes.machine =
+    { pkgs, ... }:
+    {
+      services.gitea = {
+        enable = true;
+        database.type = "sqlite3";
+        settings = {
+          server = {
+            HTTP_PORT = 3000;
+            ROOT_URL = "http://localhost:3000";
+            DOMAIN = "localhost";
+          };
+          actions.ENABLED = true;
+          service.DISABLE_REGISTRATION = true;
+        };
+      };
+
+      specialisation.runner = {
+        inheritParentConfig = true;
+        configuration.services.gitea-actions-runner.instances.test = {
+          enable = true;
+          name = "ci";
+          url = "http://localhost:3000";
+          labels = [ "native:host" ];
+          tokenFile = "/var/lib/gitea/runner_token";
+        };
+      };
+    };
+
+  testScript = ''
+    start_all()
+
+    machine.wait_for_unit("gitea.service")
+    machine.wait_for_open_port(3000)
+
+    # Generate runner token
+    machine.succeed(
+        "su -l gitea -s /bin/sh -c '${pkgs.gitea}/bin/gitea actions generate-runner-token --work-path /var/lib/gitea' | tail -1 | sed 's/^/TOKEN=/' > /var/lib/gitea/runner_token"
+    )
+
+    # Switch to runner specialisation
+    machine.succeed(
+        "/run/current-system/specialisation/runner/bin/switch-to-configuration test"
+    )
+
+    # Start the runner (specialisation switch doesn't auto-start new services)
+    machine.succeed("systemctl start gitea-runner-test.service")
+    machine.wait_for_unit("gitea-runner-test.service")
+    # Poll for the registration file rather than relying on a fixed sleep.
+    machine.wait_until_succeeds("test -f /var/lib/gitea-runner/test/.runner", timeout=30)
+  '';
+}
--- a/tests/jellyfin-annotations.nix
+++ b/tests/jellyfin-annotations.nix
@@ -0,0 +1,190 @@
+{
+  lib,
+  pkgs,
+  ...
+}:
+let
+  jfLib = import ./jellyfin-test-lib.nix { inherit pkgs lib; };
+  mockGrafana = ./mock-grafana-server.py;
+  script = ../services/grafana/jellyfin-annotations.py;
+  python = pkgs.python3;
+in
+pkgs.testers.runNixOSTest {
+  name = "jellyfin-annotations";
+
+  nodes.machine =
+    { pkgs, ... }:
+    {
+      imports = [ jfLib.jellyfinTestConfig ];
+      environment.systemPackages = [ pkgs.python3 ];
+    };
+
+  testScript = ''
+    import json
+    import time
+
+    import importlib.util
+    _spec = importlib.util.spec_from_file_location("jf_helpers", "${jfLib.helpers}")
+    assert _spec and _spec.loader
+    _jf = importlib.util.module_from_spec(_spec)
+    _spec.loader.exec_module(_jf)
+    setup_jellyfin = _jf.setup_jellyfin
+    jellyfin_api = _jf.jellyfin_api
+
+    GRAFANA_PORT  = 13000
+    ANNOTS_FILE   = "/tmp/annotations.json"
+    STATE_FILE    = "/tmp/annotations-state.json"
+    CREDS_DIR     = "/tmp/test-creds"
+    PYTHON        = "${python}/bin/python3"
+    MOCK_GRAFANA  = "${mockGrafana}"
+    SCRIPT        = "${script}"
+
+    auth_header  = 'MediaBrowser Client="Infuse", DeviceId="test-dev-1", Device="iPhone", Version="1.0"'
+    auth_header2 = 'MediaBrowser Client="Jellyfin Web", DeviceId="test-dev-2", Device="Chrome", Version="1.0"'
+
+    def read_annotations():
+        out = machine.succeed(f"cat {ANNOTS_FILE} 2>/dev/null || echo '[]'")
+        return json.loads(out.strip())
+
+    start_all()
+    token, user_id, movie_id, media_source_id = setup_jellyfin(
+        machine, retry, auth_header,
+        "${jfLib.payloads.auth}", "${jfLib.payloads.empty}",
+    )
+
+    with subtest("Setup mock Grafana and credentials"):
+        machine.succeed(f"mkdir -p {CREDS_DIR}")
+        machine.succeed(f"echo '{token}' > {CREDS_DIR}/jellyfin-api-key")
+        machine.succeed(f"echo '[]' > {ANNOTS_FILE}")
+        machine.succeed(
+            f"systemd-run --unit=mock-grafana {PYTHON} {MOCK_GRAFANA} {GRAFANA_PORT} {ANNOTS_FILE}"
+        )
+        machine.wait_until_succeeds(
+            f"curl -sf -X POST http://127.0.0.1:{GRAFANA_PORT}/api/annotations "
+            f"-H 'Content-Type: application/json' -d '{{\"text\":\"ping\",\"tags\":[]}}' | grep -q id",
+            timeout=10,
+        )
+        machine.succeed(f"echo '[]' > {ANNOTS_FILE}")
+
+    with subtest("Start annotation service"):
+        machine.succeed(
+            f"systemd-run --unit=annotations-svc "
+            f"--setenv=JELLYFIN_URL=http://127.0.0.1:8096 "
+            f"--setenv=GRAFANA_URL=http://127.0.0.1:{GRAFANA_PORT} "
+            f"--setenv=CREDENTIALS_DIRECTORY={CREDS_DIR} "
+            f"--setenv=STATE_FILE={STATE_FILE} "
+            f"--setenv=POLL_INTERVAL=3 "
+            f"{PYTHON} {SCRIPT}"
+        )
+        time.sleep(2)
+
+    with subtest("No annotations when no streams active"):
+        time.sleep(4)
+        annots = read_annotations()
+        assert annots == [], f"Expected no annotations, got: {annots}"
+
+    with subtest("Annotation created when playback starts"):
+        playback_start = json.dumps({
+            "ItemId": movie_id,
+            "MediaSourceId": media_source_id,
+            "PlaySessionId": "test-play-1",
+            "CanSeek": True,
+            "IsPaused": False,
+        })
+        machine.succeed(
+            f"curl -sf -X POST 'http://localhost:8096/Sessions/Playing' "
+            f"-d '{playback_start}' -H 'Content-Type:application/json' "
+            f"-H 'X-Emby-Authorization:{auth_header}, Token={token}'"
+        )
+        machine.wait_until_succeeds(
+            f"cat {ANNOTS_FILE} | python3 -c \"import sys,json; a=json.load(sys.stdin); exit(0 if a else 1)\"",
+            timeout=15,
+        )
+        annots = read_annotations()
+        assert len(annots) == 1, f"Expected 1 annotation, got: {annots}"
+        text = annots[0]["text"]
+        assert "jellyfin" in annots[0].get("tags", []), f"Missing jellyfin tag: {annots[0]}"
+        assert "Test Movie" in text, f"Missing title in: {text}"
+        assert "Infuse" in text, f"Missing client in: {text}"
+        assert "iPhone" in text, f"Missing device in: {text}"
+        assert "timeEnd" not in annots[0], f"timeEnd should not be set yet: {annots[0]}"
+
+    with subtest("Annotation closed when playback stops"):
+        playback_stop = json.dumps({
+            "ItemId": movie_id,
+            "MediaSourceId": media_source_id,
+            "PlaySessionId": "test-play-1",
+            "PositionTicks": 50000000,
+        })
+        machine.succeed(
+            f"curl -sf -X POST 'http://localhost:8096/Sessions/Playing/Stopped' "
+            f"-d '{playback_stop}' -H 'Content-Type:application/json' "
+            f"-H 'X-Emby-Authorization:{auth_header}, Token={token}'"
+        )
+        machine.wait_until_succeeds(
+            f"cat {ANNOTS_FILE} | python3 -c \"import sys,json; a=json.load(sys.stdin); exit(0 if a and 'timeEnd' in a[0] else 1)\"",
+            timeout=15,
+        )
+        annots = read_annotations()
+        assert len(annots) == 1, f"Expected 1 annotation, got: {annots}"
+        assert "timeEnd" in annots[0], f"timeEnd should be set: {annots[0]}"
+        assert annots[0]["timeEnd"] > annots[0]["time"], "timeEnd should be after time"
+
+    with subtest("Multiple concurrent streams each get their own annotation"):
+        machine.succeed(f"echo '[]' > {ANNOTS_FILE}")
+
+        auth_result2 = json.loads(machine.succeed(
+            f"curl -sf -X POST 'http://localhost:8096/Users/AuthenticateByName' "
+            f"-d '@${jfLib.payloads.auth}' -H 'Content-Type:application/json' "
+            f"-H 'X-Emby-Authorization:{auth_header2}'"
+        ))
+        token2 = auth_result2["AccessToken"]
+
+        playback1 = json.dumps({
+            "ItemId": movie_id,
+            "MediaSourceId": media_source_id,
+            "PlaySessionId": "test-play-multi-1",
+            "CanSeek": True,
+            "IsPaused": False,
+        })
+        machine.succeed(
+            f"curl -sf -X POST 'http://localhost:8096/Sessions/Playing' "
+            f"-d '{playback1}' -H 'Content-Type:application/json' "
+            f"-H 'X-Emby-Authorization:{auth_header}, Token={token}'"
+        )
+        playback2 = json.dumps({
+            "ItemId": movie_id,
+            "MediaSourceId": media_source_id,
+            "PlaySessionId": "test-play-multi-2",
+            "CanSeek": True,
+            "IsPaused": False,
+        })
+        machine.succeed(
+            f"curl -sf -X POST 'http://localhost:8096/Sessions/Playing' "
+            f"-d '{playback2}' -H 'Content-Type:application/json' "
+            f"-H 'X-Emby-Authorization:{auth_header2}, Token={token2}'"
+        )
+        machine.wait_until_succeeds(
+            f"cat {ANNOTS_FILE} | python3 -c \"import sys,json; a=json.load(sys.stdin); exit(0 if len(a)==2 else 1)\"",
+            timeout=15,
+        )
+        annots = read_annotations()
+        assert len(annots) == 2, f"Expected 2 annotations, got: {annots}"
+
+    with subtest("State survives service restart (no duplicate annotations)"):
+        machine.succeed("systemctl stop annotations-svc || true")
+        time.sleep(1)
+        machine.succeed(
+            f"systemd-run --unit=annotations-svc-2 "
+            f"--setenv=JELLYFIN_URL=http://127.0.0.1:8096 "
+            f"--setenv=GRAFANA_URL=http://127.0.0.1:{GRAFANA_PORT} "
+            f"--setenv=CREDENTIALS_DIRECTORY={CREDS_DIR} "
+            f"--setenv=STATE_FILE={STATE_FILE} "
+            f"--setenv=POLL_INTERVAL=3 "
+            f"{PYTHON} {SCRIPT}"
+        )
+        time.sleep(6)
+        annots = read_annotations()
+        assert len(annots) == 2, f"Restart should not create duplicates, got: {annots}"
+  '';
+}
--- a/tests/jellyfin-qbittorrent-monitor.nix
+++ b/tests/jellyfin-qbittorrent-monitor.nix
@@ -5,9 +5,21 @@
  ...
 }:
 let
-  payloads = {
-    auth = pkgs.writeText "auth.json" (builtins.toJSON { Username = "jellyfin"; });
-    empty = pkgs.writeText "empty.json" (builtins.toJSON { });
+  jfLib = import ./jellyfin-test-lib.nix { inherit pkgs lib; };
+  webhookPlugin = import ../services/jellyfin/jellyfin-webhook-plugin.nix { inherit pkgs lib; };
+  configureWebhook = webhookPlugin.mkConfigureScript {
+    jellyfinUrl = "http://localhost:8096";
+    webhooks = [
+      {
+        name = "qBittorrent Monitor";
+        uri = "http://127.0.0.1:9898/";
+        notificationTypes = [
+          "PlaybackStart"
+          "PlaybackProgress"
+          "PlaybackStop"
+        ];
+      }
+    ];
  };
 in
 pkgs.testers.runNixOSTest {
@@ -18,11 +30,10 @@ pkgs.testers.runNixOSTest {
      { ... }:
      {
        imports = [
+          jfLib.jellyfinTestConfig
          inputs.vpn-confinement.nixosModules.default
        ];

-        services.jellyfin.enable = true;
-
        # Real qBittorrent service
        services.qbittorrent = {
          enable = true;
@@ -56,11 +67,6 @@ pkgs.testers.runNixOSTest {
          };
        };

-        environment.systemPackages = with pkgs; [
-          curl
-          ffmpeg
-        ];
-        virtualisation.diskSize = 3 * 1024;
        networking.firewall.allowedTCPPorts = [
          8096
          8080
@@ -78,11 +84,30 @@ pkgs.testers.runNixOSTest {
          }
        ];

-        # Create directories for qBittorrent
+        # Create directories for qBittorrent.
        systemd.tmpfiles.rules = [
          "d /var/lib/qbittorrent/downloads 0755 qbittorrent qbittorrent"
          "d /var/lib/qbittorrent/incomplete 0755 qbittorrent qbittorrent"
        ];
+
+        # Install the Jellyfin Webhook plugin before Jellyfin starts, mirroring
+        # the production module. Jellyfin rewrites meta.json at runtime so a
+        # read-only nix-store symlink would fail — we materialise a writable copy.
+        systemd.services."jellyfin-webhook-install" = {
+          description = "Install Jellyfin Webhook plugin files";
+          before = [ "jellyfin.service" ];
+          wantedBy = [ "jellyfin.service" ];
+          serviceConfig = {
+            Type = "oneshot";
+            RemainAfterExit = true;
+            User = "jellyfin";
+            Group = "jellyfin";
+            UMask = "0077";
+            ExecStart = webhookPlugin.mkInstallScript {
+              pluginsDir = "/var/lib/jellyfin/plugins";
+            };
+          };
+        };
      };

    # Public test IP (RFC 5737 TEST-NET-3) so Jellyfin sees it as external
@@ -106,20 +131,17 @@ pkgs.testers.runNixOSTest {
  testScript = ''
    import json
    import time
-    from urllib.parse import urlencode
+
+    import importlib.util
+    _spec = importlib.util.spec_from_file_location("jf_helpers", "${jfLib.helpers}")
+    assert _spec and _spec.loader
+    _jf = importlib.util.module_from_spec(_spec)
+    _spec.loader.exec_module(_jf)
+    setup_jellyfin = _jf.setup_jellyfin
+    jellyfin_api = _jf.jellyfin_api

    auth_header = 'MediaBrowser Client="NixOS Test", DeviceId="test-1337", Device="TestDevice", Version="1.0"'

-    def api_get(path, token=None):
-        header = auth_header + (f", Token={token}" if token else "")
-        return f"curl -sf 'http://server:8096{path}' -H 'X-Emby-Authorization:{header}'"
-
-    def api_post(path, json_file=None, token=None):
-        header = auth_header + (f", Token={token}" if token else "")
-        if json_file:
-            return f"curl -sf -X POST 'http://server:8096{path}' -d '@{json_file}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{header}'"
-        return f"curl -sf -X POST 'http://server:8096{path}' -H 'X-Emby-Authorization:{header}'"
-
    def is_throttled():
        return server.succeed("curl -s http://localhost:8080/api/v2/transfer/speedLimitsMode").strip() == "1"

@@ -137,61 +159,19 @@ pkgs.testers.runNixOSTest {
            return False
        return all(t["state"].startswith("stopped") for t in torrents)

-    movie_id: str = ""
-    media_source_id: str = ""
-
    start_all()
-    server.wait_for_unit("jellyfin.service")
-    server.wait_for_open_port(8096)
-    server.wait_until_succeeds("curl -sf http://localhost:8096/health | grep -q Healthy", timeout=60)
    server.wait_for_unit("qbittorrent.service")
    server.wait_for_open_port(8080)
-
-    # Wait for qBittorrent WebUI to be responsive
    server.wait_until_succeeds("curl -sf http://localhost:8080/api/v2/app/version", timeout=30)

-    with subtest("Complete Jellyfin setup wizard"):
-        server.wait_until_succeeds(api_get("/Startup/Configuration"))
-        server.succeed(api_get("/Startup/FirstUser"))
-        server.succeed(api_post("/Startup/Complete"))
-
-    with subtest("Authenticate and get token"):
-        auth_result = json.loads(server.succeed(api_post("/Users/AuthenticateByName", "${payloads.auth}")))
-        token = auth_result["AccessToken"]
-        user_id = auth_result["User"]["Id"]
-
-    with subtest("Create test video library"):
-        tempdir = server.succeed("mktemp -d -p /var/lib/jellyfin").strip()
-        server.succeed(f"chmod 755 '{tempdir}'")
-        server.succeed(f"ffmpeg -f lavfi -i testsrc2=duration=5 '{tempdir}/Test Movie (2024) [1080p].mkv'")
-
-        add_folder_query = urlencode({
-            "name": "Test Library",
-            "collectionType": "Movies",
-            "paths": tempdir,
-            "refreshLibrary": "true",
-        })
-        server.succeed(api_post(f"/Library/VirtualFolders?{add_folder_query}", "${payloads.empty}", token))
-
-        def is_library_ready(_):
-            folders = json.loads(server.succeed(api_get("/Library/VirtualFolders", token)))
-            return all(f.get("RefreshStatus") == "Idle" for f in folders)
-        retry(is_library_ready, timeout=60)
-
-        def get_movie(_):
-            global movie_id, media_source_id
-            items = json.loads(server.succeed(api_get(f"/Users/{user_id}/Items?IncludeItemTypes=Movie&Recursive=true", token)))
-            if items["TotalRecordCount"] > 0:
-                movie_id = items["Items"][0]["Id"]
-                item_info = json.loads(server.succeed(api_get(f"/Users/{user_id}/Items/{movie_id}", token)))
-                media_source_id = item_info["MediaSources"][0]["Id"]
-                return True
-            return False
-        retry(get_movie, timeout=60)
+    token, user_id, movie_id, media_source_id = setup_jellyfin(
+        server, retry, auth_header,
+        "${jfLib.payloads.auth}", "${jfLib.payloads.empty}",
+    )

    with subtest("Start monitor service"):
        python = "${pkgs.python3.withPackages (ps: [ ps.requests ])}/bin/python"
-        monitor = "${../services/jellyfin-qbittorrent-monitor.py}"
+        monitor = "${../services/jellyfin/jellyfin-qbittorrent-monitor.py}"
        server.succeed(f"""
          systemd-run --unit=monitor-test \
            --setenv=JELLYFIN_URL=http://localhost:8096 \
@@ -214,12 +194,12 @@ pkgs.testers.runNixOSTest {
    server_ip = "192.168.1.1"

    with subtest("Client authenticates from external network"):
-        auth_cmd = f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth}'"
+        auth_cmd = f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${jfLib.payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth}'"
        client_auth_result = json.loads(client.succeed(auth_cmd))
        client_token = client_auth_result["AccessToken"]

    with subtest("Second client authenticates from external network"):
-        auth_cmd2 = f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth2}'"
+        auth_cmd2 = f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${jfLib.payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth2}'"
        client_auth_result2 = json.loads(client.succeed(auth_cmd2))
        client_token2 = client_auth_result2["AccessToken"]

@@ -430,7 +410,7 @@ pkgs.testers.runNixOSTest {
    with subtest("Local playback does NOT trigger throttling"):
        local_auth = 'MediaBrowser Client="Local Client", DeviceId="local-1111", Device="LocalDevice", Version="1.0"'
        local_auth_result = json.loads(server.succeed(
-            f"curl -sf -X POST 'http://localhost:8096/Users/AuthenticateByName' -d '@${payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{local_auth}'"
+            f"curl -sf -X POST 'http://localhost:8096/Users/AuthenticateByName' -d '@${jfLib.payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{local_auth}'"
        ))
        local_token = local_auth_result["AccessToken"]

@@ -448,6 +428,97 @@ pkgs.testers.runNixOSTest {
        local_playback["PositionTicks"] = 50000000
        server.succeed(f"curl -sf -X POST 'http://localhost:8096/Sessions/Playing/Stopped' -d '{json.dumps(local_playback)}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{local_auth}, Token={local_token}'")

+    # === WEBHOOK TESTS ===
+    #
+    # Configure the Jellyfin Webhook plugin to target the monitor, then verify
+    # the real Jellyfin → plugin → monitor path reacts faster than any possible
+    # poll. CHECK_INTERVAL=30 rules out polling as the cause.
+
+    WEBHOOK_PORT = 9898
+    WEBHOOK_CREDS = "/tmp/webhook-creds"
+
+    # Start a webhook-enabled monitor with a long poll interval.
+    server.succeed("systemctl stop monitor-test || true")
+    time.sleep(1)
+    server.succeed(f"""
+      systemd-run --unit=monitor-webhook \
+        --setenv=JELLYFIN_URL=http://localhost:8096 \
+        --setenv=JELLYFIN_API_KEY={token} \
+        --setenv=QBITTORRENT_URL=http://localhost:8080 \
+        --setenv=CHECK_INTERVAL=30 \
+        --setenv=STREAMING_START_DELAY=1 \
+        --setenv=STREAMING_STOP_DELAY=1 \
+        --setenv=TOTAL_BANDWIDTH_BUDGET=50000000 \
+        --setenv=SERVICE_BUFFER=2000000 \
+        --setenv=DEFAULT_STREAM_BITRATE=10000000 \
+        --setenv=MIN_TORRENT_SPEED=100 \
+        --setenv=WEBHOOK_PORT={WEBHOOK_PORT} \
+        --setenv=WEBHOOK_BIND=127.0.0.1 \
+        {python} {monitor}
+    """)
+    server.wait_until_succeeds(f"ss -ltn | grep -q ':{WEBHOOK_PORT}'", timeout=15)
+    time.sleep(2)
+    assert not is_throttled(), "Should start unthrottled"
+
+    # Drop the admin token where the configure script expects it (production uses agenix).
+    server.succeed(f"mkdir -p {WEBHOOK_CREDS} && echo '{token}' > {WEBHOOK_CREDS}/jellyfin-api-key")
+    server.succeed(
+        f"systemd-run --wait --unit=webhook-configure-test "
+        f"--setenv=CREDENTIALS_DIRECTORY={WEBHOOK_CREDS} "
+        f"${configureWebhook}"
+    )
+
+    with subtest("Real PlaybackStart event throttles via the plugin"):
+        playback_start = {
+            "ItemId": movie_id,
+            "MediaSourceId": media_source_id,
+            "PlaySessionId": "test-plugin-start",
+            "CanSeek": True,
+            "IsPaused": False,
+        }
+        start_cmd = f"curl -sf -X POST 'http://{server_ip}:8096/Sessions/Playing' -d '{json.dumps(playback_start)}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth}, Token={client_token}'"
+        client.succeed(start_cmd)
+        server.wait_until_succeeds(
+            "curl -sf http://localhost:8080/api/v2/transfer/speedLimitsMode | grep -q '^1$'",
+            timeout=5,
+        )
+        # Let STREAMING_STOP_DELAY (1s) elapse so the upcoming stop is not swallowed by hysteresis.
+        time.sleep(2)
+
+    with subtest("Real PlaybackStop event unthrottles via the plugin"):
+        playback_stop = {
+            "ItemId": movie_id,
+            "MediaSourceId": media_source_id,
+            "PlaySessionId": "test-plugin-start",
+            "PositionTicks": 50000000,
+        }
+        stop_cmd = f"curl -sf -X POST 'http://{server_ip}:8096/Sessions/Playing/Stopped' -d '{json.dumps(playback_stop)}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth}, Token={client_token}'"
+        client.succeed(stop_cmd)
+        server.wait_until_succeeds(
+            "curl -sf http://localhost:8080/api/v2/transfer/speedLimitsMode | grep -q '^0$'",
+            timeout=10,
+        )
+
+    # Restore fast-polling monitor for the service-restart tests below.
+    server.succeed("systemctl stop monitor-webhook || true")
+    time.sleep(1)
+    server.succeed(f"""
+      systemd-run --unit=monitor-test \
+        --setenv=JELLYFIN_URL=http://localhost:8096 \
+        --setenv=JELLYFIN_API_KEY={token} \
+        --setenv=QBITTORRENT_URL=http://localhost:8080 \
+        --setenv=CHECK_INTERVAL=1 \
+        --setenv=STREAMING_START_DELAY=1 \
+        --setenv=STREAMING_STOP_DELAY=1 \
+        --setenv=TOTAL_BANDWIDTH_BUDGET=50000000 \
+        --setenv=SERVICE_BUFFER=2000000 \
+        --setenv=DEFAULT_STREAM_BITRATE=10000000 \
+        --setenv=MIN_TORRENT_SPEED=100 \
+        {python} {monitor}
+    """)
+    time.sleep(2)
+
+
    # === SERVICE RESTART TESTS ===

    with subtest("qBittorrent restart during throttled state re-applies throttling"):
@@ -527,11 +598,11 @@ pkgs.testers.runNixOSTest {

        # Re-authenticate (old token invalid after restart)
        client_auth_result = json.loads(client.succeed(
-            f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth}'"
+            f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${jfLib.payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth}'"
        ))
        client_token = client_auth_result["AccessToken"]
        client_auth_result2 = json.loads(client.succeed(
-            f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth2}'"
+            f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${jfLib.payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth2}'"
        ))
        client_token2 = client_auth_result2["AccessToken"]

@@ -542,11 +613,11 @@ pkgs.testers.runNixOSTest {
    with subtest("Monitor recovers after Jellyfin temporary unavailability"):
        # Re-authenticate with fresh token
        client_auth_result = json.loads(client.succeed(
-            f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth}'"
+            f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${jfLib.payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth}'"
        ))
        client_token = client_auth_result["AccessToken"]
        client_auth_result2 = json.loads(client.succeed(
-            f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth2}'"
+            f"curl -sf -X POST 'http://{server_ip}:8096/Users/AuthenticateByName' -d '@${jfLib.payloads.auth}' -H 'Content-Type:application/json' -H 'X-Emby-Authorization:{client_auth2}'"
        ))
        client_token2 = client_auth_result2["AccessToken"]

--- a/tests/jellyfin-test-lib.nix
+++ b/tests/jellyfin-test-lib.nix
@@ -0,0 +1,20 @@
+{ pkgs, lib }:
+{
+  payloads = {
+    auth = pkgs.writeText "auth.json" (builtins.toJSON { Username = "jellyfin"; });
+    empty = pkgs.writeText "empty.json" (builtins.toJSON { });
+  };
+
+  helpers = ./jellyfin-test-lib.py;
+
+  jellyfinTestConfig =
+    { pkgs, ... }:
+    {
+      services.jellyfin.enable = true;
+      environment.systemPackages = with pkgs; [
+        curl
+        ffmpeg
+      ];
+      virtualisation.diskSize = lib.mkDefault (3 * 1024);
+    };
+}
--- a/tests/jellyfin-test-lib.py
+++ b/tests/jellyfin-test-lib.py
@@ -0,0 +1,90 @@
+import json
+from urllib.parse import urlencode
+
+
+def jellyfin_api(machine, method, path, auth_header, token=None, data_file=None, data=None):
+    hdr = auth_header + (f", Token={token}" if token else "")
+    cmd = f"curl -sf -X {method} 'http://localhost:8096{path}'"
+    if data_file:
+        cmd += f" -d '@{data_file}' -H 'Content-Type:application/json'"
+    elif data:
+        payload = json.dumps(data) if isinstance(data, dict) else data
+        cmd += f" -d '{payload}' -H 'Content-Type:application/json'"
+    cmd += f" -H 'X-Emby-Authorization:{hdr}'"
+    return machine.succeed(cmd)
+
+
+def setup_jellyfin(machine, retry, auth_header, auth_payload, empty_payload):
+    machine.wait_for_unit("jellyfin.service")
+    machine.wait_for_open_port(8096)
+    machine.wait_until_succeeds(
+        "curl -sf http://localhost:8096/health | grep -q Healthy", timeout=120
+    )
+
+    machine.wait_until_succeeds(
+        f"curl -sf 'http://localhost:8096/Startup/Configuration' "
+        f"-H 'X-Emby-Authorization:{auth_header}'"
+    )
+    jellyfin_api(machine, "GET", "/Startup/FirstUser", auth_header)
+    jellyfin_api(machine, "POST", "/Startup/Complete", auth_header)
+
+    result = json.loads(
+        jellyfin_api(
+            machine, "POST", "/Users/AuthenticateByName",
+            auth_header, data_file=auth_payload,
+        )
+    )
+    token = result["AccessToken"]
+    user_id = result["User"]["Id"]
+
+    tempdir = machine.succeed("mktemp -d -p /var/lib/jellyfin").strip()
+    machine.succeed(f"chmod 755 '{tempdir}'")
+    machine.succeed(
+        f"ffmpeg -f lavfi -i testsrc2=duration=5 -f lavfi -i sine=frequency=440:duration=5 "
+        f"-c:v libx264 -c:a aac '{tempdir}/Test Movie (2024).mkv'"
+    )
+
+    query = urlencode({
+        "name": "Test Library",
+        "collectionType": "Movies",
+        "paths": tempdir,
+        "refreshLibrary": "true",
+    })
+    jellyfin_api(
+        machine, "POST", f"/Library/VirtualFolders?{query}",
+        auth_header, token=token, data_file=empty_payload,
+    )
+
+    def is_ready(_):
+        folders = json.loads(
+            jellyfin_api(machine, "GET", "/Library/VirtualFolders", auth_header, token=token)
+        )
+        return all(f.get("RefreshStatus") == "Idle" for f in folders)
+    retry(is_ready, timeout=60)
+
+    movie_id = None
+    media_source_id = None
+
+    def get_movie(_):
+        nonlocal movie_id, media_source_id
+        items = json.loads(
+            jellyfin_api(
+                machine, "GET",
+                f"/Users/{user_id}/Items?IncludeItemTypes=Movie&Recursive=true",
+                auth_header, token=token,
+            )
+        )
+        if items["TotalRecordCount"] > 0:
+            movie_id = items["Items"][0]["Id"]
+            info = json.loads(
+                jellyfin_api(
+                    machine, "GET", f"/Users/{user_id}/Items/{movie_id}",
+                    auth_header, token=token,
+                )
+            )
+            media_source_id = info["MediaSources"][0]["Id"]
+            return True
+        return False
+    retry(get_movie, timeout=60)
+
+    return token, user_id, movie_id, media_source_id
--- a/tests/mock-grafana-server.py
+++ b/tests/mock-grafana-server.py
@@ -0,0 +1,58 @@
+import http.server, json, sys
+
+PORT = int(sys.argv[1])
+DATA_FILE = sys.argv[2]
+
+class Handler(http.server.BaseHTTPRequestHandler):
+    def log_message(self, fmt, *args):
+        pass
+
+    def _read_body(self):
+        length = int(self.headers.get("Content-Length", 0))
+        return json.loads(self.rfile.read(length)) if length else {}
+
+    def _json(self, code, body):
+        data = json.dumps(body).encode()
+        self.send_response(code)
+        self.send_header("Content-Type", "application/json")
+        self.end_headers()
+        self.wfile.write(data)
+
+    def do_POST(self):
+        if self.path == "/api/annotations":
+            body = self._read_body()
+            try:
+                with open(DATA_FILE) as f:
+                    annotations = json.load(f)
+            except Exception:
+                annotations = []
+            aid = len(annotations) + 1
+            body["id"] = aid
+            annotations.append(body)
+            with open(DATA_FILE, "w") as f:
+                json.dump(annotations, f)
+            self._json(200, {"id": aid, "message": "Annotation added"})
+        else:
+            self.send_response(404)
+            self.end_headers()
+
+    def do_PATCH(self):
+        if self.path.startswith("/api/annotations/"):
+            aid = int(self.path.rsplit("/", 1)[-1])
+            body = self._read_body()
+            try:
+                with open(DATA_FILE) as f:
+                    annotations = json.load(f)
+            except Exception:
+                annotations = []
+            for a in annotations:
+                if a["id"] == aid:
+                    a.update(body)
+            with open(DATA_FILE, "w") as f:
+                json.dump(annotations, f)
+            self._json(200, {"message": "Annotation patched"})
+        else:
+            self.send_response(404)
+            self.end_headers()
+
+http.server.HTTPServer(("127.0.0.1", PORT), Handler).serve_forever()
--- a/tests/tests.nix
+++ b/tests/tests.nix
@@ -22,9 +22,20 @@ in
  fail2banImmichTest = handleTest ./fail2ban-immich.nix;
  fail2banJellyfinTest = handleTest ./fail2ban-jellyfin.nix;

+  # jellyfin annotation service test
+  jellyfinAnnotationsTest = handleTest ./jellyfin-annotations.nix;
+
+  # zfs scrub annotations test
+  zfsScrubAnnotationsTest = handleTest ./zfs-scrub-annotations.nix;
+
+  # xmrig auto-pause test
+  xmrigAutoPauseTest = handleTest ./xmrig-auto-pause.nix;
  # ntfy alerts test
  ntfyAlertsTest = handleTest ./ntfy-alerts.nix;

  # torrent audit test
  torrentAuditTest = handleTest ./torrent-audit.nix;
+
+  # gitea runner test
+  giteaRunnerTest = handleTest ./gitea-runner.nix;
 }
--- a/tests/xmrig-auto-pause.nix
+++ b/tests/xmrig-auto-pause.nix
@@ -0,0 +1,206 @@
+{
+  pkgs,
+  ...
+}:
+let
+  script = ../services/monero/xmrig-auto-pause.py;
+  python = pkgs.python3;
+in
+pkgs.testers.runNixOSTest {
+  name = "xmrig-auto-pause";
+
+  nodes.machine =
+    { pkgs, ... }:
+    {
+      environment.systemPackages = [
+        pkgs.python3
+        pkgs.procps
+      ];
+
+      # Mock xmrig as a nice'd sleep process that can be stopped/started.
+      systemd.services.xmrig = {
+        description = "Mock xmrig miner";
+        serviceConfig = {
+          ExecStart = "${pkgs.coreutils}/bin/sleep infinity";
+          Type = "simple";
+          Nice = 19;
+        };
+        wantedBy = [ "multi-user.target" ];
+      };
+    };
+
+  testScript = ''
+    import time
+
+    PYTHON = "${python}/bin/python3"
+    SCRIPT = "${script}"
+
+    # Tuned for test VMs (1-2 cores).
+    # POLL_INTERVAL=1 keeps detection latency low.
+    # GRACE_PERIOD=5 is long enough to verify "stays stopped" but short
+    # enough that the full test completes in reasonable time.
+    # CPU_STOP_THRESHOLD=20 catches a busy-loop on a 1-2 core VM (50-100%)
+    # without triggering from normal VM noise.
+    # CPU_RESUME_THRESHOLD=10 is the idle cutoff for a 1-2 core VM.
+    POLL_INTERVAL = "1"
+    GRACE_PERIOD = "5"
+    CPU_STOP_THRESHOLD = "20"
+    CPU_RESUME_THRESHOLD = "10"
+    STARTUP_COOLDOWN = "4"
+    STATE_DIR = "/tmp/xap-state"
+    def start_cpu_load(name):
+        """Start a non-nice CPU burn as a transient systemd unit."""
+        machine.succeed(
+            f"systemd-run --unit={name} --property=Type=exec "
+            f"bash -c 'while true; do :; done'"
+        )
+
+    def stop_cpu_load(name):
+        machine.succeed(f"systemctl stop {name}")
+
+    def start_monitor(unit_name):
+        """Start the auto-pause monitor as a transient unit."""
+        machine.succeed(
+            f"systemd-run --unit={unit_name} "
+            f"--setenv=POLL_INTERVAL={POLL_INTERVAL} "
+            f"--setenv=GRACE_PERIOD={GRACE_PERIOD} "
+            f"--setenv=CPU_STOP_THRESHOLD={CPU_STOP_THRESHOLD} "
+            f"--setenv=CPU_RESUME_THRESHOLD={CPU_RESUME_THRESHOLD} "
+            f"--setenv=STARTUP_COOLDOWN={STARTUP_COOLDOWN} "
+            f"--setenv=STATE_DIR={STATE_DIR} "
+            f"{PYTHON} {SCRIPT}"
+        )
+        # Monitor needs two consecutive polls to compute a CPU delta.
+        # With POLL_INTERVAL=1 that takes about 2s, so a single 3s wait
+        # should leave enough margin before the first assertion.
+        time.sleep(3)
+
+    start_all()
+    machine.wait_for_unit("multi-user.target")
+    machine.wait_for_unit("xmrig.service")
+    machine.succeed(f"mkdir -p {STATE_DIR}")
+
+    with subtest("Start auto-pause monitor"):
+        start_monitor("xmrig-auto-pause")
+
+    with subtest("xmrig stays running while system is idle"):
+        machine.succeed("systemctl is-active xmrig")
+
+    with subtest("xmrig stopped when CPU load appears"):
+        start_cpu_load("cpu-load")
+        machine.wait_until_fails("systemctl is-active xmrig", timeout=20)
+
+    with subtest("xmrig remains stopped during grace period after load ends"):
+        stop_cpu_load("cpu-load")
+        # Load just stopped. Grace period is 5s. Check at 2s — well within.
+        time.sleep(2)
+        machine.fail("systemctl is-active xmrig")
+
+    with subtest("xmrig resumes after grace period expires"):
+        # Already idle since previous subtest. Grace period (5s) plus
+        # detection delay (~2 polls) plus startup cooldown (4s) means
+        # xmrig should restart within ~12s.
+        machine.wait_until_succeeds("systemctl is-active xmrig", timeout=20)
+
+    with subtest("Intermittent load does not cause flapping"):
+        # First load — stop xmrig
+        start_cpu_load("cpu-load-1")
+        machine.wait_until_fails("systemctl is-active xmrig", timeout=20)
+        stop_cpu_load("cpu-load-1")
+
+        # Brief idle gap — shorter than grace period
+        time.sleep(2)
+
+        # Second load arrives before grace period expires
+        start_cpu_load("cpu-load-2")
+        time.sleep(3)
+
+        # xmrig must still be stopped
+        machine.fail("systemctl is-active xmrig")
+
+        stop_cpu_load("cpu-load-2")
+        machine.wait_until_succeeds("systemctl is-active xmrig", timeout=20)
+
+    with subtest("Sustained load keeps xmrig stopped"):
+        start_cpu_load("cpu-load-3")
+        machine.wait_until_fails("systemctl is-active xmrig", timeout=20)
+
+        # Stay busy longer than the grace period to prove continuous
+        # activity keeps xmrig stopped indefinitely.
+        time.sleep(8)
+        machine.fail("systemctl is-active xmrig")
+
+        stop_cpu_load("cpu-load-3")
+        machine.wait_until_succeeds("systemctl is-active xmrig", timeout=20)
+
+    with subtest("External restart detected and re-stopped under load"):
+        # Put system under load so auto-pause stops xmrig.
+        start_cpu_load("cpu-load-4")
+        machine.wait_until_fails("systemctl is-active xmrig", timeout=20)
+
+        # Something external starts xmrig while load is active.
+        # The script should detect this and re-stop it.
+        machine.succeed("systemctl start xmrig")
+        machine.succeed("systemctl is-active xmrig")
+        machine.wait_until_fails("systemctl is-active xmrig", timeout=20)
+
+        stop_cpu_load("cpu-load-4")
+        machine.wait_until_succeeds("systemctl is-active xmrig", timeout=20)
+
+    # --- State persistence and crash recovery ---
+    machine.succeed("systemctl stop xmrig-auto-pause")
+
+    with subtest("xmrig recovers after crash during startup cooldown"):
+        machine.succeed(f"rm -rf {STATE_DIR} && mkdir -p {STATE_DIR}")
+        start_monitor("xmrig-auto-pause-crash")
+
+        # Load -> xmrig stops
+        start_cpu_load("cpu-crash")
+        machine.wait_until_fails("systemctl is-active xmrig", timeout=20)
+
+        # End load -> xmrig restarts after grace period
+        stop_cpu_load("cpu-crash")
+        machine.wait_until_succeeds("systemctl is-active xmrig", timeout=30)
+
+        # Kill xmrig immediately — simulates crash during startup cooldown.
+        # The script should detect the failure when cooldown expires and
+        # re-enter the retry cycle.
+        machine.succeed("systemctl kill --signal=KILL xmrig")
+        machine.wait_until_fails("systemctl is-active xmrig", timeout=5)
+
+        # After cooldown + grace period + restart, xmrig should be back.
+        machine.wait_until_succeeds("systemctl is-active xmrig", timeout=30)
+
+        machine.succeed("systemctl stop xmrig-auto-pause-crash")
+        machine.succeed("systemctl reset-failed xmrig.service || true")
+        machine.succeed("systemctl start xmrig")
+        machine.wait_for_unit("xmrig.service")
+
+    with subtest("Script restart preserves pause state"):
+        machine.succeed(f"rm -rf {STATE_DIR} && mkdir -p {STATE_DIR}")
+        start_monitor("xmrig-auto-pause-persist")
+
+        # Load -> xmrig stops
+        start_cpu_load("cpu-persist")
+        machine.wait_until_fails("systemctl is-active xmrig", timeout=20)
+
+        # Kill the monitor while xmrig is paused (simulates script crash)
+        machine.succeed("systemctl stop xmrig-auto-pause-persist")
+
+        # State file must exist — the monitor persisted the pause flag
+        machine.succeed(f"test -f {STATE_DIR}/paused")
+
+        # Start a fresh monitor instance (reads state file on startup)
+        start_monitor("xmrig-auto-pause-persist2")
+
+        # End load — the new monitor should pick up the paused state
+        # and restart xmrig after the grace period
+        stop_cpu_load("cpu-persist")
+        machine.wait_until_succeeds("systemctl is-active xmrig", timeout=30)
+
+        # State file should be cleaned up after successful restart
+        machine.fail(f"test -f {STATE_DIR}/paused")
+
+        machine.succeed("systemctl stop xmrig-auto-pause-persist2")
+  '';
+}
--- a/tests/zfs-scrub-annotations.nix
+++ b/tests/zfs-scrub-annotations.nix
@@ -0,0 +1,123 @@
+{
+  lib,
+  pkgs,
+  ...
+}:
+let
+  mockServer = ./mock-grafana-server.py;
+
+  mockZpool = pkgs.writeShellScript "zpool" ''
+    case "$1" in
+      list)
+        echo "tank"
+        echo "hdds"
+        ;;
+      status)
+        pool="$2"
+        if [ "$pool" = "tank" ]; then
+          echo "  scan: scrub repaired 0B in 00:24:39 with 0 errors on Mon Jan  1 02:24:39 2024"
+        elif [ "$pool" = "hdds" ]; then
+          echo "  scan: scrub repaired 0B in 04:12:33 with 0 errors on Mon Jan  1 06:12:33 2024"
+        fi
+        ;;
+    esac
+  '';
+
+  script = ../services/grafana/zfs-scrub-annotations.sh;
+  python = pkgs.python3;
+in
+pkgs.testers.runNixOSTest {
+  name = "zfs-scrub-annotations";
+
+  nodes.machine =
+    { pkgs, ... }:
+    {
+      environment.systemPackages = with pkgs; [
+        python3
+        curl
+        jq
+      ];
+    };
+
+  testScript = ''
+    import json
+
+    GRAFANA_PORT = 13000
+    ANNOTS_FILE  = "/tmp/annotations.json"
+    STATE_DIR    = "/tmp/scrub-state"
+    PYTHON       = "${python}/bin/python3"
+    MOCK         = "${mockServer}"
+    SCRIPT       = "${script}"
+    MOCK_ZPOOL   = "${mockZpool}"
+
+    MOCK_BIN = "/tmp/mock-bin"
+    ENV_PREFIX = (
+        f"GRAFANA_URL=http://127.0.0.1:{GRAFANA_PORT} "
+        f"STATE_DIR={STATE_DIR} "
+        f"PATH={MOCK_BIN}:$PATH "
+    )
+
+    def read_annotations():
+        out = machine.succeed(f"cat {ANNOTS_FILE} 2>/dev/null || echo '[]'")
+        return json.loads(out.strip())
+
+    start_all()
+    machine.wait_for_unit("multi-user.target")
+
+    with subtest("Setup state directory and mock zpool"):
+        machine.succeed(f"mkdir -p {STATE_DIR}")
+        machine.succeed(f"mkdir -p {MOCK_BIN} && cp {MOCK_ZPOOL} {MOCK_BIN}/zpool && chmod +x {MOCK_BIN}/zpool")
+
+    with subtest("Start mock Grafana server"):
+        machine.succeed(f"echo '[]' > {ANNOTS_FILE}")
+        machine.succeed(
+            f"systemd-run --unit=mock-grafana {PYTHON} {MOCK} {GRAFANA_PORT} {ANNOTS_FILE}"
+        )
+        machine.wait_until_succeeds(
+            f"curl -sf -X POST http://127.0.0.1:{GRAFANA_PORT}/api/annotations "
+            f"-H 'Content-Type: application/json' -d '{{\"text\":\"ping\",\"tags\":[]}}' | grep -q id",
+            timeout=10,
+        )
+        machine.succeed(f"echo '[]' > {ANNOTS_FILE}")
+
+    with subtest("Start action creates annotation with pool names and zfs-scrub tag"):
+        machine.succeed(f"{ENV_PREFIX} bash {SCRIPT} start")
+        annots = read_annotations()
+        assert len(annots) == 1, f"Expected 1 annotation, got: {annots}"
+        assert "zfs-scrub" in annots[0].get("tags", []), f"Missing zfs-scrub tag: {annots[0]}"
+        assert "tank" in annots[0]["text"], f"Missing tank in text: {annots[0]['text']}"
+        assert "hdds" in annots[0]["text"], f"Missing hdds in text: {annots[0]['text']}"
+        assert "time" in annots[0], f"Missing time field: {annots[0]}"
+        assert "timeEnd" not in annots[0], f"timeEnd should not be set yet: {annots[0]}"
+
+    with subtest("State file contains annotation ID"):
+        ann_id = machine.succeed(f"cat {STATE_DIR}/annotation-id").strip()
+        assert ann_id == "1", f"Expected annotation ID 1, got: {ann_id}"
+
+    with subtest("Stop action closes annotation with per-pool scrub results"):
+        machine.succeed(f"{ENV_PREFIX} bash {SCRIPT} stop")
+        annots = read_annotations()
+        assert len(annots) == 1, f"Expected 1 annotation, got: {annots}"
+        assert "timeEnd" in annots[0], f"timeEnd should be set: {annots[0]}"
+        assert annots[0]["timeEnd"] > annots[0]["time"], "timeEnd should be after time"
+        text = annots[0]["text"]
+        assert "ZFS scrub completed" in text, f"Missing completed text: {text}"
+        assert "tank:" in text, f"Missing tank results: {text}"
+        assert "hdds:" in text, f"Missing hdds results: {text}"
+        assert "00:24:39" in text, f"Missing tank scrub duration: {text}"
+        assert "04:12:33" in text, f"Missing hdds scrub duration: {text}"
+
+    with subtest("State file cleaned up after stop"):
+        machine.fail(f"test -f {STATE_DIR}/annotation-id")
+
+    with subtest("Stop action handles missing state file gracefully"):
+        machine.succeed(f"{ENV_PREFIX} bash {SCRIPT} stop")
+        annots = read_annotations()
+        assert len(annots) == 1, f"Expected no new annotations, got: {annots}"
+
+    with subtest("Start action handles Grafana being down gracefully"):
+        machine.succeed("systemctl stop mock-grafana")
+        machine.succeed(f"{ENV_PREFIX} bash {SCRIPT} start")
+        machine.fail(f"test -f {STATE_DIR}/annotation-id")
+  '';
+}