nixos/AGENTS.md
Simon Gardling c3cc94a305
merge common-*.nix files
2026-04-22 18:02:05 -04:00

AGENTS.md

Project Overview

Unified NixOS flake for three hosts:

Host | Role | nixpkgs channel | Activation
mreow | Framework 13 AMD AI 300 laptop (niri, greetd, swaylock) | nixos-unstable | ./deploy.sh locally
yarn | AMD Zen 5 desktop (niri + Jovian-NixOS Steam deck mode, impermanence) | nixos-unstable | pull from CI binary cache
muffin | AMD Zen 3 server (Caddy, ZFS, agenix, deploy-rs, 25+ services) | nixos-25.11 | deploy-rs from CI

One flake.nix declares both channels (nixpkgs and nixpkgs-stable) and composes each host from the correct channel. No single-channel migration is intended.
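
A minimal sketch of how one flake can compose hosts from two channels. The input URLs, the mkHost helper, and the module paths are assumptions for illustration, not copied from the repository; the real flake.nix also patches nixpkgs-stable and extends its lib, which is omitted here.

```nix
# flake.nix (sketch): input URLs and mkHost are assumptions
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    nixpkgs-stable.url = "github:NixOS/nixpkgs/nixos-25.11";
  };

  outputs =
    {
      nixpkgs,
      nixpkgs-stable,
      ...
    }:
    let
      # hypothetical helper: build one host from the given channel
      mkHost =
        channel: host:
        channel.lib.nixosSystem {
          system = "x86_64-linux";
          modules = [ (./hosts + "/${host}/default.nix") ];
        };
    in
    {
      nixosConfigurations = {
        mreow = mkHost nixpkgs "mreow";
        yarn = mkHost nixpkgs "yarn";
        muffin = mkHost nixpkgs-stable "muffin";
      };
    };
}
```

Passing the channel as an argument keeps each host's lib and pkgs on the same nixpkgs revision, which is why the two channels never need to be reconciled.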

History pre-dating this repo lives in the merged subtree branches from dotfiles (commit e9a44f6) and server-config (commit 4bc5d57). Use git log <path> (without --follow) and traverse back through the merge commits dc481c2 and 6448a04 for pre-unify history.

Layout

flake.nix                  # 3 hosts, 2 channels
deploy.sh                  # wrapper: current-host rebuild or `muffin` deploy-rs
hosts/<host>/              # host entrypoints (default.nix, home.nix, disk.nix, …)
modules/                   # flat namespace; see module naming below
  common.nix               # imported by ALL hosts (nix settings, doas, fish shim)
  desktop-*.nix            # imported by mreow/yarn only
  server-*.nix             # imported by muffin only
  <bare>.nix               # scoped by filename (age-secrets, zfs, no-rgb, …)
home/
  profiles/{gui,desktop,no-gui}.nix   # home-manager profiles
  progs/<program>.nix                 # one file per program (fish, helix, niri, zen/, emacs, …)
  util/<helper>.nix                   # small derivations
services/                  # muffin-only: caddy, jellyfin, gitea, matrix, monero, …
tests/                     # pkgs.testers.runNixOSTest suite
lib/
  default.nix              # extends nixpkgs-stable.lib with mkCaddyReverseProxy, serviceMountWithZpool, …
  overlays.nix             # jellyfin-exporter, igpu-exporter, reflac, ensureZfsMounts
patches/nixpkgs/           # applied to nixpkgs-stable for muffin builds
secrets/
  desktop/                 # git-crypt: mreow + yarn share these (wifi, nix-cache-netrc, secureboot.tar, password-hash, disk-password)
  home/                    # git-crypt: per-user HM secrets (api keys, steam id)
  server/                  # agenix *.age + git-crypt *.nix/*.tar/livekit_keys
  usb-secrets/             # USB-resident agenix identity key (git-crypt inside the repo)

Never read or write files under secrets/. They are encrypted at rest (git-crypt for plaintext, agenix for .age). The git-crypt key is delivered to muffin at runtime as /run/agenix/git-crypt-key-nixos.age.

Build & Deploy

# --- from any host ---
nix fmt                                                          # nixfmt-tree
nix flake update                                                 # bump both channels + inputs
nix flake update nixpkgs                                         # bump just desktops' channel
nix flake update nixpkgs-stable                                  # bump just muffin's channel

# --- per-host eval / build (add -L for verbose logs) ---
nix build .#nixosConfigurations.mreow.config.system.build.toplevel -L
nix build .#nixosConfigurations.yarn.config.system.build.toplevel -L
nix build .#nixosConfigurations.muffin.config.system.build.toplevel -L

# --- quick eval without building ---
nix eval --raw .#nixosConfigurations.muffin.config.system.build.toplevel.drvPath 2>&1 | head -5

# --- activate on current host (mreow / yarn only) ---
./deploy.sh                 # boot (default; next reboot)
./deploy.sh switch          # apply immediately
./deploy.sh test            # apply without boot entry
./deploy.sh build           # build only

# --- deploy to muffin from anywhere ---
./deploy.sh muffin
# equivalent to:
nix run .#deploy -- .#muffin

# --- tests (muffin) ---
nix build .#packages.x86_64-linux.tests -L        # all tests (slow)
nix build .#test-zfsTest -L                        # one test by name
# test names are the keys of tests/tests.nix; pattern is test-<name>

Desktop configs have no unit tests. Validation is a successful nix build (exit code 0) plus a nix-diff against the previous generation.

If Nix complains about a missing file, git add it first — flakes only see tracked files.

Module naming

Prefix | Meaning | Example
common- | imported by ALL hosts | common-doas.nix, common-nix.nix, common-shell-fish.nix
desktop- | imported by mreow + yarn only | desktop-common.nix, desktop-steam.nix, desktop-networkmanager.nix
server- | imported by muffin only | server-security.nix, server-power.nix, server-impermanence.nix, server-lanzaboote-agenix.nix
(none) | host-specific, scoped by filename; see file contents | age-secrets.nix, zfs.nix, no-rgb.nix (yarn + muffin)

New modules: pick the narrowest prefix that's true, then add the import explicitly in the host's default.nix (there is no auto-discovery).
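
A sketch of what the explicit import looks like in a desktop host entrypoint. desktop-example.nix is a hypothetical new module, and the real import list in hosts/yarn/default.nix is longer.

```nix
# hosts/yarn/default.nix (sketch): desktop-example.nix is hypothetical
{ ... }:
{
  imports = [
    ../../modules/desktop-common.nix
    ../../modules/desktop-steam.nix
    ../../modules/desktop-example.nix # new module, listed by hand
  ];
}
```

Because nothing is auto-discovered, a module file that is not listed in some host's imports has no effect on any build.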

Code style

  • Formatter: nixfmt-tree (declared in flake.nix). Run nix fmt before every commit.
  • Indentation: 2 spaces, enforced by the formatter.
  • Function args: one per line, trailing comma, always end with ...:
    {
      config,
      lib,
      pkgs,
      username,
      ...
    }:
    
  • Imports: relative paths, one per line. Use the ../../modules/ style from hosts/; do not invent new aggregator modules unless more than one host uses the aggregation.
  • Package paths: lib.getExe pkgs.foo over "${pkgs.foo}/bin/foo" when the derivation declares meta.mainProgram.
  • Unfree packages: allowlisted per-module via nixpkgs.config.allowUnfreePredicate. Do not add a global permit.
  • Comments: lowercase, # style. Use # TODO! / # BUG! / # FIX: prefixes for known issues that should be searchable.
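  • No commas in lists or attribute sets: Nix separates list elements with whitespace and terminates attributes with semicolons. Commas belong only in function argument sets.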
  • lib.mkDefault / lib.mkForce: prefer mkDefault in shared modules so hosts can override without fighting priority; use mkForce only to beat inherited defaults you can't reach any other way.
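
The style rules above can be sketched in one small module. The file name and the unfree package name are hypothetical; the NixOS options used (services.fstrim.enable, environment.shellAliases) are only stand-ins to show the shape.

```nix
# modules/desktop-example.nix (hypothetical module illustrating the style rules)
{
  lib,
  pkgs,
  ...
}:
{
  # allowlist unfree packages by name for this module only
  # ("example-unfree-package" is a placeholder)
  nixpkgs.config.allowUnfreePredicate =
    pkg: builtins.elem (lib.getName pkg) [ "example-unfree-package" ];

  # mkDefault: a host can override this with a plain assignment
  services.fstrim.enable = lib.mkDefault true;

  # lib.getExe resolves to the binary named by meta.mainProgram (hx for helix)
  environment.shellAliases.hx = lib.getExe pkgs.helix;
}
```

A host that wants the opposite default then writes services.fstrim.enable = false; without reaching for mkForce.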

Secrets

  • git-crypt covers secrets/** per the root .gitattributes. Initialized with a single symmetric key checked into secrets/server/git-crypt-key-nixos.age (agenix-encrypted to the USB SSH identity).
  • agenix decrypts secrets/server/*.age at activation into /run/agenix/ on muffin.
  • USB identity: /mnt/usb-secrets/usb-secrets-key on muffin; the age identity path is wired in modules/usb-secrets.nix.
  • Encrypting a new agenix secret uses the SSH public key directly with age -R:
    age -R <(ssh-keygen -y -f secrets/usb-secrets/usb-secrets-key) \
        -o secrets/server/<name>.age \
        /path/to/plaintext
    
  • DO NOT use ssh-to-age. It produces X25519 recipient stanzas, which the SSH private key on muffin cannot decrypt (it only decrypts ssh-ed25519 stanzas produced by age -R against the SSH pubkey). Mismatched stanzas show up as age: error: no identity matched any of the recipients at deploy time.
  • Never read or commit plaintext secrets. Never log secret values.

Service pattern (muffin)

Each file under services/ follows this shape:

  1. imports block with lib.serviceMountWithZpool and (optionally) lib.serviceFilePerms.
  2. The service configuration (services.<name> = { … }).
  3. Caddy reverse-proxy vhost (usually via lib.mkCaddyReverseProxy in lib/default.nix).
  4. Firewall rules (networking.firewall.allowed{TCP,UDP}Ports) if externally reachable.
  5. services.fail2ban.jails.<name> if the service authenticates users.
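
A rough sketch of that shape, using jellyfin only because it is a well-known NixOS service. This is not the repository's services/jellyfin file: the zpool name, directory, and domain are placeholders, the helper arguments are assumed from the signature listed below, and the vhost is written by hand where the repo would normally call lib.mkCaddyReverseProxy.

```nix
# services/example.nix (hypothetical file; all values are placeholders)
{
  lib,
  ...
}:
{
  # 1. storage wiring via the repo helper (argument values are assumed)
  imports = [
    (lib.serviceMountWithZpool "jellyfin" "tank" [ "/var/lib/jellyfin" ])
  ];

  # 2. the service itself
  services.jellyfin.enable = true;

  # 3. reverse proxy, hand-written here for illustration
  services.caddy.virtualHosts."jellyfin.example.org".extraConfig = ''
    reverse_proxy 127.0.0.1:8096
  '';

  # 4. no extra firewall ports: traffic arrives through Caddy
  # 5. a fail2ban jail would follow, since the service authenticates users
}
```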

Custom lib helpers (in lib/default.nix) to prefer over reinventing:

  • lib.serviceMountWithZpool <service> <zpool> [dirs]
  • lib.serviceFilePerms <service> [tmpfilesRules]
  • lib.optimizePackage <pkg> — applies -O3 -march=znver3 -mtune=znver3
  • lib.vpnNamespaceOpenPort <port> <service> — confines service to the WireGuard namespace
  • lib.mkCaddyReverseProxy { subdomain|domain, port, auth ? false, vpn ? false }
  • lib.mkFail2banJail { name, unitName ? "${name}.service", failregex }
  • lib.mkGrafanaAnnotationService { name, description, script, after ? [], environment ? {}, loadCredential ? null }
  • lib.extractArrApiKey <configXmlPath> — shell snippet to read the <ApiKey> element

Hard requirements that are asserted at eval time:

  • Port uniqueness: every port in hosts/muffin/service-configs.nix ports.{public,private} must be unique. The flake asserts this.
  • Public/private segregation: public ports must appear in the firewall allow-list; private ports must not. The flake asserts both directions.
  • Hugepages: services that need 2 MiB hugepages declare their budget in service-configs.nix under hugepages_2m.services. The vm.nr_hugepages sysctl is derived from the total.
  • PostgreSQL-first: any service that supports PostgreSQL uses it (via peer-auth Unix socket when possible). Per-service SQLite (or similar) is discouraged.
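
A minimal sketch of how a port uniqueness check can be expressed as an eval-time assertion. The attribute layout and port values are assumptions; the real check lives in the flake and reads hosts/muffin/service-configs.nix.

```nix
# sketch of an eval-time port uniqueness check; layout and values are assumed
{ lib, ... }:
let
  ports = {
    public = {
      https = 443;
    };
    private = {
      grafana = 3000;
    };
  };
  all = lib.attrValues ports.public ++ lib.attrValues ports.private;
in
{
  assertions = [
    {
      # lib.unique drops duplicates, so equal lengths mean no port repeats
      assertion = lib.length all == lib.length (lib.unique all);
      message = "duplicate port in service-configs.nix";
    }
  ];
}
```

A failed assertion aborts evaluation, so a colliding port never reaches the build or deploy stage.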

Deploy guard (muffin)

modules/server-deploy-guard.nix aggregates per-service "is anyone using this right now?" checks into a single deploy-guard-check binary on muffin. Enforcement is preflight-only — the guard runs over SSH before deploy-rs is invoked; activation itself is never gated. This matters because deploy-rs sets the new profile pointer before running the activation script, so a failed activation triggers auto-rollback which re-runs switch-to-configuration on the previous generation — that re-activation rotates agenix secrets, reinstalls lanzaboote, and reloads systemd units. The only safe place to stop a deploy is before deploy-rs starts.

Two drivers invoke the preflight:

  • ./deploy.sh muffin SSHes to server-public and runs deploy-guard-check. SSH connection failure is a hard abort (rc=255) because there is no second gate. ./deploy.sh muffin --force (or DEPLOY_GUARD_FORCE=1 ./deploy.sh muffin) skips the preflight entirely.
  • CI (.gitea/workflows/deploy.yml) has a Deploy guard preflight step between Build muffin and Deploy via deploy-rs. A non-zero exit fails the job before any closure copy or activation.

Adding a new check

In the service's own file (or a sibling <service>-deploy-guard.nix):

{ config, lib, pkgs, ... }:
let
  check = pkgs.writeShellApplication {
    name = "deploy-guard-check-<service>";
    runtimeInputs = [ /* curl, jq, etc. */ ];
    text = ''
      # exit 0 when the service is idle / unreachable (soft-fail)
      # exit 1 with a reason on stdout/stderr when live users would be disrupted
    '';
  };
in
lib.mkIf config.services.<service>.enable {
  services.deployGuard.checks.<service> = {
    description = "Active <service> users";
    command = check;
  };
}

Existing registrations live in services/jellyfin/jellyfin-deploy-guard.nix (REST /Sessions via curl+jq) and services/minecraft-deploy-guard.nix (Server List Ping via mcstatus). Prefer soft-fail on unreachable — a service that's already down has no users to disrupt.
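
A sketch of a filled-in check body that follows the soft-fail rule. services.example, the port, and the /sessions endpoint are hypothetical; the existing jellyfin and minecraft checks may be structured differently.

```nix
# services/example-deploy-guard.nix (hypothetical service and endpoint)
{
  config,
  lib,
  pkgs,
  ...
}:
let
  check = pkgs.writeShellApplication {
    name = "deploy-guard-check-example";
    runtimeInputs = [
      pkgs.curl
      pkgs.jq
    ];
    text = ''
      # unreachable service: nobody to disrupt, so soft-fail with exit 0
      if ! sessions=$(curl --silent --fail --max-time 5 http://127.0.0.1:8080/sessions); then
        exit 0
      fi

      # the hypothetical endpoint returns a JSON array of active sessions
      active=$(jq 'length' <<<"$sessions")
      if [ "$active" -gt 0 ]; then
        echo "example: $active active session(s)"
        exit 1
      fi
    '';
  };
in
lib.mkIf config.services.example.enable {
  services.deployGuard.checks.example = {
    description = "Active example users";
    command = check;
  };
}
```

Note that writeShellApplication enables errexit, so a malformed response that makes jq fail also blocks the deploy; that errs on the side of caution.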

Technical details

  • Privilege escalation: doas everywhere; sudo is disabled on every host.
  • Shell: fish. Interactive bash shells re-exec into fish via programs.bash.interactiveShellInit (see modules/common-shell-fish.nix).
  • Secure boot: lanzaboote. Desktops extract keys from secrets/desktop/secureboot.tar; muffin extracts from an agenix-decrypted tar (see modules/server-lanzaboote-agenix.nix).
  • Impermanence: muffin is tmpfs-root with /persistent surviving reboots (modules/server-impermanence.nix); yarn binds /home/primary from /persistent (hosts/yarn/impermanence.nix).
  • Disks: disko.
  • Binary cache: muffin runs harmonia; desktops consume it at https://nix-cache.sigkill.computer.
  • Kernel:
    • Desktops: linux-cachyos-bore-lto, processorOpt = "x86_64-v3" (see modules/desktop-common.nix — also trims ~80 legacy subsystems).
    • muffin: linuxPackages_6_12 (pinned; 6.18 has a ZFS deadlock in dbuf_evict).
  • Domain: sigkill.computer. The old gardling.com redirects automatically.
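
For the fish shim, a sketch of the commonly used re-exec pattern (adapted from the NixOS wiki). This is an assumption about the general approach; the repository's own module may differ in detail.

```nix
# sketch of a bash-to-fish re-exec shim; not copied from the repo
{
  lib,
  pkgs,
  ...
}:
{
  programs.fish.enable = true;

  programs.bash.interactiveShellInit = ''
    # skip when bash was started from fish or invoked as `bash -c ...`
    if [[ $(${lib.getExe' pkgs.procps "ps"} --no-header --pid=$PPID --format=comm) != "fish" && -z "''${BASH_EXECUTION_STRING:-}" ]]; then
      # keep login-shell semantics when bash itself was a login shell
      shopt -q login_shell && LOGIN_OPTION='--login' || LOGIN_OPTION=""
      exec ${lib.getExe pkgs.fish} $LOGIN_OPTION
    fi
  '';
}
```

Keeping bash as the account's login shell and re-executing from its init avoids breaking tools that expect a POSIX-compatible login shell.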

Agent-specific instructions

  • If instructed to commit, disable GPG signing (git commit --no-gpg-sign). The author's GPG key is not available in this environment.
  • Use nix-shell -p <package> if a tool is missing from the environment.
  • For nix build, always append -L for verbose logs.
  • If Nix reports a missing file, run git add <file> first — flakes only see git-tracked files.
  • Do not read files under secrets/.
  • Run nix fmt after editing any .nix file.
  • Validate every change with nix build .#nixosConfigurations.<host>.config.system.build.toplevel -L.
  • Commit messages are terse, lowercase; prefix with <scope>: when narrowly scoped (caddy: add redirect, zfs: remove unneeded options, mreow: bump kernel). Generic changes use update or a short description.