# AGENTS.md - server-config (NixOS server "muffin")
## Overview
NixOS flake-based server configuration for host **muffin** (deployed to `root@server-public`).
Uses deploy-rs for remote deployment, disko for disk management, impermanence (tmpfs root),
agenix for secrets, lanzaboote for secure boot, and ZFS for data storage.
## Target Hardware
- **CPU**: AMD Ryzen 5 5600X (6C/12T, Zen 3 / `znver3`)
- **RAM**: 64 GB DDR4, no swap
- **Motherboard**: ASRock B550M Pro4
- **Boot drive**: WD_BLACK SN770 1TB NVMe (f2fs: 20G /persistent, 911G /nix; root is tmpfs)
- **SSD pool `tank`**: 4x 2TB SATA SSDs (raidz2) -- services, backups, music, misc
- **HDD pool `hdds`**: 4x 18TB Seagate Exos X18 (raidz1) -- torrents
  - Connected via eSATA to external enclosure
- **USB**: 8GB VFAT drive mounted at /mnt/usb-secrets (agenix identity key)
- **GPU**: Intel (integrated, xe driver) -- used for Jellyfin hardware transcoding
- **NIC**: enp4s0 (static 192.168.1.50/24)
## Build / Deploy / Test Commands
```bash
# Format code (nixfmt-tree)
nix fmt
# Build the system configuration (check for eval errors)
nix build .#nixosConfigurations.muffin.config.system.build.toplevel -L
# Deploy to server
nix run .#deploy -- .#muffin
# Run ALL tests (NixOS VM tests, takes a long time)
nix build .#packages.x86_64-linux.tests -L
# Run a SINGLE test by name (preferred during development)
nix build .#test-zfsTest -L
nix build .#test-testTest -L
nix build .#test-fail2banSshTest -L
nix build .#test-ntfyAlertsTest -L
nix build .#test-filePermsTest -L
# Pattern: nix build .#test-<testName> -L
# Test names are defined in tests/tests.nix (keys of the returned attrset)
# Check flake outputs (list what's available)
nix flake show
# Evaluate without building (fast syntax/eval check)
nix eval .#nixosConfigurations.muffin.config.system.build.toplevel --no-build 2>&1 | head -5
```
## Code Style
### Nix Formatting
- **Formatter**: `nixfmt-tree` (declared in flake.nix). Always run `nix fmt` before committing.
- **Indentation**: 2 spaces (enforced by nixfmt-tree).
### Module Pattern
Every `.nix` file is a function taking an attrset with named args and `...`:
```nix
{
  config,
  lib,
  pkgs,
  service_configs,
  ...
}:
{
  # module body
}
```
- Function args on separate lines, one per line, with trailing comma.
- Opening brace on its own line for multi-line arg lists.
- Use `service_configs` (from `service-configs.nix`) for all ports, paths, domains -- never hardcode.
### Service File Convention
Each service file in `services/` follows this structure:
1. `imports` block with `lib.serviceMountWithZpool` and optionally `lib.serviceFilePerms`
2. Service configuration (`services.<name> = { ... }`)
3. Caddy reverse proxy vhost (`services.caddy.virtualHosts."subdomain.${service_configs.https.domain}"`)
4. Firewall rules if needed (`networking.firewall.allowed{TCP,UDP}Ports`)
5. fail2ban jail if the service has authentication (`services.fail2ban.jails.<name>`)
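
The five steps above might come together roughly as follows. This is a minimal sketch, not the repository's actual Jellyfin module: the keys `service_configs.jellyfin.dataDir` and `service_configs.ports.jellyfin` are assumed names, and the choice of the `tank` pool is illustrative.

```nix
# services/example.nix -- hypothetical file, for illustration only
{
  config,
  lib,
  pkgs,
  service_configs,
  ...
}:
{
  # 1. Mount dependencies (argument order as listed under Custom Lib Functions)
  imports = [
    (lib.serviceMountWithZpool "jellyfin" "tank" [ service_configs.jellyfin.dataDir ])
  ];

  # 2. Service configuration
  services.jellyfin = {
    enable = true;
    dataDir = service_configs.jellyfin.dataDir; # assumed key
  };

  # 3. Caddy reverse proxy vhost
  services.caddy.virtualHosts."jellyfin.${service_configs.https.domain}".extraConfig = ''
    reverse_proxy :${toString service_configs.ports.jellyfin}
  '';

  # 4. Firewall rules, only if the service is reached directly rather than via Caddy
  # networking.firewall.allowedTCPPorts = [ service_configs.ports.jellyfin ];

  # 5. fail2ban jail, since the service has authentication
  # services.fail2ban.jails.jellyfin = { ... };
}
```

Every port, path, and domain comes from `service_configs`, so nothing in the file is hardcoded.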
### Custom Lib Functions (modules/lib.nix)
- `lib.serviceMountWithZpool serviceName zpoolName [dirs]` -- ensures ZFS datasets are mounted before service starts, validates pool membership
- `lib.serviceFilePerms serviceName [tmpfilesRules]` -- sets file permissions via systemd-tmpfiles before service starts
- `lib.optimizePackage pkg` -- applies `-O3 -march=znver3 -mtune=znver3` compiler flags
- `lib.vpnNamespaceOpenPort port serviceName` -- confines service to WireGuard VPN namespace
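
As a rough illustration of how two of these helpers might be called, assuming the signatures listed above: the sketch below passes one systemd-tmpfiles rule to `lib.serviceFilePerms` and wraps a package with `lib.optimizePackage`. The `service_configs.jellyfin.dataDir` key and the choice of `pkgs.zstd` are assumptions made for the example.

```nix
{
  config,
  lib,
  pkgs,
  service_configs,
  ...
}:
{
  imports = [
    # Rules use systemd-tmpfiles syntax: type, path, mode, user, group.
    (lib.serviceFilePerms "jellyfin" [
      "Z ${service_configs.jellyfin.dataDir} 0700 jellyfin jellyfin"
    ])
  ];

  # Rebuild a package with the znver3-tuned compiler flags.
  # pkgs.zstd is only an example of a C package that could benefit.
  environment.systemPackages = [
    (lib.optimizePackage pkgs.zstd)
  ];
}
```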
### Naming Conventions
- **Files**: lowercase with hyphens (`jellyfin-qbittorrent-monitor.nix`)
- **Test names**: camelCase with `Test` suffix in `tests/tests.nix` (`fail2banSshTest`, `zfsTest`)
- **Ports**: all declared in `service-configs.nix` under `ports.*`, referenced as `service_configs.ports.<name>`
- **ZFS datasets**: `tank/services/<name>` for SSD-backed, `hdds/services/<name>` for HDD-backed
- **Commit messages**: terse, lowercase; prefix with service/module name when scoped (`caddy: add redirect`, `zfs: remove unneeded options`). Generic changes use `update` or short description.
### Secrets
- **git-crypt**: `secrets/` directory and `usb-secrets/usb-secrets-key*` are encrypted (see `.gitattributes`)
- **agenix**: secrets declared in `modules/age-secrets.nix`, decrypted at runtime to `/run/agenix/`
- **Identity**: USB drive at `/mnt/usb-secrets/usb-secrets-key`
- **Encrypting new secrets**: The agenix encryption key is in `usb-secrets/usb-secrets-key` (SSH private key, git-crypt encrypted). To create a new secret: derive the age public key with `ssh-keygen -y -f usb-secrets/usb-secrets-key | ssh-to-age`, then encrypt with `age -r <public-key> -o secrets/<name>.age`.
- Never read or commit plaintext secrets. Never log secret values.
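
Once a secret has been encrypted, its declaration in `modules/age-secrets.nix` could look something like the sketch below. The option names are those of the agenix NixOS module; `example-password` is a hypothetical secret name, and the owner and mode are placeholder choices.

```nix
{
  config,
  ...
}:
{
  # Identity used for decryption at activation time (path from the Identity bullet above).
  age.identityPaths = [ "/mnt/usb-secrets/usb-secrets-key" ];

  # `example-password` is a hypothetical secret, shown for illustration.
  age.secrets.example-password = {
    file = ../secrets/example-password.age;
    mode = "0400";
    owner = "root";
  };

  # Consumers reference the decrypted path, never the contents:
  #   config.age.secrets.example-password.path  (a file under /run/agenix/)
}
```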
### Important Patterns
- **Impermanence**: Root `/` is tmpfs. Only `/persistent`, `/nix`, and ZFS mounts survive reboots. Any new persistent state must be declared in `modules/impermanence.nix`.
- **Port uniqueness**: `flake.nix` has an assertion that all ports in `service_configs.ports` are unique. Always add new ports there, and put each one in the matching "Public" or "Private" section (the sections are separated by comments).
- **Hugepages**: Services needing large pages declare their budget in `service-configs.nix` under `hugepages_2m.services`. The kernel sysctl is set automatically from the total.
- **Domain**: Primary domain is `sigkill.computer`. Old domain `gardling.com` redirects automatically.
- **Hardened kernel**: Uses `_hardened` kernel. Security-sensitive defaults apply.
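
Two of the patterns above can be sketched in Nix. The first fragment uses the impermanence module's `environment.persistence` option with a hypothetical state directory. The second shows one way a port-uniqueness check could be written as a NixOS assertion, assuming `service_configs.ports` is a flat attrset of integers; the real check lives in `flake.nix` and may be written differently.

```nix
{
  lib,
  service_configs,
  ...
}:
{
  # Impermanence: keep a hypothetical state directory across reboots.
  environment.persistence."/persistent".directories = [
    "/var/lib/example"
  ];

  # Port uniqueness, sketched as a module assertion (assumes integer port values).
  assertions = [
    {
      assertion = lib.allUnique (lib.attrValues service_configs.ports);
      message = "service_configs.ports contains a duplicate port";
    }
  ];
}
```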
### Test Pattern
Tests use `pkgs.testers.runNixOSTest` (NixOS VM tests):
```nix
{ config, lib, pkgs, ... }:
pkgs.testers.runNixOSTest {
  name = "descriptive-test-name";
  nodes.machine = { pkgs, ... }: {
    imports = [ /* modules under test */ ];
    # VM config
  };
  testScript = ''
    start_all()
    machine.wait_for_unit("multi-user.target")
    # Python test script using machine.succeed/machine.fail
  '';
}
```
```
- Register new tests in `tests/tests.nix` with `handleTest ./filename.nix`
- Tests needing the overlay should use `pkgs.appendOverlays [ (import ../modules/overlays.nix) ]`
- Test scripts are Python; use `machine.succeed(...)`, `machine.fail(...)`, `assert`, `subtest`
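
A test that combines the overlay and `subtest` might look like the sketch below. The file name, test name, and checks are hypothetical. Calling `runNixOSTest` from the overlaid package set is one plausible way to apply the overlay; the existing tests may wire it differently.

```nix
# tests/example.nix -- hypothetical test file
{ config, lib, pkgs, ... }:
let
  # Package set with the repository overlay applied.
  testPkgs = pkgs.appendOverlays [ (import ../modules/overlays.nix) ];
in
testPkgs.testers.runNixOSTest {
  name = "example-state-dir";
  nodes.machine = { pkgs, ... }: {
    # VM config for the modules under test goes here.
  };
  testScript = ''
    start_all()
    machine.wait_for_unit("multi-user.target")

    with subtest("expected directory exists"):
        machine.succeed("test -d /var/lib")

    with subtest("unexpected file is absent"):
        machine.fail("test -e /nonexistent")
  '';
}
# Registered in tests/tests.nix as, for example:
#   exampleTest = handleTest ./example.nix;
```

With that registration in place, the test would be built with `nix build .#test-exampleTest -L`, following the pattern in the commands section.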
## SSH Access
```bash
ssh root@server-public # deploy user
ssh primary@server-public # normal user (doas instead of sudo)
```