Two bugs found during live verification on the server:
1. Stuck state after external restart: if something else restarted xmrig
(e.g. deploy-rs activation) while paused_by_us=True, the script never
detected this and became permanently stuck: it could not stop xmrig on
future load because it thought xmrig was already stopped.
Fix: when paused_by_us=True and busy, check if xmrig is actually
running. If so, reset paused_by_us=False and re-stop it.
2. Flapping on xmrig restart: RandomX dataset init takes ~3.7s of intense
non-nice CPU. The script detected this as real workload and immediately
re-stopped xmrig after every restart, creating a start-stop loop.
Fix: add STARTUP_COOLDOWN (default 10s) — after starting xmrig, skip
CPU checks until the cooldown expires.
Both bugs were present in production: the script had been stuck since
Apr 3 (2+ days) with xmrig running unmanaged alongside llama-server.
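
The two fixes can be sketched as a single per-poll decision function.
This is a minimal illustration, not the script itself: paused_by_us and
STARTUP_COOLDOWN come from the description above, while State, step,
started_at, and the "stop"/"start" return values are hypothetical names
chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

STARTUP_COOLDOWN = 10.0  # seconds; the default named in this message


@dataclass
class State:
    paused_by_us: bool = False
    started_at: Optional[float] = None  # hypothetical: time we last started xmrig


def step(state, now, busy, xmrig_running):
    """Decide the action for one poll: "stop", "start", or None."""
    # Fix 2: after we start xmrig, skip CPU checks until the cooldown expires,
    # so the RandomX dataset init is not mistaken for real workload.
    if state.started_at is not None and now - state.started_at < STARTUP_COOLDOWN:
        return None
    if busy:
        if state.paused_by_us and xmrig_running:
            # Fix 1: something else restarted xmrig behind our back,
            # so our "already stopped" belief is stale. Reset it.
            state.paused_by_us = False
        if not state.paused_by_us and xmrig_running:
            state.paused_by_us = True
            return "stop"
        return None
    if state.paused_by_us:
        # Load is gone and we were the ones who paused xmrig: bring it back.
        state.paused_by_us = False
        state.started_at = now
        return "start"
    return None
```

Keeping the decision separate from the code that actually stops and
starts the service makes both regressions easy to replay with
hand-written inputs.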
Poll the /slots endpoint, create an annotation when a slot starts
processing, and close it with the token count when the slot completes.
Includes a NixOS VM test with mock llama-cpp and grafana servers.
Dashboard annotation entry added.
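
The polling logic amounts to diffing each /slots snapshot against the
set of slots that currently have an open annotation. A rough sketch
follows, with the HTTP calls left out. The field names "id",
"is_processing", and "n_decoded" are assumptions about the payload made
for illustration, and diff_slots is a hypothetical helper, not the
function used in this change.

```python
def diff_slots(active, slots):
    """Compare one /slots snapshot against the open annotations.

    active: set of slot ids with an open annotation (mutated in place)
    slots:  list of dicts; the keys used here are assumed, not confirmed
    Returns ("open", slot_id) and ("close", slot_id, tokens) actions.
    """
    actions = []
    for slot in slots:
        sid = slot["id"]
        if slot["is_processing"] and sid not in active:
            # Slot just started processing: open an annotation for it.
            active.add(sid)
            actions.append(("open", sid))
        elif not slot["is_processing"] and sid in active:
            # Slot finished: close its annotation with the token count.
            active.discard(sid)
            actions.append(("close", sid, slot.get("n_decoded", 0)))
    return actions
```

The caller would turn each "open" action into a request that creates
the annotation and each "close" action into one that sets its end time
and token count.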
1. Smooth out power draw
- UPS only reports in 9 watt steps, so smoothing it out gives more
relative detail on trends
2. Add jellyfin integration
- Good for seeing correlations between statistics and jellyfin streams
3. Add Intel GPU stats
- Provides info on utilization of the GPU
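
To illustrate why smoothing helps with the 9 watt steps: averaging over
a trailing window produces intermediate values that the raw readings
cannot express. The sketch below is only an illustration in Python. The
dashboard presumably does its smoothing in the panel query, and
moving_average is a hypothetical helper.

```python
def moving_average(samples, window):
    """Trailing moving average; early points average over what is available."""
    out = []
    total = 0.0
    for i, x in enumerate(samples):
        total += x
        if i >= window:
            # Drop the sample that just fell out of the window.
            total -= samples[i - window]
        out.append(total / min(i + 1, window))
    return out


# Raw readings jump in 9 W steps; the smoothed series shows the ramp.
print(moving_average([90, 99, 99, 108], 2))  # [90.0, 94.5, 99.0, 103.5]
```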
All jellyfin traffic should actually go through caddy.
This port being open caused a lot of confusion for me:
I was getting traffic from typo'd domain names, such as
`jellfin.gardling.com`, which made NO SENSE!
But since it was going directly to port 8096, it
skipped caddy entirely, so the traffic went through.
- Add explicit iptables banaction in security.nix for test compatibility
- Force IPv4 in all curl requests to prevent IPv4/IPv6 mismatch issues
- Fix caddy test: use basic_auth directive (not basicauth)
- Override service ports in tests to match direct connections (not via Caddy)
- Vaultwarden: override ROCKET_ADDRESS and ROCKET_LOG for external access
- Immich: increase VM memory to 4GB for stability
- Jellyfin: create placeholder log file and reload fail2ban after startup
- Add tests.nix entries for all 6 fail2ban tests
All tests now pass: ssh, caddy, gitea, vaultwarden, immich, jellyfin