nixos/services
Simon Gardling bbcd662c28 xmrig-auto-pause: fix stuck state after external restart, add startup cooldown
Two bugs found during live verification on the server:

1. Stuck state after external restart: if something else restarted xmrig
   (e.g. deploy-rs activation) while paused_by_us=True, the script never
   detected this and became permanently stuck — unable to stop xmrig on
   future load because it thought xmrig was already stopped.

   Fix: when paused_by_us=True and busy, check if xmrig is actually
   running. If so, reset paused_by_us=False and re-stop it.

2. Flapping on xmrig restart: RandomX dataset init takes ~3.7s of intense
   non-nice CPU, which the script detected as real workload and immediately
   re-stopped xmrig after every restart, creating a start-stop loop.

   Fix: add STARTUP_COOLDOWN (default 10s) — after starting xmrig, skip
   CPU checks until the cooldown expires.

Both bugs were present in production: the script had been stuck since
Apr 3 (2+ days) with xmrig running unmanaged alongside llama-server.
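
The two fixes above can be sketched as a small per-poll decision function. This is only an illustration under stated assumptions, not the actual script: `PauseState`, `step`, and `started_at` are hypothetical names, and the caller is assumed to supply the busy flag, whether xmrig is running, and the current time. Only `paused_by_us` and `STARTUP_COOLDOWN` (default 10s) come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional

STARTUP_COOLDOWN = 10.0  # seconds; default named in the commit message


@dataclass
class PauseState:
    # Hypothetical container for the script's state.
    paused_by_us: bool = False
    started_at: Optional[float] = None  # when we last started xmrig


def step(state: PauseState, busy: bool, xmrig_running: bool,
         now: float) -> Optional[str]:
    """Decide one poll: return "stop", "start", or None (do nothing)."""
    # Fix 2: after we start xmrig, skip CPU checks until the cooldown
    # expires, so RandomX dataset init is not mistaken for real load.
    if state.started_at is not None:
        if now - state.started_at < STARTUP_COOLDOWN:
            return None
        state.started_at = None

    if busy:
        if state.paused_by_us:
            if not xmrig_running:
                return None  # still paused by us, nothing to do
            # Fix 1: something else restarted xmrig while we believed
            # it was paused; reset the flag and fall through to re-stop.
            state.paused_by_us = False
        if xmrig_running:
            state.paused_by_us = True
            return "stop"
        return None

    # Not busy: resume xmrig only if we were the ones who stopped it.
    if state.paused_by_us:
        state.paused_by_us = False
        state.started_at = now
        return "start"
    return None
```

A caller would run `step` once per polling interval and translate "stop" and "start" into whatever service-manager commands the deployment uses; keeping the decision logic free of side effects makes both failure modes easy to reproduce in a test.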
2026-04-05 23:20:47 -04:00