llama-cpp: pause xmrig during active inference requests

Add a sidecar service that polls the llama-cpp /slots endpoint every 3s. When any slot is processing, it stops xmrig. Once all slots are idle, it restarts xmrig after a 10s grace period. If llama-cpp is unreachable, xmrig is left untouched.
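
The pause/resume decision described above can be sketched as a small state machine. This is an illustrative sketch only, not the contents of the added service file: the names `step` and `any_slot_busy` are hypothetical, and it assumes that /slots returns a JSON list of objects with a boolean `is_processing` field. Fetching the endpoint and calling systemd are left to the caller.

```python
from typing import Optional, Tuple

GRACE_SECONDS = 10.0  # grace period from the commit message


def any_slot_busy(slots: list) -> bool:
    # Assumption: each slot object carries a boolean "is_processing" field.
    return any(bool(s.get("is_processing")) for s in slots)


def step(
    slots: Optional[list],
    paused: bool,
    idle_since: Optional[float],
    now: float,
    grace: float = GRACE_SECONDS,
) -> Tuple[str, bool, Optional[float]]:
    """Decide what to do after one poll.

    slots is None when llama-cpp could not be reached.
    Returns (action, paused, idle_since), where action is
    "stop", "start", or "none".
    """
    if slots is None:
        # Unreachable: leave xmrig and our state untouched.
        return ("none", paused, idle_since)
    if any_slot_busy(slots):
        # Inference in progress: stop xmrig once, reset the idle timer.
        return ("none" if paused else "stop", True, None)
    if not paused:
        # Idle and xmrig was never paused by us: nothing to do.
        return ("none", False, None)
    if idle_since is None:
        # First idle poll after a busy period: start the grace timer.
        return ("none", True, now)
    if now - idle_since >= grace:
        # Idle for the whole grace period: resume mining.
        return ("start", False, None)
    return ("none", True, idle_since)
```

A caller would poll every 3s, pass the parsed response (or None on a connection error) to `step`, and map "stop"/"start" to the corresponding xmrig service action.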
2026-04-02 17:43:07 -04:00
parent 50453cf0b5
commit df15be01ea
3 changed files with 128 additions and 0 deletions
--- a/configuration.nix
+++ b/configuration.nix
@@ -65,6 +65,8 @@
    ./services/p2pool.nix
    ./services/xmrig.nix

+    ./services/llama-cpp-xmrig-pause.nix
+
    # KEEP UNTIL 2028
    ./services/caddy_senior_project.nix