Poll /slots endpoint, create annotations when slots start processing,
close with token count when complete. Includes NixOS VM test with
mock llama-cpp and grafana servers. Dashboard annotation entry added.
Wrap entire read_one_sample() in try/except to handle all failures
(missing binary, permission errors, malformed JSON, timeouts).
Write zero-valued metrics on failure instead of exiting non-zero.
Increase timeout from 5s to 8s for slower GPU initialization.
Add sidecar service that polls llama-cpp /slots endpoint every 3s.
When any slot is processing, stops xmrig. Restarts xmrig after 10s
grace period when all slots are idle. Handles unreachable llama-cpp
gracefully (leaves xmrig untouched).
1. Smoothed out power draw
- UPS only reports on 9 watt intervals, so smoothing it out gives more
relative detail on trends
2. Add jellyfin integration
- Good for seeing correlations between statistics and jellyfin streams
3. intel gpu stats
- Provides info on utilization of the gpu
- coturn: switch static-auth-secret to static-auth-secret-file
- matrix: switch registration_token and turn_secret to file-based
- murmur: switch password to environmentFile with agenix
- p2pool: move public wallet address to service-configs.nix
This gave me a lot of panic and grief. JetKVM got NO monitor output
I was panicing and away from home. Was awful. After letting it sit off
for a few hours it fixed itself, inline with nvram state draining over
time.