Commit Graph

770 Commits

Author SHA1 Message Date
0a927ea893 llama-cpp: maybe use vulkan?
All checks were successful
Build and Deploy / deploy (push) Successful in 8m30s
2026-04-06 02:12:46 -04:00
3e46c5bfa5 llama-cpp: use turbo3 for everything
All checks were successful
Build and Deploy / deploy (push) Successful in 1m20s
2026-04-06 01:53:11 -04:00
06aee5af77 llama-cpp: gemma 4 E4B -> gemma 4 E2B
All checks were successful
Build and Deploy / deploy (push) Successful in 2m5s
2026-04-06 01:24:25 -04:00
8fddd3a954 llama-cpp: context: 32768 -> 65536
All checks were successful
Build and Deploy / deploy (push) Successful in 2m58s
2026-04-06 01:04:23 -04:00
0e4f0d3176 llama-cpp: fix model name
All checks were successful
Build and Deploy / deploy (push) Successful in 1m18s
2026-04-06 00:59:20 -04:00
bbcd662c28 xmrig-auto-pause: fix stuck state after external restart, add startup cooldown
All checks were successful
Build and Deploy / deploy (push) Successful in 8m47s
Two bugs found during live verification on the server:

1. Stuck state after external restart: if something else restarted xmrig
   (e.g. deploy-rs activation) while paused_by_us=True, the script never
   detected this and became permanently stuck — unable to stop xmrig on
   future load because it thought xmrig was already stopped.

   Fix: when paused_by_us=True and busy, check if xmrig is actually
   running. If so, reset paused_by_us=False and re-stop it.

2. Flapping on xmrig restart: RandomX dataset init takes ~3.7s of intense
   non-nice CPU, which the script detected as real workload and immediately
   re-stopped xmrig after every restart, creating a start-stop loop.

   Fix: add STARTUP_COOLDOWN (default 10s) — after starting xmrig, skip
   CPU checks until the cooldown expires.

Both bugs were present in production: the script had been stuck since
Apr 3 (2+ days) with xmrig running unmanaged alongside llama-server.
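Both fixes reduce to one per-tick decision. A hedged sketch of that logic as a pure function (the real script's names, state handling, and thresholds are assumptions here):

```python
STARTUP_COOLDOWN = 10.0  # seconds to ignore CPU readings after (re)starting xmrig

def decide_action(paused_by_us: bool, busy: bool, xmrig_running: bool,
                  now: float, xmrig_started_at: float) -> str:
    """Return the action for one poll tick: 'stop', 'start', 'skip', or 'none'."""
    # Bug 2 fix: right after starting xmrig, RandomX dataset init burns CPU;
    # skip load checks until the cooldown expires to avoid a start-stop loop.
    if now - xmrig_started_at < STARTUP_COOLDOWN:
        return "skip"
    if busy:
        if paused_by_us and xmrig_running:
            # Bug 1 fix: something external (e.g. deploy-rs) restarted xmrig
            # while we thought it was stopped; reset paused_by_us and stop it again.
            return "stop"
        if not paused_by_us:
            return "stop"
        return "none"   # already paused by us and actually stopped
    if paused_by_us:
        return "start"  # load is gone; bring xmrig back
    return "none"
```

The stuck-state check only matters on the `busy and paused_by_us` branch: before this fix, that branch was an unconditional no-op, so an externally restarted xmrig kept mining under load forever.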
2026-04-05 23:20:47 -04:00
324a9123db better organize related monero and matrix services
All checks were successful
Build and Deploy / deploy (push) Successful in 2m48s
2026-04-04 14:32:26 -04:00
8ea96c8b8e llama-cpp: fix model hash
All checks were successful
Build and Deploy / deploy (push) Successful in 2m36s
2026-04-04 00:28:07 -04:00
3f62b9c88e grafana: replace custom metric collectors with community exporters
Replace three custom Prometheus textfile collector scripts with
dedicated community-maintained exporters:

- jellyfin-collector.nix (25 LoC shell) -> rebelcore/jellyfin_exporter
  Metric: jellyfin_active_streams -> count(jellyfin_now_playing_state)
  Bonus: per-session labels (user, title, device, codec info)

- qbittorrent-collector.nix (40 LoC shell) -> anriha/qbittorrent-metrics-exporter
  Metric: qbittorrent_{download,upload}_bytes_per_second -> qbit_{dl,up}speed
  Bonus: per-torrent metrics with category/tag aggregation

- intel-gpu-collector.nix + .py (130 LoC Python) -> mike1808/igpu-exporter
  Metric: intel_gpu_engine_busy_percent -> igpu_engines_busy_percent
  Bonus: persistent daemon vs oneshot timer, no streaming JSON parser

All three run as persistent daemons scraped by Prometheus, replacing
the textfile-collector pattern of systemd timers writing .prom files.
Dashboard PromQL queries updated to match new metric names.
2026-04-03 15:38:13 -04:00
479ec43b8f llama-cpp: integrate native prometheus /metrics endpoint
llama.cpp server has a built-in /metrics endpoint exposing
prompt_tokens_seconds, predicted_tokens_seconds, tokens_predicted_total,
n_decode_total, and n_busy_slots_per_decode. Enable it with --metrics
and add a Prometheus scrape target, replacing the need for any external
metric collection for LLM inference monitoring.
2026-04-03 15:19:11 -04:00
37ac88fc0f lib: replace deprecated overrideDerivation with overrideAttrs
overrideDerivation has been deprecated since 2019. The new
overrideAttrs properly handles the env attribute set used by
modern derivations to avoid the NIX_CFLAGS_COMPILE overlap
error between env and top-level derivation arguments.
2026-04-03 15:18:22 -04:00
47aeb58f7a llama-cpp: do logging
All checks were successful
Build and Deploy / deploy (push) Successful in 2m27s
2026-04-03 14:39:46 -04:00
daf82c16ba fix xmrig pause 2026-04-03 14:39:20 -04:00
d4d01d63f1 llama-cpp: update + re-enable + gemma 4 E4B
Some checks failed
Build and Deploy / deploy (push) Failing after 20m16s
2026-04-03 14:06:35 -04:00
e765a98487 recyclarr: reset back to default basically
All checks were successful
Build and Deploy / deploy (push) Successful in 2m15s
2026-04-03 13:45:26 -04:00
124d33963e organize
All checks were successful
Build and Deploy / deploy (push) Successful in 2m43s
2026-04-03 00:47:12 -04:00
1451f902ad grafana: re-organize 2026-04-03 00:39:42 -04:00
8e6619097d update
Some checks failed
Build and Deploy / deploy (push) Failing after 4m59s
2026-04-03 00:20:13 -04:00
c2ff07b329 llama-cpp: disable 2026-04-03 00:17:38 -04:00
9e235abf48 monitoring: fix disk-usage-collector timer calendar spec
All checks were successful
Build and Deploy / deploy (push) Successful in 2m14s
2026-04-03 00:17:21 -04:00
096ffeb943 llama-cpp: xmrig + grafana hooks 2026-04-03 00:17:17 -04:00
ab9c12cb97 llama-cpp: general changes 2026-04-03 00:17:14 -04:00
0aeb6c5523 llama-cpp: add API key auth via --api-key-file
Some checks failed
Build and Deploy / deploy (push) Failing after 2m49s
Generate and encrypt a Bearer token for llama-cpp's built-in auth.
Remove caddy_auth from the vhost since basic auth blocks Bearer-only
clients. Internal sidecars (xmrig-pause, annotations) connect
directly to localhost and are unaffected (/slots is public).
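External clients then authenticate by sending the generated token as a standard Bearer header. A sketch against the OpenAI-compatible chat endpoint (the URL, port, and path are assumptions for this host):

```python
import json
import urllib.request

def build_request(prompt: str, api_key: str,
                  url: str = "http://127.0.0.1:8080/v1/chat/completions"):
    """Build an authenticated POST request for llama-cpp's chat endpoint."""
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode()
    return urllib.request.Request(url, data=body, headers={
        # llama-cpp's built-in auth checks this when an API key is configured
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    })

def chat(prompt: str, api_key: str) -> dict:
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        return json.load(resp)
```

This is also why basic auth had to go: Caddy's basic auth expects `Authorization: Basic ...`, so a Bearer-only client can never satisfy both layers with one header.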
2026-04-02 18:02:23 -04:00
bfe7a65db2 monitoring: add zpool and boot partition usage metrics
Add textfile collector for ZFS pool utilization (tank, hdds) and
boot drive partitions (/boot, /persistent, /nix). Runs every 60s.
Add two Grafana dashboard panels: ZFS Pool Utilization and Boot
Drive Partitions as Row 5.
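The textfile-collector pattern here boils down to a timer-driven script rendering Prometheus text format and atomically replacing a `.prom` file that node_exporter's textfile collector reads. A sketch with hypothetical metric and pool names:

```python
import os
import tempfile

def render_usage_metrics(usage: dict[str, float]) -> str:
    """Render pool -> used-fraction as Prometheus text format."""
    lines = ["# TYPE zpool_used_ratio gauge"]
    for pool, ratio in sorted(usage.items()):
        lines.append(f'zpool_used_ratio{{pool="{pool}"}} {ratio:.4f}')
    return "\n".join(lines) + "\n"

def write_prom(path: str, content: str) -> None:
    # Write to a temp file and rename: node_exporter must never
    # observe a half-written .prom file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        f.write(content)
    os.replace(tmp, path)
```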
2026-04-02 18:02:23 -04:00
e41f869843 trilium: add self-hosted note-taking service
Add trilium-server on port 8787 behind Caddy reverse proxy at
notes.sigkill.computer. Data stored on ZFS tank pool with
serviceMountWithZpool for mount ordering.
2026-04-02 17:44:04 -04:00
9baeaa5c23 llama-cpp: add grafana annotations for inference requests
Poll /slots endpoint, create annotations when slots start processing,
close with token count when complete. Includes NixOS VM test with
mock llama-cpp and grafana servers. Dashboard annotation entry added.
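The open/close logic reduces to diffing successive `/slots` polls. A sketch of that transition step (the slot-state shape is an assumption modeled on the description above):

```python
def slot_transitions(prev: dict[int, bool], curr: dict[int, bool]) -> list:
    """Given slot id -> is_processing maps from two consecutive polls,
    return ('open'|'close', slot_id) events for the annotation API."""
    events = []
    for sid, busy in curr.items():
        was_busy = prev.get(sid, False)
        if busy and not was_busy:
            events.append(("open", sid))    # request started: create annotation
        elif was_busy and not busy:
            events.append(("close", sid))   # request finished: close, attach token count
    return events
```

Keeping the diff pure like this is what makes the NixOS VM test cheap: the mock llama-cpp and Grafana servers only need to exercise the HTTP edges.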
2026-04-02 17:43:49 -04:00
0235617627 monitoring: fix intel-gpu-collector crash resilience
Wrap entire read_one_sample() in try/except to handle all failures
(missing binary, permission errors, malformed JSON, timeouts).
Write zero-valued metrics on failure instead of exiting non-zero.
Increase timeout from 5s to 8s for slower GPU initialization.
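The fix amounts to one catch-all wrapper around the sampler. A sketch where `read_one_sample` is a hypothetical stand-in for the real reader (the `intel_gpu_top` invocation and JSON shape are assumptions):

```python
import json
import subprocess

TIMEOUT = 8  # raised from 5s for slower GPU initialization

def read_one_sample() -> float:
    """Hypothetical reader: one intel_gpu_top JSON sample -> busy percent."""
    out = subprocess.run(["intel_gpu_top", "-J", "-s", "1000"],
                         capture_output=True, timeout=TIMEOUT, check=True)
    return float(json.loads(out.stdout)["engines"]["Render/3D/0"]["busy"])

def safe_sample(reader=read_one_sample) -> float:
    try:
        return reader()
    except Exception:   # missing binary, permissions, bad JSON, timeout, ...
        return 0.0      # emit a zero-valued metric instead of exiting non-zero
```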
2026-04-02 17:43:13 -04:00
df15be01ea llama-cpp: pause xmrig during active inference requests
Add sidecar service that polls llama-cpp /slots endpoint every 3s.
When any slot is processing, stops xmrig. Restarts xmrig after 10s
grace period when all slots are idle. Handles unreachable llama-cpp
gracefully (leaves xmrig untouched).
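The sidecar's per-tick decision can be sketched as a pure function (names and the exact grace-period bookkeeping are assumptions; the real service maps the result onto `systemctl stop/start xmrig`):

```python
GRACE_PERIOD = 10.0  # seconds all slots must stay idle before restarting xmrig

def want_xmrig_running(slots_busy, idle_since, now: float):
    """Return True (start xmrig), False (stop it), or None (leave untouched).

    slots_busy: True/False from the /slots poll, or None if unreachable.
    idle_since: timestamp when all slots last became idle, or None.
    """
    if slots_busy is None:
        return None        # llama-cpp unreachable: do nothing
    if slots_busy:
        return False       # inference active: xmrig must stop
    if idle_since is not None and now - idle_since >= GRACE_PERIOD:
        return True        # idle long enough: bring xmrig back
    return None            # idle, but still inside the grace period
```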
2026-04-02 17:43:07 -04:00
50453cf0b5 llama-cpp: adjust args
All checks were successful
Build and Deploy / deploy (push) Successful in 2m32s
2026-04-02 16:09:17 -04:00
bb6ea2f1d5 llama-cpp: cpu only
All checks were successful
Build and Deploy / deploy (push) Successful in 20m0s
2026-04-02 15:32:39 -04:00
f342521d46 llama-cpp: re-add w/ turboquant
All checks were successful
Build and Deploy / deploy (push) Successful in 28m52s
2026-04-02 13:42:39 -04:00
7e779ca0f7 power optimizations 2026-04-02 13:13:38 -04:00
06b2016bd6 recyclarr: things 2026-04-01 20:37:18 -04:00
f9694ae033 qbt: fix categories
All checks were successful
Build and Deploy / deploy (push) Successful in 2m24s
2026-04-01 15:25:40 -04:00
f775f22dbf recyclarr: restart service after config change 2026-04-01 15:25:31 -04:00
1bb0844649 update
Some checks failed
Build and Deploy / deploy (push) Failing after 6m22s
2026-04-01 13:12:14 -04:00
297264a34a tests: extract shared jellyfin test helpers and use real jellyfin in annotations test
Some checks failed
Build and Deploy / deploy (push) Failing after 2m35s
2026-04-01 11:24:44 -04:00
a5206b9ec6 monitoring: add grafana annotations for zfs scrub events 2026-04-01 11:24:43 -04:00
3196b38db7 tests: extract shared mock grafana server from jellyfin test 2026-04-01 11:24:43 -04:00
59d33cea3d grafana: power improvement 2026-04-01 11:24:40 -04:00
fdf57873d7 prowlarr: fix perms
All checks were successful
Build and Deploy / deploy (push) Successful in 2m23s
2026-03-31 23:31:31 -04:00
f1b7679196 grafana: remove unused stuff
All checks were successful
Build and Deploy / deploy (push) Successful in 2m31s
2026-03-31 18:55:07 -04:00
5856d835ba grafana: qbt smoothing 2026-03-31 18:54:57 -04:00
a288e18e6d grafana: qbt stats
All checks were successful
Build and Deploy / deploy (push) Successful in 1m47s
2026-03-31 17:47:53 -04:00
c6b889cea3 grafana: more things
All checks were successful
Build and Deploy / deploy (push) Successful in 2m38s
1. Smoothed out power draw
- The UPS only reports in 9-watt increments, so smoothing gives more
relative detail on trends
2. Add Jellyfin integration
- Good for seeing correlations between system statistics and Jellyfin streams
3. Intel GPU stats
- Provides info on GPU utilization
2026-03-31 17:25:06 -04:00
0027489052 grafana: smooth power draw
All checks were successful
Build and Deploy / deploy (push) Successful in 2m29s
2026-03-31 13:20:07 -04:00
ebc4c66fc3 update
All checks were successful
Build and Deploy / deploy (push) Successful in 8m58s
2026-03-31 12:52:21 -04:00
bc227a89c1 remove old secrets
All checks were successful
Build and Deploy / deploy (push) Successful in 2m12s
2026-03-31 12:47:09 -04:00
e3be112b82 grafana: init
All checks were successful
Build and Deploy / deploy (push) Successful in 2m14s
Shows power draw, temps, uptime, and Jellyfin streams
2026-03-31 12:38:43 -04:00
5375f8ee34 gitea: add actions runner and CI/CD deploy workflow
This avoids having to run "deploy" manually from my laptop.
All I need to do is push a commit and it will self-deploy.
2026-03-31 12:38:43 -04:00