9 Commits

Author SHA1 Message Date
dad3867144 grafana: fix llama-cpp annotation query format for Grafana 12
All checks were successful
Build and Deploy / deploy (push) Successful in 2m42s
Grafana 12 expects Prometheus annotation queries wrapped in a 'target'
object with datasource, expr, refId, and range fields. The previous
format had expr/step as top-level fields which Grafana silently ignored.
2026-04-09 22:19:21 -04:00
e9ce1ce0a2 grafana: replace llama-cpp-annotations daemon with prometheus query 2026-04-09 19:54:57 -04:00
a3a6700106 grafana: replace disk-usage-collector with prometheus-zfs-exporter
The custom disk-usage-collector shell script + minutely timer is replaced
by prometheus-zfs-exporter (pdf/zfs_exporter, packaged in nixpkgs as
services.prometheus.exporters.zfs). The exporter provides pool capacity
metrics (allocated/free/size) natively.

Partition metrics (/boot, /persistent, /nix) now use node_exporter's
built-in filesystem collector (node_filesystem_*_bytes) which already
runs and collects these metrics.

Also fixes a latent race condition in serviceMountWithZpool: the -mounts
service now orders after zfs-mount.service (which runs 'zfs mount -a'),
not just after pool import. Without this, the mount check could run
before datasets are actually mounted.
2026-04-09 19:54:57 -04:00
75319256f3 lib: add mkCaddyReverseProxy, mkFail2banJail, mkGrafanaAnnotationService, extractArrApiKey 2026-04-09 19:54:57 -04:00
0df5d98770 grafana: use postgresql
All checks were successful
Build and Deploy / deploy (push) Successful in 2m45s
Doesn't use for data, only annotation and other stuff
2026-04-07 12:44:59 -04:00
2848c7e897 grafana: keep data forever 2026-04-07 12:44:46 -04:00
3f62b9c88e grafana: replace custom metric collectors with community exporters
Replace three custom Prometheus textfile collector scripts with
dedicated community-maintained exporters:

- jellyfin-collector.nix (25 LoC shell) -> rebelcore/jellyfin_exporter
  Metric: jellyfin_active_streams -> count(jellyfin_now_playing_state)
  Bonus: per-session labels (user, title, device, codec info)

- qbittorrent-collector.nix (40 LoC shell) -> anriha/qbittorrent-metrics-exporter
  Metric: qbittorrent_{download,upload}_bytes_per_second -> qbit_{dl,up}speed
  Bonus: per-torrent metrics with category/tag aggregation

- intel-gpu-collector.nix + .py (130 LoC Python) -> mike1808/igpu-exporter
  Metric: intel_gpu_engine_busy_percent -> igpu_engines_busy_percent
  Bonus: persistent daemon vs oneshot timer, no streaming JSON parser

All three run as persistent daemons scraped by Prometheus, replacing
the textfile-collector pattern of systemd timers writing .prom files.
Dashboard PromQL queries updated to match new metric names.
2026-04-03 15:38:13 -04:00
479ec43b8f llama-cpp: integrate native prometheus /metrics endpoint
llama.cpp server has a built-in /metrics endpoint exposing
prompt_tokens_seconds, predicted_tokens_seconds, tokens_predicted_total,
n_decode_total, and n_busy_slots_per_decode. Enable it with --metrics
and add a Prometheus scrape target, replacing the need for any external
metric collection for LLM inference monitoring.
2026-04-03 15:19:11 -04:00
1451f902ad grafana: re-organize 2026-04-03 00:39:42 -04:00