Simon Gardling 479ec43b8f llama-cpp: integrate native prometheus /metrics endpoint
llama.cpp server has a built-in /metrics endpoint exposing
prompt_tokens_seconds, predicted_tokens_seconds, tokens_predicted_total,
n_decode_total, and n_busy_slots_per_decode. Enable it with --metrics
and add a Prometheus scrape target, removing the need for external
metric collection for LLM inference monitoring.
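A minimal sketch of the change in NixOS module terms, assuming the nixpkgs `services.llama-cpp` and `services.prometheus` modules; the port and job name are illustrative, not taken from this commit:

```nix
{
  # Run llama.cpp server with its built-in Prometheus endpoint enabled.
  services.llama-cpp = {
    enable = true;
    port = 8080;                    # assumed port, adjust to taste
    extraFlags = [ "--metrics" ];   # exposes /metrics on the same port
  };

  # Point Prometheus at the server's /metrics endpoint.
  services.prometheus.scrapeConfigs = [{
    job_name = "llama-cpp";
    static_configs = [{ targets = [ "127.0.0.1:8080" ]; }];
  }];
}
```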
2026-04-03 15:19:11 -04:00
Description
Unified NixOS flake for mreow, yarn, muffin
3.7 MiB
Languages
Nix 84.6%
Python 10.7%
Emacs Lisp 2.6%
Shell 2.1%