llama-cpp: add grafana annotations for inference requests
Poll the /slots endpoint, create an annotation when a slot starts processing, and close it with the token count when the slot completes. Includes a NixOS VM test with mock llama-cpp and Grafana servers. Also adds a dashboard annotation entry.
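
The poll-and-diff step described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit. It assumes each entry returned by /slots carries an `id`, an `is_processing` flag, and a `next_token.n_decoded` counter, and these field names may differ between llama-cpp versions. The function name `diff_slots` and the event tuples are hypothetical. HTTP calls to llama-cpp and Grafana are deliberately left out so the state tracking can be read on its own.

```python
from typing import Any


def diff_slots(
    active: dict[int, int], slots: list[dict[str, Any]]
) -> tuple[list[tuple], dict[int, int]]:
    """Compare tracked slots against one /slots poll.

    `active` maps slot id -> last token count seen while processing.
    Returns (events, new_active), where events are ("start", id) or
    ("finish", id, tokens). Field names below are assumptions.
    """
    events: list[tuple] = []
    new_active: dict[int, int] = {}
    seen: set[int] = set()

    for slot in slots:
        sid = slot["id"]
        seen.add(sid)
        processing = bool(slot.get("is_processing", False))
        n_decoded = int(slot.get("next_token", {}).get("n_decoded", 0))

        if processing:
            if sid not in active:
                # Slot just became busy: an annotation would be opened here.
                events.append(("start", sid))
            new_active[sid] = n_decoded
        elif sid in active:
            # Slot went idle: close with the larger of the current counter
            # and the last one seen, in case the counter was reset.
            events.append(("finish", sid, max(n_decoded, active[sid])))

    # Slots that vanished from the response are closed with the last count.
    for sid, tokens in active.items():
        if sid not in seen:
            events.append(("finish", sid, tokens))

    return events, new_active
```

A caller would run this once per poll interval, turning each "start" event into a new Grafana annotation and each "finish" event into an update that sets the end time and token count on the matching annotation.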
@@ -28,6 +28,9 @@ in
 # zfs scrub annotations test
 zfsScrubAnnotationsTest = handleTest ./zfs-scrub-annotations.nix;
 
+# llama-cpp annotation service test
+llamaCppAnnotationsTest = handleTest ./llama-cpp-annotations.nix;
+
 # ntfy alerts test
 ntfyAlertsTest = handleTest ./ntfy-alerts.nix;
 