Commit Graph

25 Commits

SHA1 Message Date
84b0913fa6 llama-cpp: use gpt-oss-20b-mxfp4 2026-03-03 14:28:58 -05:00
07b4fc2d90 extend nixpkgs's lib instead 2026-03-03 14:28:46 -05:00
432d53318a DeepSeek-R1-0528-Qwen3-8B 2026-03-03 14:28:25 -05:00
22f6682cee llama-cpp: use q8 quantization instead of q4 2026-03-03 14:28:22 -05:00
5835da1f7b llama-cpp: disable gpu 2026-03-03 14:28:21 -05:00
c8c150e10c llama-cpp: vulkan broken 2026-03-03 14:28:21 -05:00
efb0bd38e8 llama-cpp: disable flash attn 2026-03-03 14:28:20 -05:00
0f46de5eb7 llama-cpp: nvidia-acereason-7b 2026-03-03 14:28:19 -05:00
cf3e032acb llm: use vulkan 2026-03-03 14:28:00 -05:00
51704a0543 llm: use xiomo model 2026-03-03 14:27:58 -05:00
a5f4f65894 deepcoder 14b 2026-03-03 14:27:47 -05:00
06f47a32af change llm model 2026-03-03 14:27:45 -05:00
d52154770e llm: model stuff 2026-03-03 14:27:40 -05:00
5161e62433 create single function to optimize for system 2026-03-03 14:27:37 -05:00
99978c108b move optimizeWithFlags 2026-03-03 14:27:37 -05:00
3c727db2b2 fmt 2026-03-03 14:27:32 -05:00
d8d90a2cfd llm: use finetuned model 2026-03-03 14:27:30 -05:00
3119cc3594 gemma-3 27b 2026-03-03 14:27:30 -05:00
516e2391a7 llm: use Q4_0 quants (faster) 2026-03-03 14:27:29 -05:00
8ac8f70700 format 2026-03-03 14:27:29 -05:00
4a3b1b14f2 llm: enable AVX2 2026-03-03 14:27:28 -05:00
75ea442642 llama-cpp: compiler optimizations 2026-03-03 14:27:27 -05:00
6cd839cdce gemma-3 12b 2026-03-03 14:27:27 -05:00
6097b3ce0f auth for llm 2026-03-03 14:27:26 -05:00
925031c640 add llama-server 2026-03-03 14:27:26 -05:00