Closed
Description
As of this PR to llama.cpp, the CUDA binaries can run on the CPU alone, as long as n_gpu_layers = 0.
This might mean that we can significantly simplify our distribution of binaries by removing the CPU-only variants and shipping only the CUDA ones.
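A minimal sketch of what the simplified setup might look like on the caller's side, assuming a single CUDA build is shipped and the only remaining decision is which n_gpu_layers value to pass. The function name `choose_gpu_layers` and its parameters are hypothetical and not part of llama.cpp or of this project; how GPU presence is detected is left out on purpose.

```python
def choose_gpu_layers(cuda_device_present: bool, requested_layers: int) -> int:
    """Pick the n_gpu_layers value to pass to a single CUDA build.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Without a usable GPU, offload nothing so the CUDA build takes the
    # CPU-only path described in the issue (n_gpu_layers = 0).
    if not cuda_device_present:
        return 0
    # With a GPU present, honour whatever the caller asked for.
    return requested_layers


if __name__ == "__main__":
    print(choose_gpu_layers(cuda_device_present=False, requested_layers=32))
    print(choose_gpu_layers(cuda_device_present=True, requested_layers=32))
```

Under this assumption, the choice between CPU and GPU becomes a runtime parameter rather than a choice between two sets of binaries.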
Status
✅ Done