How do I convert v0 to v1 for the new llama.cpp?
#1 by jimaldon - opened
The current v0 is incompatible with llama.cpp.
Oops I completely forgot about this one. I'll do it later today.
You'll need the Hugging Face-converted PyTorch files, then merge them into a single file; there should be a script for hf to pth. Then convert the pth file into a GGML f32 file (option 0), and quantize it to q4_1 (option 3).
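A rough sketch of those last two steps, assuming the conversion script and quantize tool shipped with llama.cpp at the time; the model directory `models/7B/` and exact script names are placeholders and may differ in your checkout:

```shell
# Convert the merged pth checkpoint to a GGML f32 file (ftype 0).
python3 convert-pth-to-ggml.py models/7B/ 0

# Quantize the f32 model to q4_1 (quantization type 3).
./quantize models/7B/ggml-model-f32.bin models/7B/ggml-model-q4_1.bin 3
```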
Any updates on the q4_1 model?