Using gguf format

by Tomy99999

This model was awesome. Is there any way that I could convert this into gguf format?

From what I understand, someone would need to extract the mmproj and quantize the two parts separately, since llama.cpp requires two files for multimodality.
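
Roughly, the two-file setup looks like this with llama.cpp's llava-cli (just a sketch; the file names below are placeholders, not actual release artifacts):

```bash
# Two GGUF files: the quantized language model and the vision projector
# (mmproj), which is usually kept at f16. File names are placeholders.
./llava-cli \
  -m ./MiniCPM-Llama3-V-2_5-Q4_K_M.gguf \
  --mmproj ./mmproj-model-f16.gguf \
  --image ./demo.jpg \
  -p "Describe this image."
```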

OpenBMB org

GGUF format is coming soon

OpenBMB org

MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of llama.cpp for more details.

And here is the MiniCPM-Llama3-V-2_5-gguf repo:
https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf
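
A rough sketch of the steps (check the fork's README for the exact branch and binary names; the GGUF file names below are placeholders):

```bash
# Sketch only: verify the branch and binary names against the fork's
# README before running. GGUF file names are placeholders.
git clone https://github.com/OpenBMB/llama.cpp.git
cd llama.cpp
git checkout minicpm-v2.5   # assumed branch carrying the 2.5 support
make

# Download the two GGUF files from openbmb/MiniCPM-Llama3-V-2_5-gguf, then:
./minicpmv-cli \
  -m ./MiniCPM-Llama3-V-2_5-Q4_K_M.gguf \
  --mmproj ./mmproj-model-f16.gguf \
  --image ./demo.jpg \
  -p "What is in the image?"
```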

Thanks, this is awesome. Is there any plan to merge the update into the main llama.cpp repository?

Things like LM-Studio & KoboldCPP would benefit from a merge into main 😸

OpenBMB org

@saishf @Tomy99999
We are working on it. It may take some time because of the differences between MiniCPM-V 2.0 and 2.5, which make it hard to fit both into one repo. For now, we are also working on Ollama.
If you have time to help with this PR, we'd appreciate it!

For reference, I created a new thread for it: https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5/discussions/45
Right now MiniCPM is not working on llama.cpp (or any of its wrappers): it will load and run inference, but the quality is below vanilla LLaVA-1.5. The fork is also no longer compatible, since Ollama depends on the latest llama.cpp commits; any work needs to land in upstream llama.cpp to survive.
So at this point, using any of the LLaVA-1.5 or LLaVA-1.6 variants is the best way to go on llama.cpp/Ollama until support for the current SOTA models (Phi-v-128 and MiniCPM-2.5) is available.
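
For anyone who wants the interim route, a minimal sketch using Ollama's stock LLaVA build (model tag from Ollama's public library; the image path is a placeholder):

```bash
# Interim option: stock LLaVA already works with Ollama.
# The image path is a placeholder; Ollama loads images referenced in the prompt.
ollama pull llava:13b
ollama run llava:13b "Describe this image: ./demo.jpg"
```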
