Using gguf format

by Tomy99999

This model was awesome. Is there any way that I could convert this into gguf format?

From what I understand, someone would need to extract the mmproj and quantize the two parts separately, since llama.cpp requires two files for multimodality.
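
Roughly, the two-file setup looks like this with llama.cpp's llava-cli (just a sketch; the file names below are placeholders, not actual release artifacts):

```bash
# Two GGUF files: the quantized language model and the vision projector
# (mmproj), which is usually kept at f16. File names are placeholders.
./llava-cli \
  -m ./MiniCPM-Llama3-V-2_5-Q4_K_M.gguf \
  --mmproj ./mmproj-model-f16.gguf \
  --image ./demo.jpg \
  -p "Describe this image."
```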

OpenBMB org

GGUF format is coming soon

OpenBMB org

MiniCPM-Llama3-V 2.5 can run with llama.cpp now! See our fork of llama.cpp for more details.

And here is the MiniCPM-Llama3-V-2_5-gguf repo:
https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf
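
A rough sketch of the steps (check the fork's README for the exact branch and binary names; the GGUF file names below are placeholders):

```bash
# Sketch only: verify the branch and binary names against the fork's
# README before running. GGUF file names are placeholders.
git clone https://github.com/OpenBMB/llama.cpp.git
cd llama.cpp
git checkout minicpm-v2.5   # assumed branch carrying the 2.5 support
make

# Download the two GGUF files from openbmb/MiniCPM-Llama3-V-2_5-gguf, then:
./minicpmv-cli \
  -m ./MiniCPM-Llama3-V-2_5-Q4_K_M.gguf \
  --mmproj ./mmproj-model-f16.gguf \
  --image ./demo.jpg \
  -p "What is in the image?"
```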

Thanks, this is awesome. Is there any plan to merge the update into the main llama.cpp repository?

Things like LM-Studio & KoboldCPP would benefit from a merge into main 😸

OpenBMB org

@saishf @Tomy99999
We are working on it. It may take some time because of the differences between MiniCPM-V 2.0 and 2.5, which make it hard to fit both into one repo. For now, we are also working on Ollama.
If you have time to help with this PR, we'd appreciate it!

For reference, I created a new thread for it: https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5/discussions/45
Right now MiniCPM is not working on llama.cpp (or any of its wrappers): it will load and run inference, but the quality is below vanilla LLaVA-1.5. The fork is also no longer compatible, since Ollama depends on the latest llama.cpp commits; any work needs to land in upstream llama.cpp to survive.
So at this point, using any of the LLaVA-1.5 or LLaVA-1.6 variants is the best way to go on llama.cpp/Ollama until support for the current SOTA models (Phi-v-128 and MiniCPM-2.5) is available.
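
For anyone who wants the interim route, a minimal sketch using Ollama's stock LLaVA build (model tag from Ollama's public library; the image path is a placeholder):

```bash
# Interim option: stock LLaVA already works with Ollama.
# The image path is a placeholder; Ollama loads images referenced in the prompt.
ollama pull llava:13b
ollama run llava:13b "Describe this image: ./demo.jpg"
```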
