Need GGUF files for new model versions (revisions)

#25
by mohan007 - opened

Need GGUF files for the new model version (revision = 2024-05-20).

The new revision is currently not compatible with llama.cpp due to an update to the projection architecture. Working on it.

Hi @vikhyatk, in that case, what is the best way to speed up inference for moondream2? I'm currently using Hugging Face with batch inference on an NVIDIA 4090 GPU (for reference).
