Need GGUF files for the new model versions (revisions)
#25
by
mohan007
- opened
We need GGUF files for the new model version (revision = 2024-05-20).
It's currently not compatible with llama.cpp due to an update to the projection architecture. I'm working on it.
Hi @vikhyatk, in that case, what is the best way to speed up inference for moondream2? For reference, I'm currently using Hugging Face with batch inference on an NVIDIA 4090 GPU.