The authors of ColPali trained a retrieval model based on SmolVLM 🤠 vidore/colsmolvlm-alpha
TLDR:
- ColSmolVLM performs better than ColPali and DSE-Qwen2 on all English tasks
- ColSmolVLM is more memory efficient than ColQwen2 💗
Small yet mighty! 💫
We are releasing SmolVLM: a new small 2B vision language model made for on-device use, fine-tunable on a consumer GPU, and immensely memory efficient 🤠
We release three checkpoints under Apache 2.0: SmolVLM-Instruct, SmolVLM-Synthetic and SmolVLM-Base HuggingFaceTB/smolvlm-6740bd584b2dcbf51ecb1f39
Learn more from our blog here: huggingface.co/blog/smolvlm
This release comes with a demo, fine-tuning code, MLX integration and TRL integration for DPO 💝
Try the demo: HuggingFaceTB/SmolVLM
Fine-tuning Recipe: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
Nov 22 Releases ❄️
mistralai/Pixtral-Large-Instruct-2411 Updated 8 days ago • 382 • 341
microsoft/orca-agentinstruct-1M-v1 Viewer • Updated 27 days ago • 1.05M • 3.31k • 366
Xkev/Llama-3.2V-11B-cot Image-Text-to-Text • Updated 6 days ago • 5.24k • 86
jinaai/jina-clip-v2 Feature Extraction • Updated 1 day ago • 3.44k • 75
Nov 15 Releases 🍂
microsoft/LLM2CLIP-EVA02-L-14-336 Zero-Shot Image Classification • Updated 5 days ago • 3k • 41
microsoft/LLM2CLIP-EVA02-B-16 Updated 5 days ago • 288 • 6
PleIAs/common_corpus Viewer • Updated 5 days ago • 397M • 50.3k • 165
Qwen/Qwen2.5-Coder-32B-Instruct Text Generation • Updated 9 days ago • 92.4k • 1.07k