Merve Noyan

merve

AI & ML interests

VLMs, vision & co

Recent Activity

posted an update about 9 hours ago

The authors of ColPali trained a retrieval model based on SmolVLM 🤠 https://huggingface.co/vidore/colsmolvlm-alpha
TL;DR:
- ColSmolVLM performs better than ColPali and DSE-Qwen2 on all English tasks
- ColSmolVLM is more memory efficient than ColQwen2 💗

updated a Space about 11 hours ago

HuggingFaceTB/SmolVLM

liked a model 1 day ago

HuggingFaceTB/SmolLM2-1.7B

Articles

SmolVLM - small yet mighty Vision Language Model

Llama can now see and run on your device - welcome Llama 3.2

Preference Optimization for Vision Language Models

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

PaliGemma – Google's Cutting-Edge Open Vision Language Model

Vision Language Models Explained

Introduction to Quantization cooked in 🤗 with 💗🧑‍🍳

Deploy MusicGen in no time with Inference Endpoints

Open-Source Text Generation & LLM Ecosystem at Hugging Face

Jupyter X Hugging Face

Using Machine Learning to Aid Survivors and Race through Time

Introducing Skops

Announcing the Hugging Face Fellowship Program

Hosting your Models and Datasets on Hugging Face Spaces using Streamlit

Showcase Your Projects in Spaces using Gradio

Organizations

Posts 77

Post

375

The authors of ColPali trained a retrieval model based on SmolVLM 🤠 vidore/colsmolvlm-alpha
TL;DR:

- ColSmolVLM performs better than ColPali and DSE-Qwen2 on all English tasks

- ColSmolVLM is more memory efficient than ColQwen2 💗

Post

1981

Small yet mighty! 💫

We are releasing SmolVLM: a new 2B small vision language model made for on-device use, fine-tunable on a consumer GPU, and immensely memory efficient 🤠

We release three checkpoints under Apache 2.0: SmolVLM-Instruct, SmolVLM-Synthetic and SmolVLM-Base (collection: HuggingFaceTB/smolvlm-6740bd584b2dcbf51ecb1f39)

Learn more from our blog: huggingface.co/blog/smolvlm
This release comes with a demo, fine-tuning code, MLX integration and TRL integration for DPO 💝
Try the demo: HuggingFaceTB/SmolVLM
Fine-tuning recipe: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb

Collections 35

Spaces 104

Running on Zero

OWLSAM

State-of-the-art open-vocabulary image segmentation ⚡️

Sam2.1

SuperPoint

Running on CPU Upgrade

Gradio Tgi

Vision Papers

OWLSAM2

Models 87

merve/paligemma_vqav2

Image-Text-to-Text • Updated 20 days ago • 309 • 10

merve/google-ckpts

merve/google-tokenizers

merve/idefics3-llama-vqav2

merve/idefics3llama-vqav2

Updated Sep 11 • 8

merve/flux-dreambooth-lora

Updated Aug 16 • 1

merve/trained-flux-lora-lego

Text-to-Image • Updated Aug 16 • 8 • 1

merve/flux-lego-lora-dreambooth

Text-to-Image • Updated Aug 16 • 2.91k • 13

merve/sam2-hiera-large

Mask Generation • Updated Aug 2 • 18.3k • 2

merve/sam2-hiera-base-plus

Mask Generation • Updated Aug 2 • 40

Datasets 26

merve/model-test-inputs

Updated Oct 21 • 40

merve/vqav2-small

Viewer • Updated Aug 8 • 21.4k • 854 • 8

merve/SGinW

Preview • Updated Jul 11 • 461

merve/pascal-voc

Viewer • Updated Jul 6 • 336k • 521

merve/YouCook2

Viewer • Updated May 28 • 2k • 64

merve/faiss_embeddings

Updated Jan 25 • 19

merve/pokemon-ds-embeddings

Viewer • Updated Jan 10 • 833 • 64 • 4

merve/tr-h4-norobots

Updated Jan 7 • 70 • 10

merve/lego_sets_latest

Viewer • Updated Jan 6 • 61 • 160 • 2

merve/ai-tube-dummy

Updated Dec 1, 2023 • 52