Daniel van Strien PRO

davanstrien

AI & ML interests

Machine Learning Librarian

davanstrien's activity

Reacted to andito's post with 🔥 about 7 hours ago
Let's go! We are releasing SmolVLM, a smol 2B VLM built for on-device inference that outperforms all models at similar GPU RAM usage and token throughput.

- SmolVLM generates tokens 7.5 to 16 times faster than Qwen2-VL! 🤯
- Other models at this size crash a laptop, but SmolVLM comfortably generates 17 tokens/sec on a MacBook! 🚀
- SmolVLM can be fine-tuned on Google Colab! Or process millions of documents with a consumer GPU!
- SmolVLM even outperforms larger models in video benchmarks, despite not even being trained on videos!

Check out more!
Demo: HuggingFaceTB/SmolVLM
Blog: https://huggingface.co/blog/smolvlm
Model: HuggingFaceTB/SmolVLM-Instruct
Fine-tuning script: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
Reacted to nataliaElv's post with 👀 about 10 hours ago
Would you like to get a high-quality dataset to pre-train LLMs in your language? 🌏

At Hugging Face we're preparing a collaborative annotation effort to build an open-source multilingual dataset as part of the Data is Better Together initiative.

Follow the link below, check if your language is listed and sign up to be a Language Lead!

https://forms.gle/s9nGajBh6Pb9G72J6
New activity in bluesky-community/one-million-bluesky-posts about 22 hours ago

🚩 Report: Legal issue(s)

1
#13 opened about 22 hours ago by tigeryfan

Discussion about dataset removal

39
#12 opened about 22 hours ago by tobiasdrundridge
New activity in bluesky-community/one-million-bluesky-posts about 23 hours ago

Language tags

1
#1 opened 1 day ago by nataliaElv

Opt out?

2
#2 opened 1 day ago by John-breen

CCPA Compliance

1
#7 opened about 23 hours ago by ogma1

🚩 Report: Legal issue(s)

4
#6 opened about 23 hours ago by RoscoG

🚩 Report: Legal issue(s)

1
#10 opened about 23 hours ago by ShadowKyogre

Add note about the removal of the data

#11 opened about 23 hours ago by davanstrien

remove base data

#8 opened about 23 hours ago by davanstrien

Delete with-language-predictions

#9 opened about 23 hours ago by davanstrien