Daniel van Strien (davanstrien)

AI & ML interests: Machine Learning Librarian

davanstrien's activity

replied to their post 10 days ago

Thanks for the great video! Very nice that Binary Quantization works well too!

posted an update 10 days ago
reacted to malhajar's post with ❤️ 18 days ago
🇫🇷 Official launch of the OpenLLM French Leaderboard: an open-source initiative for benchmarking French-language LLMs

After a great deal of effort and sweat with Alexandre Lavallee, we are delighted to announce that the OpenLLMFrenchLeaderboard is live on Hugging Face (space url: le-leadboard/OpenLLMFrenchLeaderboard), the very first platform dedicated to evaluating large language models (LLMs) in French. 🇫🇷✨

This long-term project is above all a labor of passion, but more than anything an absolute necessity. It is becoming urgent and vital to push for more transparency in the strategic field of so-called multilingual LLMs. The first building block is therefore a systematic and systemic evaluation of current and future models.

Is your French AI model ready to stand out? Submit it in our Space and see how you stack up against the other models.

❓ How it works:
Submit your French LLM for evaluation, and we will test it on reference benchmarks specifically adapted for the French language. Our benchmark suite includes:

- BBH-fr: Complex reasoning
- IFEval-fr: Instruction following
- GPQA-fr: Advanced knowledge
- MUSR-fr: Narrative reasoning
- MATH_LVL5-fr: Mathematical ability
- MMMLU-fr: Multitask understanding

The process is still manual, but we are working on automating it with the support of the Hugging Face community.

@clem, shall we get ready for a Space upgrade? 😏👀

This is not just about numbers: it is about building AI that truly reflects our language, our culture, and our values. The OpenLLMFrenchLeaderboard is our personal contribution to shaping the future of LLMs in France.
posted an update about 1 month ago
reacted to jsulz's post with 🚀 about 1 month ago
In August, the XetHub team joined Hugging Face (https://huggingface.co/blog/xethub-joins-hf), and we've been rolling up our sleeves to bring the best of both worlds together. We started with a deep dive into the current state of files stored with Git LFS on the Hub.

Getting this information was no small feat. We had to:
* Analyze a complete database dump of all repositories and files stored in Git LFS across Hugging Face.
* Parse through metadata on file sizes and types to accurately map the storage breakdown across Spaces, Models, and Datasets.

You can read more about the findings (with some jaw-dropping stats + charts) here: https://www.linkedin.com/feed/update/urn:li:activity:7244486280351285248
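
Neither the database dump nor the parsing code is public, but as a toy sketch of the kind of aggregation involved (the CSV file and its columns below are entirely hypothetical):

```python
# A toy sketch of the storage-breakdown aggregation described above. The CSV
# and its columns (repo_type, extension, size_bytes) are entirely hypothetical.
import pandas as pd

df = pd.read_csv("lfs_files.csv")  # hypothetical export of Git LFS file metadata
breakdown = (
    df.groupby(["repo_type", "extension"])["size_bytes"]
    .sum()
    .sort_values(ascending=False)
)
print(breakdown.head(10))  # largest contributors by repo type and file extension
```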
reacted to asoria's post with 👍 about 1 month ago
reacted to m-ric's post with 🔥 about 1 month ago
🌎 The first ever foundation weather model: Prithvi WxC enables life-saving weather predictions

Hurricane Katrina killed hundreds of people as it made landfall near New Orleans in 2005; many of these deaths could have been avoided if alerts had gone out one day earlier. Accurate weather forecasts are genuinely life-saving.

🔥 Now, NASA and IBM have just dropped a game-changing new model: the first ever foundation model for weather! That means it's the first time we have a generalist model, one not restricted to a single task but able to predict 160 weather variables!

Prithvi WxC (Prithvi, “पृथ्वी”, is the Sanskrit word for Earth) is a 2.3-billion-parameter model with an architecture close to previous vision transformers like Hiera.

💡 But it comes with some important tweaks: under the hood, Prithvi WxC uses a clever transformer-based architecture with 25 encoder and 5 decoder blocks. It alternates between "local" and "global" attention to capture both regional and global weather patterns.

Key insights:
🔮 Nails short-term forecasts - Prithvi WxC crushed it on 6-12 hour predictions, even outperforming some traditional numerical weather models
🌀 Tracks hurricanes like a champ - For Hurricane Ida, it predicted the landfall location within 5 km (vs 20+ km errors from other AI models), which is huge progress!
🔍 6x downscaling power - Can zoom in on weather data to 6x higher resolution with 4x lower error than basic methods
🌊 Models elusive gravity waves - Accurately simulates these crucial but hard-to-capture atmospheric oscillations

As climate change intensifies, tools like Prithvi WxC will become more and more crucial for avoiding disasters!

Announcement post 👉 https://newsroom.ibm.com/2024-09-23-ibm-and-nasa-release-open-source-ai-model-on-hugging-face-for-weather-and-climate-applications

Model on the Hub 👉 https://huggingface.co/Prithvi-WxC

Thank you @clem for highlighting it!
reacted to fdaudens's post with 🚀🔥 about 1 month ago
reacted to fdaudens's post with 🔥 about 1 month ago
Exciting news in AI: Molmo, a groundbreaking family of open-source multimodal models, has just been announced! 🚀

Key points:
- Closes the gap with proprietary systems on benchmarks & human evals
- Trained on high-quality data (< 1M image-text pairs vs billions)
- Introduces pointing capability for rich interactions
- Fully open weights, data, and training code

The 72B model outperforms several proprietary systems, while the 1B model nearly matches GPT-4V. Small is indeed the new big in AI!

There's an interactive demo available using Molmo-7B-D. Definitely worth checking out to see its capabilities firsthand.

All model weights, data, and code will be released soon. This is a significant step towards truly open, cutting-edge multimodal AI.
The future of AI research and applications is looking brighter than ever! 🤖🖼️

👉 Demo: https://molmo.allenai.org/
👉 Models: allenai/molmo-66f379e6fe3b8ef090a8ca19

#AI #MachineLearning #OpenSource #ComputerVision
posted an update about 2 months ago
Yesterday, I shared a blog post on generating data for fine-tuning ColPali using the Qwen/Qwen2-VL-7B-Instruct model.

To simplify testing this approach, I created a Space that lets you generate queries from an input document page image: davanstrien/ColPali-Query-Generator

I think there is much room for improvement, but I'm excited about the potential for relatively small VLMs to create synthetic data.

You can read the original blog post that goes into more detail here: https://danielvanstrien.xyz/posts/post-with-code/colpali/2024-09-23-generate_colpali_dataset.html
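
For a feel of the approach before reading the post, here is a minimal sketch of prompting Qwen/Qwen2-VL-7B-Instruct with a document page image. The prompt wording, image path, and generation settings are illustrative assumptions, not the blog post's exact pipeline:

```python
# A minimal sketch, not the blog post's exact pipeline: the prompt, image path,
# and generation settings below are illustrative assumptions.
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("document_page.png")  # e.g. one scanned document page
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {
                "type": "text",
                "text": "Write a short search query a user might type to find this page.",
            },
        ],
    }
]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=64)
# Strip the prompt tokens so only the generated query is decoded.
query = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(query)
```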
reacted to pain's post with ❤️ about 2 months ago
posted an update about 2 months ago
ColPali is revolutionizing multimodal retrieval, but could it be even more effective with domain-specific fine-tuning?

Check out my latest blog post, where I guide you through creating a ColPali fine-tuning dataset using Qwen/Qwen2-VL-7B-Instruct to generate queries for a collection of UFO documents sourced from the Internet Archive.

The post covers:
- Introduction to data for ColPali models
- Using Qwen2-VL for retrieval query generation
- Tips for better query generation

Read the post here:
https://danielvanstrien.xyz/posts/post-with-code/colpali/2024-09-23-generate_colpali_dataset.html

The resulting Hugging Face dataset: davanstrien/ufo-ColPali
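
If you just want to poke at the data directly, here is a minimal sketch using the datasets library (assuming the default train split; check the dataset page for the actual columns):

```python
# A minimal sketch of loading the dataset; assumes the default "train" split.
from datasets import load_dataset

ds = load_dataset("davanstrien/ufo-ColPali", split="train")
print(ds)            # number of rows and column names
print(ds[0].keys())  # inspect one example's fields
```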
reacted to MoritzLaurer's post with 👀 about 2 months ago
The new NIM Serverless API by HF and Nvidia is a great option if you want a reliable API for open-weight LLMs like Llama-3.1-405B that are too expensive to run on your own hardware.

- It's pay-as-you-go, so it doesn't have rate limits like the standard HF Serverless API and you don't need to commit to hardware like for a dedicated endpoint.
- It works out of the box with the new v0.25 release of our huggingface_hub.InferenceClient.
- It's specifically tailored to a small collection of popular open-weight models. For a broader selection of open models, we recommend using the standard HF Serverless API.
- Note that you need a token from an Enterprise Hub organization to use it.

Details in this blog post: https://huggingface.co/blog/inference-dgx-cloud
Compatible models in this HF collection: nvidia/nim-serverless-inference-api-66a3c6fcdcb5bbc6e975b508
Release notes with many more features of huggingface_hub==0.25.0: https://github.com/huggingface/huggingface_hub/releases/tag/v0.25.0

Copy-pasteable code in the first comment:
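
The copy-pasteable snippet itself lives in the comment thread; as a rough orientation, a minimal sketch of the generic chat-completion interface in huggingface_hub >= 0.25 might look like this (the model id and token are placeholders, and any NIM-specific routing is covered in the linked blog post):

```python
# A minimal sketch of the generic InferenceClient chat-completion interface in
# huggingface_hub >= 0.25. The model id and token are placeholders; NIM-specific
# routing details are in the linked blog post.
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="meta-llama/Llama-3.1-405B-Instruct",  # placeholder: pick a model from the NIM collection
    token="hf_xxx",  # token from an Enterprise Hub organization
)

response = client.chat_completion(
    messages=[{"role": "user", "content": "Summarize the NIM Serverless API in one sentence."}],
    max_tokens=120,
)
print(response.choices[0].message.content)
```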
replied to MoritzLaurer's post about 2 months ago

Very exciting to see this! I often want to use an LLM for a short period, and setting up a whole endpoint for this can be overkill. This seems like a very neat solution!

Do you think there is a chance that any VLMs will be added soon!?

posted an update about 2 months ago
🛸 I'm working on a pipeline for creating domain-specific ColPali fine-tuning datasets using a collection of UFO newsletters from the Internet Archive as a case study.

I will have a full notebook to share on Monday, but you can already take a look at the dataset here: davanstrien/ufo-ColPali
reacted to albertvillanova's post with ❤️👍 about 2 months ago
posted an update about 2 months ago
Almost ready: search for Hugging Face datasets on the Hub using the information shown in the dataset viewer preview!

Soon, you'll be able to find deep-cut datasets even if they don't have a full dataset card (you should still document your datasets!)

You can help improve this project by rating synthetic user search queries for hub datasets.

If you have a Hub login, you can start annotating in Argilla in < 5 seconds here: https://davanstrien-my-argilla.hf.space/dataset/1100a091-7f3f-4a6e-ad51-4e859abab58f/annotation-mode

I need to do some tidying, but I'll share all the code and in-progress datasets for this soon!
posted an update 2 months ago