Thomas Wolf PRO

thomwolf

https://thomwolf.io

AI & ML interests

NLP and open-source :-)

Recent Activity

liked a model 1 day ago

AIDC-AI/Marco-o1

liked a model 1 day ago

nvidia/Hymba-1.5B-Instruct

Reacted to davanstrien's post with ❤️ 1 day ago

First dataset for the new Hugging Face Bluesky community organisation: https://huggingface.co/datasets/bluesky-community/one-million-bluesky-posts 🦋 📊 1M public posts from Bluesky's firehose API 🔍 Includes text, metadata, and language predictions 🔬 Perfect to experiment with using ML for Bluesky 🤗 Excited to see people build more open tools for a more open social media platform!

Articles

FineVideo: behind the scenes

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

A failed experiment: Infini-Attention, and why we should keep trying?

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Constitutional AI with Open LLMs

Open LLM Leaderboard: DROP deep dive

What's going on with the Open LLM Leaderboard?

Can foundation models label data like humans?

thomwolf's activity

upvoted a paper 3 days ago

Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations

Paper • 2411.00640 • Published 26 days ago • 3

upvoted an article 9 days ago

Article

The Rise of Agentic Data Generation

Jul 15

• 78

upvoted a paper 19 days ago

BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays

Paper • 2410.21969 • Published 30 days ago • 9

upvoted a paper 23 days ago

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Paper • 2411.02265 • Published 23 days ago • 24

upvoted an article about 1 month ago

Article

Breaking Barriers: The Critical Role of Art and Design in Advancing AI Capabilities

Jan 15

• 3

upvoted a collection about 1 month ago

LoLCATS

Linearizing LLMs with high quality and efficiency. We linearize the full Llama 3.1 model family -- 8b, 70b, 405b -- for the first time! • 4 items • Updated Oct 14 • 14

upvoted a paper about 1 month ago

DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?

Paper • 2409.07703 • Published Sep 12 • 66

upvoted a paper about 2 months ago

EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published Sep 24 • 24

upvoted 2 papers 3 months ago

Diffusion Policy Policy Optimization

Paper • 2409.00588 • Published Sep 1 • 19

Affordance-based Robot Manipulation with Flow Matching

Paper • 2409.01083 • Published Sep 2 • 18

upvoted an article 3 months ago

Article

The 5 Most Under-Rated Tools on Hugging Face

Aug 22

• 85

upvoted 2 collections 4 months ago

InternLM2.5

14 items • Updated Sep 14 • 70

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 201

upvoted 2 articles 4 months ago

Article

Announcing BigCodeBench-Hard, and More

Jul 24

• 10

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 273

upvoted a collection 4 months ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Sep 25 • 627

upvoted 4 papers 5 months ago

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3 • 92

OLMES: A Standard for Language Model Evaluations

Paper • 2406.08446 • Published Jun 12 • 2

DataComp-LM: In search of the next generation of training sets for language models

Paper • 2406.11794 • Published Jun 17 • 49

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25 • 86