Yacine Jernite

yjernite

https://yjernite.github.io/

AI & ML interests

Technical, community, and regulatory tools of AI governance @HuggingFace

Recent Activity

upvoted a collection 1 day ago

OLMo 2

liked a Space 1 day ago

PR-Puppets/PR-Puppet-Sora

upvoted an article 1 day ago

Let’s make a generation of amazing image generation models

Articles

EU Training Data Transparency: A Proposal for a Sufficiently Detailed Summary 📑📚🖼️🇪🇺

Jul 3

• 8

Ethics and Society Newsletter #6: Building Better AI: The Importance of Data Quality

Jun 24

• 33

📚 Training Data Transparency in AI: Tools, Trends, and Policy Recommendations 🗳️

Dec 5, 2023

• 1

Introducing IDEFICS: An Open Reproduction of State-of-the-art Visual Language Model

Aug 22, 2023

• 27

AI Policy @🤗: Open ML Considerations in the EU AI Act

Jul 24, 2023

• 2

AI Policy @🤗: Response to the U.S. NTIA's Request for Comment on AI Accountability

Jun 20, 2023

Hugging Face Selected for the French Data Protection Agency Enhanced Support Program

May 15, 2023

Ethics and Society Newsletter #3: Ethical Openness at Hugging Face

Mar 30, 2023

Ethics and Society Newsletter #2: Let's talk about bias!

Dec 15, 2022

Putting ethical principles at the core of research lifecycle

May 19, 2022

Introducing the Data Measurements Tool: an Interactive Tool for Looking at Datasets

Nov 29, 2021

Organizations

yjernite's activity

upvoted a collection 1 day ago

OLMo 2

Collection

Artifacts for the second set of OLMo models. • 17 items • Updated about 2 hours ago • 27

upvoted an article 1 day ago

Article

Let’s make a generation of amazing image generation models

1 day ago • 29

upvoted 2 collections 5 days ago

Dataset Exploration

Collection

3 items • Updated 17 days ago • 4

Dataset transformation, preparation and edition

Collection

2 items • Updated 5 days ago • 5

upvoted an article 14 days ago

Article

Releasing the largest multilingual open pretraining dataset

14 days ago • 95

upvoted 2 collections 27 days ago

SmolLM2

Collection

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated about 10 hours ago • 181

2024 Interconnects Artifacts

Collection

Models & datasets mentioned in the bottom section of posts! • 249 items • Updated 2 minutes ago • 3

upvoted an article 27 days ago

Article

Detoxifying the Commons

27 days ago • 6

upvoted 2 collections about 1 month ago

FLAIR models : landcover semantic segmentation

Collection

The FLAIR models are a collection of semantic segmentation models initially developed to classify land cover on very high resolution aerial imagery. • 9 items • Updated Jun 19 • 10

Pangea

Collection

A Fully Open Multilingual Multimodal LLM for 39 Languages • 18 items • Updated 26 days ago • 17

upvoted an article about 2 months ago

Article

Democratization of AI, Open Source, and AI Auditing: Thoughts from the DisinfoCon Panel in Berlin

Oct 8 • 5

upvoted a collection 2 months ago

Moshi v0.1 Release

Collection

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 219

upvoted an article 3 months ago

Article

Getty Images Brings High-Quality, Commercially Safe Dataset to Hugging Face

Sep 6 • 16

upvoted a collection 3 months ago

Qwen2-VL

Collection

Vision-language model series based on Qwen2 • 15 items • Updated Sep 18 • 158

upvoted a paper 3 months ago

The Future of Open Human Feedback

Paper • 2408.16961 • Published Aug 15 • 20

upvoted an article 3 months ago

Article

The Environmental Impacts of AI -- Primer

Sep 3 • 31

upvoted 2 papers 3 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22 • 121

LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs

Paper • 2408.13467 • Published Aug 24 • 24

upvoted a collection 3 months ago

Multi-Vector Retrievers

Collection

2 items • Updated Aug 20 • 3

upvoted an article 3 months ago

Article

The 5 Most Under-Rated Tools on Hugging Face

Aug 22

• 85

Articles

Public Policy at Hugging Face

Policy Questions Blog 1: AI Data Transparency Remarks for NAIAC Panel 📚🔍⚖️

AI Watermarking 101: Tools and Techniques