Clémentine Fourrier

clefourrier

http://clefourrier.github.io

Recent Activity

updated a collection about 8 hours ago

Leaderboards and benchmarks ✨

liked a Space about 8 hours ago

MLSB/leaderboard2024

updated a Space about 11 hours ago

science/README

Articles

Introduction to the Open Leaderboard for Japanese LLMs

Letting Large Models Debate: The First Multilingual LLM Debate Competition

Judge Arena: Benchmarking LLMs as Evaluators

Introducing the Open FinLLM Leaderboard

BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models

Let's talk about LLM evaluation

Introducing the Open Arabic LLM Leaderboard

Introducing the Open Leaderboard for Hebrew LLMs!

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

Improving Prompt Consistency with Structured Generations

Introducing the Open Chain of Thought Leaderboard

The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare

Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs

Introducing the Chatbot Guardrails Arena

Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes?

TTS Arena: Benchmarking Text-to-Speech Models in the Wild

Introducing the Red-Teaming Resistance Leaderboard

Introducing the Open Ko-LLM Leaderboard: Leading the Korean LLM Evaluation Ecosystem

NPHardEval Leaderboard: Unveiling the Reasoning Abilities of Large Language Models through Complexity Classes and Dynamic Updates

Introducing the Enterprise Scenarios Leaderboard: a Leaderboard for Real World Use Cases

The Hallucinations Leaderboard, an Open Effort to Measure Hallucinations in Large Language Models

A guide to setting up your own Hugging Face leaderboard: an end-to-end example with Vectara's hallucination leaderboard

2023, year of open LLMs

Open LLM Leaderboard: DROP deep dive

Overview of natively supported quantization schemes in 🤗 Transformers

What's going on with the Open LLM Leaderboard?

Introduction to Graph Machine Learning

clefourrier's activity

upvoted an article 5 days ago

Article

Halo: Open Source Health Tracking with Wearables

8 days ago

• 83

upvoted an article about 1 month ago

Article

Releasing Outlines-core 0.1.0: structured generation in Rust and Python

Oct 22

• 41

upvoted 2 articles about 2 months ago

Article

Democratization of AI, Open Source, and AI Auditing: Thoughts from the DisinfoCon Panel in Berlin

Oct 8

• 5

Article

A Short Summary of Chinese AI Global Expansion

Oct 1

• 15

upvoted a collection 2 months ago

Molmo

Artifacts for open multimodal language models. • 5 items • Updated about 2 hours ago • 274

upvoted an article 4 months ago

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16

• 271

upvoted an article 5 months ago

Article

Our Transformers Code Agent beats the GAIA benchmark!

Jul 1

• 46

upvoted a paper 5 months ago

MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures

Paper • 2406.06565 • Published Jun 3 • 9

upvoted a collection 5 months ago

🎭 Avatars

The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 69 items • Updated Oct 21 • 76

upvoted a paper 5 months ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25 • 86

upvoted 2 articles 6 months ago

Article

Space secrets security update

May 31

• 50

Article

Evaling llm-jp-eval (evals are hard)

May 18

• 4

upvoted 2 articles 7 months ago

Article

Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent

Apr 22

• 78

Article

LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!)

Apr 24

• 59

upvoted a collection 7 months ago

Granite Code Models

A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated 23 days ago • 178

upvoted a paper 7 months ago

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3 • 100

upvoted 2 articles 7 months ago

Article

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

May 3

• 13

Article

Improving Prompt Consistency with Structured Generations

Apr 30

• 57

upvoted 2 articles 8 months ago

Article

A guide to setting up your own Hugging Face leaderboard: an end-to-end example with Vectara's hallucination leaderboard

Jan 12

• 6

Article

An Introduction to AI Secure LLM Safety Leaderboard

Jan 26

• 5