Clémentine Fourrier

clefourrier

http://clefourrier.github.io

Recent Activity

updated a collection about 8 hours ago

Leaderboards and benchmarks ✨

liked a Space about 8 hours ago

MLSB/leaderboard2024

updated a Space about 11 hours ago

science/README

Articles

Introduction to the Open Leaderboard for Japanese LLMs

Letting Large Models Debate: The First Multilingual LLM Debate Competition

Judge Arena: Benchmarking LLMs as Evaluators

Introducing the Open FinLLM Leaderboard

BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models

Let's talk about LLM evaluation

Introducing the Open Arabic LLM Leaderboard

Introducing the Open Leaderboard for Hebrew LLMs!

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

Improving Prompt Consistency with Structured Generations

Introducing the Open Chain of Thought Leaderboard

The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare

Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs

Introducing the Chatbot Guardrails Arena

Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes?

TTS Arena: Benchmarking Text-to-Speech Models in the Wild

Introducing the Red-Teaming Resistance Leaderboard

Introducing the Open Ko-LLM Leaderboard: Leading the Korean LLM Evaluation Ecosystem

NPHardEval Leaderboard: Unveiling the Reasoning Abilities of Large Language Models through Complexity Classes and Dynamic Updates

Introducing the Enterprise Scenarios Leaderboard: a Leaderboard for Real World Use Cases

The Hallucinations Leaderboard, an Open Effort to Measure Hallucinations in Large Language Models

A guide to setting up your own Hugging Face leaderboard: an end-to-end example with Vectara's hallucination leaderboard

2023, year of open LLMs

Open LLM Leaderboard: DROP deep dive

Overview of natively supported quantization schemes in 🤗 Transformers

What's going on with the Open LLM Leaderboard?

Introduction to Graph Machine Learning

Organizations

clefourrier's activity

liked a Space about 8 hours ago

Leaderboard2024

liked a Space 8 days ago

Open Japanese LLM Leaderboard

liked 3 datasets 8 days ago

juletxara/mgsm

Viewer • Updated May 9, 2023 • 2.84k • 3.29k • 24

KbsdJames/Omni-MATH

Viewer • Updated Oct 12 • 4.43k • 726 • 58

math-ai/TemplateGSM

Viewer • Updated about 17 hours ago • 11.6M • 624 • 12

liked a Space 9 days ago

Japanese Chatbot Arena Leaderboard

liked a Space 10 days ago

Persian LLM Leaderboard

liked a Space 12 days ago

Judge Arena

liked a Space 13 days ago

Open LLM Leaderboard

Track, rank and evaluate open LLMs and chatbots

liked a Space 21 days ago

Giskard Evaluator

liked 2 models about 1 month ago

CohereForAI/aya-expanse-32b

Text Generation • Updated 27 days ago • 32.3k • 172

CohereForAI/aya-expanse-8b

Text Generation • Updated 28 days ago • 48.3k • 288

liked a dataset about 1 month ago

TAUR-Lab/MuSR

Viewer • Updated May 21 • 756 • 8.02k • 10

liked a Space about 1 month ago

Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks

liked a model about 1 month ago

HuggingFaceTB/SmolLM-1.7B

Text Generation • Updated Oct 16 • 9.95k • 163

liked a Space about 1 month ago

GPU Poor LLM Arena

Compact LLM Battle Arena: Frugal AI Face-Off!

liked a dataset about 1 month ago

Weyaxi/huggingface-leaderboard

Updated about 12 hours ago • 3.62k • 9

liked a Space about 1 month ago

Open LLM Leaderboard Model Comparator

Compare Open LLM Leaderboard results

liked 2 Spaces about 2 months ago

LLM Performance Leaderboard

Leaderboard