Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations Paper • 2411.00640 • Published 26 days ago • 3
BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays Paper • 2410.21969 • Published 30 days ago • 9
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent Paper • 2411.02265 • Published 23 days ago • 24
Article Breaking Barriers: The Critical Role of Art and Design in Advancing AI Capabilities By fffiloni • Jan 15 • 3
LoLCATS Collection Linearizing LLMs with high quality and efficiency. We linearize the full Llama 3.1 model family -- 8b, 70b, 405b -- for the first time! • 4 items • Updated Oct 14 • 14
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? Paper • 2409.07703 • Published Sep 12 • 66
Affordance-based Robot Manipulation with Flow Matching Paper • 2409.01083 • Published Sep 2 • 18
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 201
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Sep 25 • 627
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Paper • 2407.03320 • Published Jul 3 • 92
DataComp-LM: In search of the next generation of training sets for language models Paper • 2406.11794 • Published Jun 17 • 49
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25 • 86