Large Language Models for Mathematical Reasoning: Progresses and Challenges Paper • 2402.00157 • Published Jan 31 • 1
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models Paper • 2404.05567 • Published Apr 8 • 10
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding Paper • 2402.16671 • Published Feb 26 • 26
Programming datasets Collection Web-scrapes of datasets to boost code performance • 4 items • Updated Oct 9 • 1
haiku Collection 🌸 A collection of synthetic datasets built to help open language models write better haikus using DPO • 3 items • Updated Jun 21 • 6
LLM Augmented LLMs: Expanding Capabilities through Composition Paper • 2401.02412 • Published Jan 4 • 36
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset Paper • 2309.04662 • Published Sep 9, 2023 • 22