Article Recipe: Preparing Multilingual Speech Datasets for TTS Training By PHBJT • 6 days ago • 13
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 3 days ago • 89
Article How to optimize your data labelling project with custom interfaces By burtenshaw • 25 days ago • 18
Article Synthetic dataset generation techniques: Self-Instruct By davanstrien • May 15 • 11
Llama3-ChatQA-1.5 Collection Llama3-ChatQA-1.5 models excel at conversational question answering (QA) and retrieval-augmented generation (RAG). • 6 items • Updated Oct 1 • 41
💻 Local SmolLMs Collection SmolLM models in MLC, ONNX and GGUF format for local applications + in-browser demos • 14 items • Updated Aug 20 • 44
Papers about model merging Collection referenced in the mergekit repo: https://github.com/cg123/mergekit • 4 items • Updated Feb 13 • 14
Article DuckDB: run SQL queries on 50,000+ datasets on the Hugging Face Hub Jun 7, 2023 • 4
SteerLM Collection A collection of models and datasets relating to SteerLM and HelpSteer. • 7 items • Updated Oct 1 • 14