Thinking LLMs: General Instruction Following with Thought Generation Paper • 2410.10630 • Published Oct 14 • 17
view article Article Releasing the largest multilingual open pretraining dataset By Pclanglais • 17 days ago • 96
RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning Paper • 2410.02089 • Published Oct 2 • 12
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning Paper • 2410.02884 • Published Oct 3 • 50
view article Article MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical IR By abhinand • Oct 20 • 31
Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise Paper • 2410.03017 • Published Oct 3 • 25
OpenMath-2 Collection A collection of models and datasets introduced in "OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data" • 7 items • Updated Oct 29 • 13
Llama 3.2 3B & 1B GGUF Quants Collection Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models. • 4 items • Updated Sep 26 • 46
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19 • 135
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval Paper • 2409.10516 • Published Sep 16 • 39