-
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
Paper • 2408.07199 • Published • 20 -
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Paper • 2403.09629 • Published • 72 -
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
Paper • 2406.12050 • Published • 18 -
Let's Verify Step by Step
Paper • 2305.20050 • Published • 9
Collections
Discover the best community collections!
Collections including paper arxiv:2410.02884
-
Let's Verify Step by Step
Paper • 2305.20050 • Published • 9 -
LLM Critics Help Catch LLM Bugs
Paper • 2407.00215 • Published -
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Paper • 2407.21787 • Published • 3 -
Generative Verifiers: Reward Modeling as Next-Token Prediction
Paper • 2408.15240 • Published • 13
-
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Paper • 2403.09629 • Published • 72 -
V-STaR: Training Verifiers for Self-Taught Reasoners
Paper • 2402.06457 • Published • 8 -
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
Paper • 2406.12050 • Published • 18 -
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
Paper • 2408.07199 • Published • 20
-
nvidia/Nemotron-4-340B-Base
Updated • 429 • 143 -
cognitivecomputations/dolphin-2.9.3-mistral-7B-32k
Text Generation • Updated • 4.87k • 43 -
cognitivecomputations/dolphin-2.9.3-Yi-1.5-34B-32k
Text Generation • Updated • 2.81k • 17 -
cognitivecomputations/dolphin-2.9-llama3-8b
Text Generation • Updated • 3.44k • 417
-
Instruction Pre-Training: Language Models are Supervised Multitask Learners
Paper • 2406.14491 • Published • 85 -
Better & Faster Large Language Models via Multi-token Prediction
Paper • 2404.19737 • Published • 73 -
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 67 -
The Prompt Report: A Systematic Survey of Prompting Techniques
Paper • 2406.06608 • Published • 53
-
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
Paper • 2312.08578 • Published • 16 -
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
Paper • 2312.08583 • Published • 9 -
Vision-Language Models as a Source of Rewards
Paper • 2312.09187 • Published • 11 -
StemGen: A music generation model that listens
Paper • 2312.08723 • Published • 47