Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback Paper • 2501.03916 • Published 8 days ago • 14
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 7 days ago • 78
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 8 days ago • 75
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 6 days ago • 65
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains Paper • 2501.05707 • Published 6 days ago • 16
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training Paper • 2501.06842 • Published 3 days ago • 13
Diving into Self-Evolving Training for Multimodal Reasoning Paper • 2412.17451 • Published 23 days ago • 42
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published 23 days ago • 39
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing Paper • 2412.14711 • Published 27 days ago • 15
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs Paper • 2412.21187 • Published 16 days ago • 35
ProgCo: Program Helps Self-Correction of Large Language Models Paper • 2501.01264 • Published 13 days ago • 24