O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published 6 days ago
Agent Attention: On the Integration of Softmax and Linear Attention Paper • 2312.08874 • Published Dec 14, 2023