rubbyninja's Collections: advancing research
STaR: Bootstrapping Reasoning With Reasoning • arXiv:2203.14465 • 8 upvotes
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models • arXiv:2401.06066 • 44 upvotes
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model • arXiv:2405.04434 • 14 upvotes
Prompt Cache: Modular Attention Reuse for Low-Latency Inference • arXiv:2311.04934 • 28 upvotes
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking • arXiv:2403.09629 • 75 upvotes
Let's Verify Step by Step • arXiv:2305.20050 • 10 upvotes
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling • arXiv:2407.21787 • 12 upvotes
Solving math word problems with process- and outcome-based feedback • arXiv:2211.14275 • 8 upvotes
Training Language Models to Self-Correct via Reinforcement Learning • arXiv:2409.12917 • 136 upvotes
Aligning Machine and Human Visual Representations across Abstraction Levels • arXiv:2409.06509 • 1 upvote
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models • arXiv:2410.05229 • 22 upvotes
nGPT: Normalized Transformer with Representation Learning on the Hypersphere • arXiv:2410.01131 • 9 upvotes
Consistency Models • arXiv:2303.01469 • 8 upvotes
Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models • arXiv:2410.11081 • 19 upvotes
Scaling Laws for Precision • arXiv:2411.04330 • 7 upvotes
The Surprising Effectiveness of Test-Time Training for Abstract Reasoning • arXiv:2411.07279 • 3 upvotes
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts • arXiv:1909.13231 • 1 upvote
Better & Faster Large Language Models via Multi-token Prediction • arXiv:2404.19737 • 73 upvotes
O1 Replication Journey: A Strategic Progress Report -- Part 1 • arXiv:2410.18982 • 2 upvotes
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? • arXiv:2411.16489 • 41 upvotes
ReFT: Reasoning with Reinforced Fine-Tuning • arXiv:2401.08967 • 29 upvotes
arXiv:2408.02666 • 27 upvotes
arXiv:2412.09764 • 3 upvotes
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention • arXiv:2404.07143 • 105 upvotes
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context • arXiv:1901.02860 • 3 upvotes
Large Concept Models: Language Modeling in a Sentence Representation Space • arXiv:2412.08821 • 13 upvotes
Movie Gen: A Cast of Media Foundation Models • arXiv:2410.13720 • 91 upvotes