- Training Compute-Optimal Large Language Models
  Paper • 2203.15556 • Published • 10
- Perspectives on the State and Future of Deep Learning -- 2023
  Paper • 2312.09323 • Published • 5
- MobileSAMv2: Faster Segment Anything to Everything
  Paper • 2312.09579 • Published • 20
- Point Transformer V3: Simpler, Faster, Stronger
  Paper • 2312.10035 • Published • 17
Collections including paper arxiv:2312.09323
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 40
- Perspectives on the State and Future of Deep Learning -- 2023
  Paper • 2312.09323 • Published • 5
- Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
  Paper • 2405.15071 • Published • 37
- Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning
  Paper • 2407.10718 • Published • 17

- Learning Vision from Models Rivals Learning Vision from Data
  Paper • 2312.17742 • Published • 15
- Unsupervised Universal Image Segmentation
  Paper • 2312.17243 • Published • 19
- Perspectives on the State and Future of Deep Learning -- 2023
  Paper • 2312.09323 • Published • 5
- Vision-Language Models as a Source of Rewards
  Paper • 2312.09187 • Published • 11

- Levels of AGI: Operationalizing Progress on the Path to AGI
  Paper • 2311.02462 • Published • 33
- Ultra-Long Sequence Distributed Transformer
  Paper • 2311.02382 • Published • 2
- A Survey on Language Models for Code
  Paper • 2311.07989 • Published • 21
- GRIM: GRaph-based Interactive narrative visualization for gaMes
  Paper • 2311.09213 • Published • 12

- Table-GPT: Table-tuned GPT for Diverse Table Tasks
  Paper • 2310.09263 • Published • 39
- A Zero-Shot Language Agent for Computer Control with Structured Reflection
  Paper • 2310.08740 • Published • 14
- The Consensus Game: Language Model Generation via Equilibrium Search
  Paper • 2310.09139 • Published • 12
- PaLI-3 Vision Language Models: Smaller, Faster, Stronger
  Paper • 2310.09199 • Published • 24