-
Communicative Agents for Software Development
Paper • 2307.07924 • Published • 2 -
Self-Refine: Iterative Refinement with Self-Feedback
Paper • 2303.17651 • Published • 2 -
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Paper • 2312.10003 • Published • 34 -
ReAct: Synergizing Reasoning and Acting in Language Models
Paper • 2210.03629 • Published • 13
Collections
Discover the best community collections!
Collections including paper arxiv:2404.07503
-
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 82 -
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
Paper • 2404.02893 • Published • 19 -
Best Practices and Lessons Learned on Synthetic Data for Language Models
Paper • 2404.07503 • Published • 29 -
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data
Paper • 2404.12195 • Published • 11
-
ORPO: Monolithic Preference Optimization without Reference Model
Paper • 2403.07691 • Published • 59 -
sDPO: Don't Use Your Data All at Once
Paper • 2403.19270 • Published • 38 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 46 -
Best Practices and Lessons Learned on Synthetic Data for Language Models
Paper • 2404.07503 • Published • 29
-
How Far Are We from Intelligent Visual Deductive Reasoning?
Paper • 2403.04732 • Published • 18 -
MoAI: Mixture of All Intelligence for Large Language and Vision Models
Paper • 2403.07508 • Published • 75 -
DragAnything: Motion Control for Anything using Entity Representation
Paper • 2403.07420 • Published • 12 -
Learning and Leveraging World Models in Visual Representation Learning
Paper • 2403.00504 • Published • 29
-
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 8 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 94 -
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
Paper • 2402.09320 • Published • 6 -
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 109
-
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
Paper • 2402.10379 • Published • 29 -
Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping
Paper • 1709.07857 • Published • 2 -
Simple synthetic data reduces sycophancy in large language models
Paper • 2308.03958 • Published • 21 -
Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities
Paper • 2308.12966 • Published • 6
-
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Paper • 2310.09199 • Published • 24 -
A Zero-Shot Language Agent for Computer Control with Structured Reflection
Paper • 2310.08740 • Published • 14 -
Personality Traits in Large Language Models
Paper • 2307.00184 • Published • 20 -
An Emulator for Fine-Tuning Large Language Models using Small Language Models
Paper • 2310.12962 • Published • 14
-
An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
Paper • 2309.09958 • Published • 18 -
TextBind: Multi-turn Interleaved Multimodal Instruction-following
Paper • 2309.08637 • Published • 7 -
Improved Baselines with Visual Instruction Tuning
Paper • 2310.03744 • Published • 36 -
A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
Paper • 2312.08578 • Published • 16