Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Paper • 2412.04424 • Published 22 days ago • 56
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective Paper • 2410.23743 • Published Oct 31 • 59
Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion Paper • 2410.13674 • Published Oct 17 • 15
BenTo: Benchmark Task Reduction with In-Context Transferability Paper • 2410.13804 • Published Oct 17 • 19
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free Paper • 2410.10814 • Published Oct 14 • 48
WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents Paper • 2410.07484 • Published Oct 9 • 48
Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA Paper • 2410.06524 • Published Oct 9 • 4
AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models Paper • 2406.10900 • Published Jun 16 • 11
Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld Paper • 2311.16714 • Published Nov 28, 2023 • 1
Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning Paper • 2310.11716 • Published Oct 18, 2023 • 5
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models Paper • 2310.14566 • Published Oct 23, 2023 • 25
InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models Paper • 2306.03082 • Published Jun 5, 2023 • 5