Collections including paper arxiv:2409.18869
- HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems
  Paper • 2411.02959 • Published • 61
- GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details
  Paper • 2411.03047 • Published • 7
- MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D
  Paper • 2411.02336 • Published • 23
- GenXD: Generating Any 3D and 4D Scenes
  Paper • 2411.02319 • Published • 20

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 602
- CLEAR: Character Unlearning in Textual and Visual Modalities
  Paper • 2410.18057 • Published • 199
- Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders
  Paper • 2410.22366 • Published • 73
- Emu3: Next-Token Prediction is All You Need
  Paper • 2409.18869 • Published • 91

- Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages
  Paper • 2410.16153 • Published • 42
- AutoTrain: No-code training for state-of-the-art models
  Paper • 2410.15735 • Published • 57
- The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
  Paper • 2410.12787 • Published • 30
- LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks
  Paper • 2410.01744 • Published • 25

- Addition is All You Need for Energy-efficient Language Models
  Paper • 2410.00907 • Published • 144
- Emu3: Next-Token Prediction is All You Need
  Paper • 2409.18869 • Published • 91
- An accurate detection is not all you need to combat label noise in web-noisy datasets
  Paper • 2407.05528 • Published • 3
- Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP
  Paper • 2407.00402 • Published • 22

- Qwen2.5-Coder Technical Report
  Paper • 2409.12186 • Published • 136
- Attention Heads of Large Language Models: A Survey
  Paper • 2409.03752 • Published • 87
- Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
  Paper • 2409.02634 • Published • 89
- OmniGen: Unified Image Generation
  Paper • 2409.11340 • Published • 107