Collections

Collections including paper arxiv:2409.18869

BAAI/Emu3-Gen

Any-to-Any • Updated 27 days ago • 16.1k • 187
BAAI/Emu3-Chat

Text Generation • Updated 26 days ago • 4.75k • 71
BAAI/Emu3-VisionTokenizer

Feature Extraction • Updated Oct 8 • 24.1k • 51
BAAI/Emu3-Stage1

Any-to-Any • Updated 27 days ago • 6.08k • 25

Interesting Papers

HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems

Paper • 2411.02959 • Published 14 days ago • 61
GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details

Paper • 2411.03047 • Published 14 days ago • 7
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D

Paper • 2411.02336 • Published 15 days ago • 23
GenXD: Generating Any 3D and 4D Scenes

Paper • 2411.02319 • Published 15 days ago • 20

Browse Paper Collections

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 602
CLEAR: Character Unlearning in Textual and Visual Modalities

Paper • 2410.18057 • Published 27 days ago • 199
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published 22 days ago • 73
Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27 • 91

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

Paper • 2410.16153 • Published 29 days ago • 42
AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published 29 days ago • 57
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

Paper • 2410.12787 • Published Oct 16 • 30
LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks

Paper • 2410.01744 • Published Oct 2 • 25

MultimodalEmbeddings

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27 • 91

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27 • 91
Pixtral 12B

Paper • 2410.07073 • Published Oct 9 • 60

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1 • 144
Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27 • 91
An accurate detection is not all you need to combat label noise in web-noisy datasets

Paper • 2407.05528 • Published Jul 8 • 3
Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP

Paper • 2407.00402 • Published Jun 29 • 22

📑Trending Papers - September 9⃣️

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18 • 136
Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5 • 87
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Paper • 2409.02634 • Published Sep 4 • 89
OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17 • 107

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27 • 91

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27 • 91
Harnessing Webpage UIs for Text-Rich Visual Understanding

Paper • 2410.13824 • Published Oct 17 • 29
