Collections

Collections including paper arxiv:2409.12576

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6 • 25
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6 • 12
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7 • 38
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7 • 19

For Content Creator

Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era

Paper • 2305.06131 • Published May 10, 2023 • 2
Perpetual Humanoid Control for Real-time Simulated Avatars

Paper • 2305.06456 • Published May 10, 2023 • 1
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold

Paper • 2305.10973 • Published May 18, 2023 • 32
LDM3D: Latent Diffusion Model for 3D

Paper • 2305.10853 • Published May 18, 2023 • 10

Image-Gen Personalization

pOps: Photo-Inspired Diffusion Operators

Paper • 2406.01300 • Published Jun 3 • 16
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11 • 10
Interpreting the Weight Space of Customized Diffusion Models

Paper • 2406.09413 • Published Jun 13 • 18
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Paper • 2406.09162 • Published Jun 13 • 13

Video/Image/Gif/etc.

Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

Paper • 2402.17177 • Published Feb 27 • 88
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Paper • 2402.17485 • Published Feb 27 • 188
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks

Paper • 2403.00522 • Published Mar 1 • 44
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Paper • 2403.04692 • Published Mar 7 • 40

ibm/AttaQ

Viewer • Updated Jan 26 • 1.4k • 935 • 11
snorkelai/snorkel-curated-instruction-tuning

Preview • Updated Mar 11 • 108 • 8
corbyrosset/researchy_questions

Viewer • Updated Feb 29 • 96.4k • 1.48k • 24
argilla/ultrafeedback-binarized-preferences

Viewer • Updated Nov 30, 2023 • 63.6k • 270 • 66

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17 • 8
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18 • 15
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19 • 58
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24 • 72
