Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2408.02629

Video-Gen Dataset

VidGen-1M: A Large-Scale Dataset for Text-to-video Generation

Paper • 2408.02629 • Published Aug 5 • 13
Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling

Paper • 2408.03695 • Published Aug 7 • 12

about 23 hours ago

MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels

Paper • 2405.07526 • Published May 13 • 17
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach

Paper • 2405.15613 • Published May 24 • 13
A Touch, Vision, and Language Dataset for Multimodal Alignment

Paper • 2402.13232 • Published Feb 20 • 13
How Do Large Language Models Acquire Factual Knowledge During Pretraining?

Paper • 2406.11813 • Published Jun 17 • 30

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

Paper • 2401.09985 • Published Jan 18 • 15
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects

Paper • 2401.09962 • Published Jan 18 • 8
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution

Paper • 2401.10404 • Published Jan 18 • 10
ActAnywhere: Subject-Aware Video Background Generation

Paper • 2401.10822 • Published Jan 19 • 13

video generation

Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation

Paper • 2312.04483 • Published Dec 7, 2023 • 7
AnimateZero: Video Diffusion Models are Zero-Shot Image Animators

Paper • 2312.03793 • Published Dec 6, 2023 • 17
Photorealistic Video Generation with Diffusion Models

Paper • 2312.06662 • Published Dec 11, 2023 • 23
PEEKABOO: Interactive Video Generation via Masked-Diffusion

Paper • 2312.07509 • Published Dec 12, 2023 • 7

Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning

Paper • 2311.10709 • Published Nov 17, 2023 • 24
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

Paper • 2405.12970 • Published May 21 • 22
FIFO-Diffusion: Generating Infinite Videos from Text without Training

Paper • 2405.11473 • Published May 19 • 53
stabilityai/stable-diffusion-3-medium

Text-to-Image • Updated Aug 12 • 60k • 4.56k

Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 77
An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

Paper • 2309.09958 • Published Sep 18, 2023 • 18
Noise-Aware Training of Layout-Aware Language Models

Paper • 2404.00488 • Published Mar 30 • 7
Streaming Dense Video Captioning

Paper • 2404.01297 • Published Apr 1 • 11

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs