Collections
Discover the best community collections!
Collections including paper arxiv:2408.02629
-
MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels
Paper • 2405.07526 • Published • 17 -
Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach
Paper • 2405.15613 • Published • 13 -
A Touch, Vision, and Language Dataset for Multimodal Alignment
Paper • 2402.13232 • Published • 13 -
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Paper • 2406.11813 • Published • 30
-
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
Paper • 2401.09985 • Published • 15 -
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
Paper • 2401.09962 • Published • 8 -
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
Paper • 2401.10404 • Published • 10 -
ActAnywhere: Subject-Aware Video Background Generation
Paper • 2401.10822 • Published • 13
-
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation
Paper • 2312.04483 • Published • 7 -
AnimateZero: Video Diffusion Models are Zero-Shot Image Animators
Paper • 2312.03793 • Published • 17 -
Photorealistic Video Generation with Diffusion Models
Paper • 2312.06662 • Published • 23 -
PEEKABOO: Interactive Video Generation via Masked-Diffusion
Paper • 2312.07509 • Published • 7
-
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning
Paper • 2311.10709 • Published • 24 -
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control
Paper • 2405.12970 • Published • 22 -
FIFO-Diffusion: Generating Infinite Videos from Text without Training
Paper • 2405.11473 • Published • 53 -
stabilityai/stable-diffusion-3-medium
Text-to-Image • Updated • 60k • 4.56k
-
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 77 -
An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
Paper • 2309.09958 • Published • 18 -
Noise-Aware Training of Layout-Aware Language Models
Paper • 2404.00488 • Published • 7 -
Streaming Dense Video Captioning
Paper • 2404.01297 • Published • 11