PUMA: Empowering Unified MLLM with Multi-granular Visual Generation • Paper • 2410.13861 • Published 21 days ago • 53
Improving Long-Text Alignment for Text-to-Image Diffusion Models • Paper • 2410.11817 • Published 23 days ago • 14
ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification and KV Cache Compression • Paper • 2410.08584 • Published 28 days ago • 11
Semantic Score Distillation Sampling for Compositional Text-to-3D Generation • Paper • 2410.09009 • Published 27 days ago • 13