Apply for community grant: Academic project (gpu)
#1, opened by maxin-cn
Latte is a latent diffusion transformer proposed as a backbone for modeling different modalities; the model here is trained for text-to-video generation. It achieves state-of-the-art performance on four standard video generation benchmarks: FaceForensics, SkyTimelapse, UCF101, and Taichi-HD.