-
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
Paper • 2407.15841 • Published • 39 -
Stable Audio Open
Paper • 2407.14358 • Published • 23 -
PlacidDreamer: Advancing Harmony in Text-to-3D Generation
Paper • 2407.13976 • Published • 5 -
Efficient Audio Captioning with Encoder-Level Knowledge Distillation
Paper • 2407.14329 • Published • 4
Collections
Discover the best community collections!
Collections including paper arxiv:2407.13976
-
YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals
Paper • 2406.16273 • Published • 40 -
Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images
Paper • 2407.06191 • Published • 10 -
PlacidDreamer: Advancing Harmony in Text-to-3D Generation
Paper • 2407.13976 • Published • 5
-
ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models
Paper • 2403.01807 • Published • 7 -
TripoSR: Fast 3D Object Reconstruction from a Single Image
Paper • 2403.02151 • Published • 12 -
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Paper • 2403.01779 • Published • 28 -
MagicClay: Sculpting Meshes With Generative Neural Fields
Paper • 2403.02460 • Published • 6
-
TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
Paper • 2401.09416 • Published • 9 -
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
Paper • 2401.10171 • Published • 12 -
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
Paper • 2311.09217 • Published • 21 -
GALA: Generating Animatable Layered Assets from a Single Scan
Paper • 2401.12979 • Published • 6