Structured 3D Latents for Scalable and Versatile 3D Generation
Abstract
We introduce a novel 3D generation method for versatile and high-quality 3D asset creation. The cornerstone is a unified Structured LATent (SLAT) representation which allows decoding to different output formats, such as Radiance Fields, 3D Gaussians, and meshes. This is achieved by integrating a sparsely-populated 3D grid with dense multiview visual features extracted from a powerful vision foundation model, comprehensively capturing both structural (geometry) and textural (appearance) information while maintaining flexibility during decoding. We employ rectified flow transformers tailored for SLAT as our 3D generation models and train models with up to 2 billion parameters on a large 3D asset dataset of 500K diverse objects. Our model generates high-quality results with text or image conditions, significantly surpassing existing methods, including recent ones at similar scales. We showcase flexible output format selection and local 3D editing capabilities which were not offered by previous models. Code, model, and data will be released.
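To make the two core ideas concrete, here is a minimal, illustrative sketch of what a SLAT-style latent and rectified flow sampling could look like. All names, shapes, and the Euler solver below are assumptions for illustration, not the paper's implementation: a SLAT is treated as a sparse set of active voxel coordinates with per-voxel latent features, and generation integrates a learned velocity field from noise to data.

```python
import torch

# Illustrative SLAT-like container (names and shapes are assumptions):
# a sparse set of active voxel coordinates in an R^3 grid, each carrying
# a dense latent feature vector distilled from multiview visual features.
class StructuredLatent:
    def __init__(self, coords: torch.Tensor, feats: torch.Tensor):
        # coords: (N, 3) integer indices of active voxels
        # feats:  (N, C) per-voxel latent features
        assert coords.shape[0] == feats.shape[0]
        self.coords = coords
        self.feats = feats

# Rectified flow sampling: integrate a learned velocity field from noise
# (t = 0) toward data (t = 1) with a plain Euler solver. `model` stands in
# for a SLAT-conditioned flow transformer taking the text/image condition.
@torch.no_grad()
def sample_rectified_flow(model, coords, cond, channels=8, steps=50, device="cpu"):
    x = torch.randn(coords.shape[0], channels, device=device)  # start from Gaussian noise
    ts = torch.linspace(0.0, 1.0, steps + 1, device=device)
    for i in range(steps):
        t, t_next = ts[i], ts[i + 1]
        t_batch = torch.full((x.shape[0],), float(t), device=device)
        v = model(x, coords, t_batch, cond)   # predicted velocity at time t
        x = x + (t_next - t) * v              # Euler step along the flow
    return StructuredLatent(coords, x)

# Runnable toy example with a stand-in velocity model; a real SLAT backbone
# would apply sparse transformer blocks over the active voxels.
if __name__ == "__main__":
    dummy_model = lambda x, coords, t, cond: torch.zeros_like(x)
    coords = torch.randint(0, 64, (1024, 3))
    slat = sample_rectified_flow(dummy_model, coords, cond=None, steps=10)
    print(slat.coords.shape, slat.feats.shape)
```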
Community
TRELLIS is a large 3D asset generation model. It takes text or image prompts and generates high-quality 3D assets in various formats, such as Radiance Fields, 3D Gaussians, and meshes. Its cornerstones are a unified Structured LATent (SLAT) representation that allows decoding to different output formats, and rectified flow transformers tailored for SLAT that serve as powerful backbones. We provide large-scale pre-trained models with up to 2 billion parameters, trained on a large 3D asset dataset of 500K diverse objects. TRELLIS significantly surpasses existing methods, including recent ones at similar scales, and offers flexible output format selection and local 3D editing capabilities that were not available in previous models.
Project Page: https://trellis3d.github.io
Code: https://github.com/Microsoft/TRELLIS
Demo: https://huggingface.co/spaces/JeffreyXiang/TRELLIS
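A minimal usage sketch based on the released repository; the pipeline class, checkpoint ID, and output keys below are assumptions drawn from the project README and may differ from the current API.

```python
# Hypothetical usage sketch -- class names, model ID, and output keys are
# assumptions; consult the TRELLIS repository README for the exact API.
from PIL import Image
from trellis.pipelines import TrellisImageTo3DPipeline

pipeline = TrellisImageTo3DPipeline.from_pretrained("JeffreyXiang/TRELLIS-image-large")
pipeline.cuda()

image = Image.open("input.png")
outputs = pipeline.run(image, seed=1)

# One SLAT decodes to several output formats:
gaussians = outputs["gaussian"][0]            # 3D Gaussians
radiance_field = outputs["radiance_field"][0] # Radiance Field
mesh = outputs["mesh"][0]                     # triangle mesh
```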
The following similar papers were recommended by the Semantic Scholar API:
- GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation (2024)
- Direct and Explicit 3D Generation from a Single Image (2024)
- Edify 3D: Scalable High-Quality 3D Asset Generation (2024)
- SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis (2024)
- RGM: Reconstructing High-fidelity 3D Car Assets with Relightable 3D-GS Generative Model from a Single Image (2024)
- L3DG: Latent 3D Gaussian Diffusion (2024)
- GSEditPro: 3D Gaussian Splatting Editing with Attention-based Progressive Localization (2024)