FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
Abstract
Existing sparse-view reconstruction models heavily rely on accurate known camera poses. However, deriving camera extrinsics and intrinsics from sparse-view images presents significant challenges. In this work, we present FreeSplatter, a highly scalable, feed-forward reconstruction framework capable of generating high-quality 3D Gaussians from uncalibrated sparse-view images and recovering their camera parameters in mere seconds. FreeSplatter is built upon a streamlined transformer architecture, comprising sequential self-attention blocks that facilitate information exchange among multi-view image tokens and decode them into pixel-wise 3D Gaussian primitives. The predicted Gaussian primitives are situated in a unified reference frame, allowing for high-fidelity 3D modeling and instant camera parameter estimation using off-the-shelf solvers. To cater to both object-centric and scene-level reconstruction, we train two model variants of FreeSplatter on extensive datasets. In both scenarios, FreeSplatter outperforms state-of-the-art baselines in terms of reconstruction quality and pose estimation accuracy. Furthermore, we showcase FreeSplatter's potential in enhancing the productivity of downstream applications, such as text/image-to-3D content creation.
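The abstract notes that because the predicted Gaussians are pixel-aligned and live in one reference frame, camera parameters can be recovered with off-the-shelf solvers. The sketch below is a minimal illustration of that idea, not the FreeSplatter implementation: it estimates a focal length by least squares from a per-pixel 3D point map expressed in the camera's own frame. The function names (`estimate_focal`, `make_synthetic_point_map`), the centred principal point, and the square-pixel pinhole model are all assumptions made for this example. Poses for the remaining views would typically come from a PnP solver (for instance OpenCV's `solvePnPRansac`) applied to the same 2D-3D correspondences.

```python
def make_synthetic_point_map(width, height, focal, depth):
    """Hypothetical helper: back-project every pixel centre to a plane at Z = depth."""
    cx, cy = width / 2.0, height / 2.0
    point_map = []
    for v in range(height):
        row = []
        for u in range(width):
            x = ((u + 0.5) - cx) / focal * depth
            y = ((v + 0.5) - cy) / focal * depth
            row.append((x, y, depth))
        point_map.append(row)
    return point_map


def estimate_focal(point_map, width, height):
    """Least-squares focal length from pixel-aligned 3D points in the camera frame.

    point_map[v][u] is the (X, Y, Z) point associated with pixel (u, v).
    Assumes a pinhole camera, principal point at the image centre, fx == fy.
    """
    cx, cy = width / 2.0, height / 2.0
    num = 0.0
    den = 0.0
    for v in range(height):
        for u in range(width):
            x, y, z = point_map[v][u]
            if z <= 0.0:
                continue  # skip points behind the camera
            # Pinhole model: (u - cx) = f * X / Z and (v - cy) = f * Y / Z
            a, b = x / z, y / z
            du, dv = (u + 0.5) - cx, (v + 0.5) - cy
            num += du * a + dv * b
            den += a * a + b * b
    if den == 0.0:
        raise ValueError("no valid points to estimate the focal length from")
    return num / den


if __name__ == "__main__":
    points = make_synthetic_point_map(width=8, height=6, focal=50.0, depth=2.0)
    print(estimate_focal(points, width=8, height=6))
```

On noise-free synthetic data the estimate matches the focal length used to generate the points; real predictions would call for a robust variant (outlier rejection or iterative reweighting).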
Community
Project page: https://bluestyle97.github.io/projects/freesplatter/
GitHub: https://github.com/TencentARC/FreeSplatter
Hugging Face demo: https://huggingface.co/spaces/TencentARC/FreeSplatter
Model weights: https://huggingface.co/huanngzh/mv-adapter
That's incredible! How much VRAM is needed for inference?
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds (2024)
- Sparse-view Pose Estimation and Reconstruction via Analysis by Generative Synthesis (2024)
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images (2024)
- SelfSplat: Pose-Free and 3D Prior-Free Generalizable 3D Gaussian Splatting (2024)
- SmileSplat: Generalizable Gaussian Splats for Unconstrained Sparse Images (2024)
- PreF3R: Pose-Free Feed-Forward 3D Gaussian Splatting from Variable-length Image Sequence (2024)
- PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting (2024)