Scaling Properties of Diffusion Models for Perceptual Tasks Paper • 2411.08034 • Published 16 days ago • 13
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution Paper • 2409.12961 • Published Sep 19 • 24
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation Paper • 2409.12576 • Published Sep 19 • 15
Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking Paper • 2312.09244 • Published Dec 14, 2023 • 7
SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance Paper • 2312.08889 • Published Dec 13, 2023 • 11
SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds Paper • 2312.09246 • Published Dec 14, 2023 • 5
CogAgent: A Visual Language Model for GUI Agents Paper • 2312.08914 • Published Dec 14, 2023 • 29
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence Paper • 2312.02087 • Published Dec 4, 2023 • 20