Collections
Collections including paper arxiv:2406.09414
-
mDPO: Conditional Preference Optimization for Multimodal Large Language Models
Paper • 2406.11839 • Published • 37
Pandora: Towards General World Model with Natural Language Actions and Video States
Paper • 2406.09455 • Published • 14
WPO: Enhancing RLHF with Weighted Preference Optimization
Paper • 2406.11827 • Published • 14
In-Context Editing: Learning Knowledge from Self-Induced Distributions
Paper • 2406.11194 • Published • 15
-
Depth Anything V2
Paper • 2406.09414 • Published • 92
Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation
Paper • 2406.12849 • Published • 49
BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation
Paper • 2407.17952 • Published • 29
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Paper • 2409.18124 • Published • 31
-
Depth Anything V2
Paper • 2406.09414 • Published • 92
Controllable Text Generation for Large Language Models: A Survey
Paper • 2408.12599 • Published • 62
Sapiens: Foundation for Human Vision Models
Paper • 2408.12569 • Published • 88
Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise
Paper • 2410.03017 • Published • 25
-
Depth Anything V2
Paper • 2406.09414 • Published • 92
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
Paper • 2406.09415 • Published • 50
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion
Paper • 2406.04338 • Published • 34
SAM 2: Segment Anything in Images and Videos
Paper • 2408.00714 • Published • 107