Collections
Discover the best community collections!
Collections trending this week
-
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents
Paper • 2302.01560 • Published -
Voyager: An Open-Ended Embodied Agent with Large Language Models
Paper • 2305.16291 • Published • 9 -
LASER: LLM Agent with State-Space Exploration for Web Navigation
Paper • 2309.08172 • Published • 11 -
A Data Source for Reasoning Embodied Agents
Paper • 2309.07974 • Published • 5
-
Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation
Paper • 2309.03549 • Published • 5 -
CCEdit: Creative and Controllable Video Editing via Diffusion Models
Paper • 2309.16496 • Published • 9 -
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
Paper • 2310.11440 • Published • 15 -
LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation
Paper • 2310.10769 • Published • 8
-
Text2Control3D: Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model
Paper • 2309.03550 • Published • 11 -
Text-Guided Generation and Editing of Compositional 3D Avatars
Paper • 2309.07125 • Published • 6 -
Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts
Paper • 2310.11784 • Published • 10 -
GaussianDreamer: Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors
Paper • 2310.08529 • Published • 17
-
InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
Paper • 2309.03895 • Published • 13 -
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
Paper • 2309.16650 • Published • 10 -
CCEdit: Creative and Controllable Video Editing via Diffusion Models
Paper • 2309.16496 • Published • 9 -
FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling
Paper • 2310.15169 • Published • 9
-
Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts
Paper • 2309.04354 • Published • 13 -
Vision Transformers Need Registers
Paper • 2309.16588 • Published • 77 -
AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
Paper • 2309.16414 • Published • 19 -
MotionLM: Multi-Agent Motion Forecasting as Language Modeling
Paper • 2309.16534 • Published • 15