Trajectory Attention for Fine-grained Video Motion Control
Abstract
Recent advances in video generation have been largely driven by video diffusion models, with camera motion control emerging as a crucial challenge in creating view-customized visual content. This paper introduces trajectory attention, a novel approach that performs attention along available pixel trajectories for fine-grained camera motion control. Unlike existing methods, which often yield imprecise outputs or neglect temporal correlations, our approach possesses a stronger inductive bias that seamlessly injects trajectory information into the video generation process. Importantly, our approach models trajectory attention as an auxiliary branch alongside traditional temporal attention. This design lets the original temporal attention and the trajectory attention work in synergy, ensuring both precise motion control and the ability to generate new content, which is critical when the trajectory is only partially available. Experiments on camera motion control for images and videos demonstrate significant improvements in precision and long-range consistency while maintaining high-quality generation. Furthermore, we show that our approach extends to other video motion control tasks, such as first-frame-guided video editing, where it excels at maintaining content consistency over large spatial and temporal ranges.
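The abstract's central design, an auxiliary trajectory-attention branch added alongside standard temporal attention, can be sketched concretely. Below is a minimal, hypothetical PyTorch sketch, not the authors' released code: the class name `TrajectoryAttentionBlock`, the `traj` tensor of integer pixel coordinates, and the zero-initialized output projection are all illustrative assumptions. It attends along each tracked pixel trajectory and adds the result onto the per-pixel temporal-attention output.

```python
# Hypothetical sketch of trajectory attention as an auxiliary branch
# next to temporal attention (illustration only, not the paper's code).
import torch
import torch.nn as nn


class TrajectoryAttentionBlock(nn.Module):
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.temporal_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.traj_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Assumption: zero-init the auxiliary projection so training starts
        # from the behavior of the original temporal attention alone.
        self.traj_proj = nn.Linear(dim, dim)
        nn.init.zeros_(self.traj_proj.weight)
        nn.init.zeros_(self.traj_proj.bias)

    def forward(self, x: torch.Tensor, traj: torch.Tensor) -> torch.Tensor:
        # x:    (B, T, H, W, C) video features
        # traj: (B, N, T, 2) integer (y, x) positions of N tracked points
        B, T, H, W, C = x.shape
        N = traj.shape[1]

        # Main branch: temporal attention at each spatial location.
        tokens = x.permute(0, 2, 3, 1, 4).reshape(B * H * W, T, C)
        out, _ = self.temporal_attn(tokens, tokens, tokens)
        out = out.reshape(B, H, W, T, C).permute(0, 3, 1, 2, 4)

        # Auxiliary branch: gather features along each pixel trajectory
        # and attend over the trajectory's time dimension.
        flat = x.reshape(B, T, H * W, C)
        idx = (traj[..., 0] * W + traj[..., 1]).long()   # (B, N, T)
        idx4 = idx.permute(0, 2, 1).unsqueeze(-1).expand(B, T, N, C)
        gathered = torch.gather(flat, 2, idx4)           # (B, T, N, C)
        traj_tokens = gathered.permute(0, 2, 1, 3).reshape(B * N, T, C)
        traj_out, _ = self.traj_attn(traj_tokens, traj_tokens, traj_tokens)
        traj_out = self.traj_proj(traj_out).reshape(B, N, T, C)

        # Scatter trajectory outputs back to their pixel locations and
        # add them to the temporal-attention result.
        out_flat = out.reshape(B, T, H * W, C).clone()
        out_flat.scatter_add_(2, idx4, traj_out.permute(0, 2, 1, 3))
        return out_flat.reshape(B, T, H, W, C)


# Usage with toy shapes: 8 frames, 16x16 features, 32 tracked points.
block = TrajectoryAttentionBlock(dim=64)
x = torch.randn(1, 8, 16, 16, 64)
traj = torch.randint(0, 16, (1, 32, 8, 2))
y = block(x, traj)  # (1, 8, 16, 16, 64)
```

Zero-initializing the auxiliary projection is a common adapter-style trick so fine-tuning starts from the unmodified base model; whether the paper uses exactly this initialization is an assumption here, but it matches the stated goal of preserving the original temporal attention's generation capability.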
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Optical-Flow Guided Prompt Optimization for Coherent Video Generation (2024)
- ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning (2024)
- MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control (2024)
- Boosting Camera Motion Control for Video Diffusion Transformers (2024)
- DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control (2024)
- Motion Control for Enhanced Complex Action Video Generation (2024)
- AnimateAnything: Consistent and Controllable Animation for Video Generation (2024)