## Environment Setup

`pip install -r requirements.txt`

## Download checkpoints

1. Download the pretrained checkpoints of [SVD_xt](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt-1-1) from Hugging Face to `./ckpts`.
2. Download the checkpoint of [MOFA-Adapter](https://huggingface.co/MyNiuuu/MOFA-Video-Traj) from Hugging Face to `./ckpts`.

The final structure of the checkpoints should be:

```text
./ckpts/
|-- controlnet
|   |-- config.json
|   `-- diffusion_pytorch_model.safetensors
|-- stable-video-diffusion-img2vid-xt-1-1
|   |-- feature_extractor
|       |-- ...
|   |-- image_encoder
|       |-- ...
|   |-- scheduler
|       |-- ...
|   |-- unet
|       |-- ...
|   |-- vae
|       |-- ...
|   |-- svd_xt_1_1.safetensors
|   `-- model_index.json
```

## Run Gradio Demo

`python run_gradio.py`

During inference, follow the instructions shown in the Gradio interface.

## Paper

[arxiv.org/abs/2405.20222](https://arxiv.org/abs/2405.20222)
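
## Optional: check the checkpoint layout

Before launching the demo, it can help to confirm that `./ckpts` matches the structure listed under "Download checkpoints". The snippet below is a minimal sketch, not part of the repository: the helper name `missing_checkpoint_paths` is hypothetical, and it only checks the entries named explicitly in the tree (the contents elided with `...` are not inspected).

```python
from pathlib import Path

# Expected entries, copied from the directory tree in "Download checkpoints".
# Anything shown as "..." in that tree is deliberately not checked.
SVD_DIR = "stable-video-diffusion-img2vid-xt-1-1"

EXPECTED_FILES = [
    "controlnet/config.json",
    "controlnet/diffusion_pytorch_model.safetensors",
    f"{SVD_DIR}/svd_xt_1_1.safetensors",
    f"{SVD_DIR}/model_index.json",
]

EXPECTED_DIRS = [
    f"{SVD_DIR}/{name}"
    for name in ("feature_extractor", "image_encoder", "scheduler", "unet", "vae")
]


def missing_checkpoint_paths(root="./ckpts"):
    """Return the expected relative paths that are absent under `root`.

    Hypothetical helper for illustration; not provided by the repository.
    """
    root = Path(root)
    missing = [p for p in EXPECTED_FILES if not (root / p).is_file()]
    missing += [p for p in EXPECTED_DIRS if not (root / p).is_dir()]
    return missing


if __name__ == "__main__":
    missing = missing_checkpoint_paths()
    if missing:
        print("Missing under ./ckpts:")
        for path in missing:
            print("  " + path)
    else:
        print("Checkpoint layout looks complete.")
```

Run it from the repository root so that the relative path `./ckpts` resolves correctly, or pass a different root to `missing_checkpoint_paths`.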