---
license: mit
---
<div align="center">

# StableV2V: Stablizing Shape Consistency in Video-to-Video Editing

Chang Liu, Rui Li, Kaidong Zhang, Yunwei Lan, Dong Liu

[[`Paper`]](https://arxiv.org/abs/2411.11045) / [[`Project`]](https://alonzoleeeooo.github.io/StableV2V/) / [[`GitHub`]](https://github.com/AlonzoLeeeooo/StableV2V) / [[`DAVIS-Edit (HuggingFace)`]](https://huggingface.co/datasets/AlonzoLeeeooo/DAVIS-Edit) / [[`Models (wisemodel)`]](https://wisemodel.cn/models/Alonzo/StableV2V) / [[`DAVIS-Edit (wisemodel)`]](https://wisemodel.cn/datasets/Alonzo/DAVIS-Edit) / [[`Models (ModelScope)`]](https://modelscope.cn/models/AlonzoLeeeoooo/StableV2V) / [[`DAVIS-Edit (ModelScope)`]](https://modelscope.cn/datasets/AlonzoLeeeoooo/DAVIS-Edit)
</div>

Official pre-trained model weights for the paper "StableV2V: Stablizing Shape Consistency in Video-to-Video Editing".

# Model Weights Structure
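The model weights are organized as shown below. In the annotations, PFE, ISA, and CIG refer to the three StableV2V components: the Prompted First-frame Editor, the Iterative Shape Aligner, and the Conditional Image-to-video Generator.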
```
StableV2V
├── controlnet-depth               <----- ControlNet (depth), required by CIG
├── controlnet-scribble            <----- ControlNet (scribble), required for sketch-based editing
├── ctrl-adapter-i2vgenxl-depth    <----- Ctrl-Adapter (I2VGen-XL, depth), required by CIG
├── i2vgenxl                       <----- I2VGen-XL, required by CIG
├── instruct-pix2pix               <----- InstructPix2Pix, required by PFE
├── paint-by-example               <----- Paint-by-Example, required by PFE
├── stable-diffusion-v1-5-inpaint  <----- SD Inpaint, required by PFE
├── stable-diffusion-v1.5          <----- SD v1.5, required by CIG
├── README.md
├── dpt_swin2_large_384.pt         <----- MiDaS, required by ISA
├── raft-things.pth                <----- RAFT, required by ISA
├── u2net.pth                      <----- U2-Net, required by ISA
└── 50000.ckpt                     <----- Shape-guided depth refinement network, required by ISA
```
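
After downloading, it can help to confirm that the local folder matches the layout above before running inference. The following is a minimal sketch, not part of the official codebase: the function name `missing_weights` and the local path `StableV2V` are hypothetical, and it assumes that the entries without a file extension in the tree are directories.

```python
from pathlib import Path

# Entries copied from the tree above. A trailing "/" marks an entry that is
# assumed to be a directory (an assumption; the tree does not say so explicitly).
EXPECTED_ENTRIES = [
    "controlnet-depth/",
    "controlnet-scribble/",
    "ctrl-adapter-i2vgenxl-depth/",
    "i2vgenxl/",
    "instruct-pix2pix/",
    "paint-by-example/",
    "stable-diffusion-v1-5-inpaint/",
    "stable-diffusion-v1.5/",
    "50000.ckpt",
    "dpt_swin2_large_384.pt",
    "raft-things.pth",
    "u2net.pth",
]


def missing_weights(root):
    """Return the expected entries that are absent under `root`, in list order."""
    root = Path(root)
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = root / entry.rstrip("/")
        # Directories and files are checked separately so that a stray file
        # named like a model folder is still reported as missing.
        found = path.is_dir() if entry.endswith("/") else path.is_file()
        if not found:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    # "StableV2V" is a hypothetical local path; point it at your download folder.
    missing = missing_weights("StableV2V")
    print("All weights found." if not missing else f"Missing: {missing}")
```

The check only looks at names and entry types; it does not validate file contents or checksums.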