File size: 4,258 Bytes
5703f94 4fb2d86 5703f94 1668012 5703f94 aa7f989 476f680 838ac75 5703f94 35dc5c4 3863dad 35dc5c4 baf3ffd 35dc5c4 3863dad 35dc5c4 5703f94 09fc719 5703f94 1668012 5703f94 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 |
---
license: cc-by-nc-4.0
library_name: diffusers
tags:
- text-to-image
- stable-diffusion
- diffusion distillation
---
# DMD2 Model Card
![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/63363b864067f020756275b7/YhssMfS_1e6q5fHKh9qrc.jpeg)
> [**Improved Distribution Matching Distillation for Fast Image Synthesis**](https://arxiv.org/abs/2405.14867),
> Tianwei Yin, Michaël Gharbi, Taesung Park, Richard Zhang, Eli Shechtman, Frédo Durand, William T. Freeman
## Contact
Feel free to contact us if you have any questions about the paper!
Tianwei Yin [tianweiy@mit.edu](mailto:tianweiy@mit.edu)
## Huggingface Demo
Our 4-step (much higher quality, 2X slower) Text-to-Image demo is hosted at [DMD2-4step](https://6cf215173601f32482.gradio.live)
Our 1-step Text-to-Image demo is hosted at [DMD2-1step](https://cc2622c0c132346c64.gradio.live)
## Usage
We can use the standard diffuser pipeline:
#### 4-step generation
```.bash
import torch
from diffusers import DiffusionPipeline, UNet2DConditionModel, LCMScheduler
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
repo_name = "tianweiy/DMD2"
ckpt_name = "dmd2_sdxl_4step_unet_fp16.bin"
# Load model.
unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(torch.load(hf_hub_download(repo_name, ckpt_name), map_location="cuda"))
pipe = DiffusionPipeline.from_pretrained(base_model_id, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
prompt="a photo of a cat"
# LCMScheduler's default timesteps are different from the one we used for training
image=pipe(prompt=prompt, num_inference_steps=4, guidance_scale=0, timesteps=[999, 749, 499, 249]).images[0]
```
#### 1-step generation
```.bash
import torch
from diffusers import DiffusionPipeline, UNet2DConditionModel, LCMScheduler
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
repo_name = "tianweiy/DMD2"
ckpt_name = "dmd2_sdxl_1step_unet_fp16.bin"
# Load model.
unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
unet.load_state_dict(torch.load(hf_hub_download(repo_name, ckpt_name), map_location="cuda"))
pipe = DiffusionPipeline.from_pretrained(base_model_id, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
prompt="a photo of a cat"
image=pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, timesteps=[399]).images[0]
```
For more information, please refer to the [code repository](https://github.com/tianweiy/DMD2)
## License
Improved Distribution Matching Distillation is released under [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en).
## Citation
If you find DMD2 useful or relevant to your research, please kindly cite our papers:
```bib
@article{yin2024improved,
title={Improved Distribution Matching Distillation for Fast Image Synthesis},
author={Yin, Tianwei and Gharbi, Micha{\"e}l and Park, Taesung and Zhang, Richard and Shechtman, Eli and Durand, Fredo and Freeman, William T},
journal={arXiv:2405.14867},
year={2024}
}
@inproceedings{yin2024onestep,
title={One-step Diffusion with Distribution Matching Distillation},
author={Yin, Tianwei and Gharbi, Micha{\"e}l and Zhang, Richard and Shechtman, Eli and Durand, Fr{\'e}do and Freeman, William T and Park, Taesung},
booktitle={CVPR},
year={2024}
}
```
## Acknowledgments
This work was done while Tianwei Yin was a full-time student at MIT. It was developed based on our reimplementation of the original DMD paper. This work was supported by the National Science Foundation under Cooperative Agreement PHY-2019786 (The NSF AI Institute for Artificial Intelligence and Fundamental Interactions, http://iaifi.org/), by NSF Grant 2105819, by NSF CISE award 1955864, and by funding from Google, GIST, Amazon, and Quanta Computer. |