|
--- |
|
tags: |
|
- text-to-image |
|
- layout-to-image |
|
- stable-diffusion |
|
- controlnet |
|
license: agpl-3.0 |
|
language: |
|
- en |
|
--- |
|
|
|
<h1 style="font-size:1.5em; " align="center"> Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive (ICLR 2024) </h1> |
|
|
|
<div align="center"> |
|
|
|
[**Project Page**](https://yumengli007.github.io/ALDM/) **|** [**ArXiv**](https://arxiv.org/abs/2401.08815) **|** [**Code**](https://github.com/boschresearch/ALDM) |
|
</div> |
|
|
|
<div align="center"> |
|
This model repo contains checkpoints trained on the Cityscapes and ADE20K datasets using the method proposed in <a href="https://yumengli007.github.io/ALDM/">ALDM</a>.
|
For usage instructions, please refer to our <a href="https://github.com/boschresearch/ALDM">GitHub repository</a>.
|
</div>
|
|
|
## Model information |
|
[ade20k_step9.ckpt](ade20k_step9.ckpt) and [cityscapes_step9.ckpt](cityscapes_step9.ckpt) are pretrained ALDM diffusion model checkpoints for inference on ADE20K and Cityscapes, respectively.
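
Below is a minimal sketch for downloading one of these checkpoints from the Hub and inspecting its state dict with PyTorch. The `repo_id` is a placeholder, not the actual repository id; the full layout-to-image inference pipeline is described in the ALDM GitHub repository.

```python
# Sketch: fetch a checkpoint and inspect its weights (not the full inference pipeline).
import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="<user>/ALDM",            # placeholder: replace with this repo's Hub id
    filename="cityscapes_step9.ckpt",
)

state = torch.load(ckpt_path, map_location="cpu")
state_dict = state.get("state_dict", state)  # some Lightning-style checkpoints wrap weights in "state_dict"
print(f"{len(state_dict)} tensors, e.g. {next(iter(state_dict))}")
```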
|
|
|
[encoder_epoch_50.pth](encoder_epoch_50.pth), [decoder_epoch_50_20cls.pth](decoder_epoch_50_20cls.pth) and [decoder_epoch_50_151cls.pth](decoder_epoch_50_151cls.pth)

are segmentation models used for discriminator initialization during training,

adapted from the pretrained weights provided [here](https://github.com/CSAILVision/semantic-segmentation-pytorch).
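
As a quick sanity check, the sketch below loads these weight files with plain PyTorch and reads the class count off the decoder's final layer (20 classes for the Cityscapes decoder, 151 for the ADE20K one). It assumes the files have been downloaded locally and does not depend on the CSAILVision codebase.

```python
# Sketch: inspect the segmentation weights used for discriminator initialization.
import torch

encoder_sd = torch.load("encoder_epoch_50.pth", map_location="cpu")
decoder_sd = torch.load("decoder_epoch_50_151cls.pth", map_location="cpu")

print(f"encoder tensors: {len(encoder_sd)}")
# The last weight tensor of the decoder typically carries the class dimension.
last_key = list(decoder_sd.keys())[-1]
print(f"decoder output layer {last_key}: shape {tuple(decoder_sd[last_key].shape)}")
```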
|
|
|
|
|
|