---
tags:
- text-to-image
- layout-to-image
- stable-diffusion
- controlnet
license: agpl-3.0
language:
- en
---

# Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive (ICLR 2024)

[**Project Page**](https://yumengli007.github.io/ALDM/) **|** [**ArXiv**](https://arxiv.org/abs/2401.08815) **|** [**Code**](https://github.com/boschresearch/ALDM)
This model repo contains checkpoints trained on the Cityscapes and ADE20K datasets using the methods proposed in ALDM. For usage instructions, please refer to our GitHub repository.
## Model information

[ade20k_step9.ckpt](ade20k_step9.ckpt) and [cityscapes_step9.ckpt](cityscapes_step9.ckpt) are pretrained diffusion model weights for inference.

[encoder_epoch_50.pth](encoder_epoch_50.pth), [decoder_epoch_50_20cls.pth](decoder_epoch_50_20cls.pth), and [decoder_epoch_50_151cls.pth](decoder_epoch_50_151cls.pth) are segmentation models used for discriminator initialization during training. They are adopted from the pretrained UperNet101 available [here](https://github.com/CSAILVision/semantic-segmentation-pytorch).
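For convenience, the inference checkpoints above can be fetched programmatically with `huggingface_hub`. This is a minimal sketch, not an official usage example: the `REPO_ID` below is a placeholder you must replace with this repo's actual Hugging Face id, and actually loading the weights is done via the ALDM GitHub code.

```python
CHECKPOINTS = {
    # Dataset name -> inference checkpoint filename in this repo.
    "ade20k": "ade20k_step9.ckpt",
    "cityscapes": "cityscapes_step9.ckpt",
}


def checkpoint_filename(dataset: str) -> str:
    """Return the inference checkpoint filename for a supported dataset."""
    try:
        return CHECKPOINTS[dataset]
    except KeyError:
        raise ValueError(f"Unsupported dataset: {dataset!r}") from None


if __name__ == "__main__":
    # Assumption: replace REPO_ID with the actual repo id of this model card.
    from huggingface_hub import hf_hub_download

    REPO_ID = "your-namespace/ALDM"  # placeholder, not confirmed by this card
    local_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=checkpoint_filename("cityscapes"),
    )
    print(f"Checkpoint downloaded to: {local_path}")
```

The downloaded `.ckpt` file is then passed to the inference scripts in the ALDM GitHub repository.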