---
tags:
- text-to-image
- layout-to-image
- stable-diffusion
- controlnet
license: agpl-3.0
language:
- en
---

<h1 style="font-size:1.5em; " align="center"> Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive (ICLR 2024) </h1>

<div align="center"> 
  
  [**Project Page**](https://yumengli007.github.io/ALDM/) **|** [**ArXiv**](https://arxiv.org/abs/2401.08815) **|** [**Code**](https://github.com/boschresearch/ALDM)
</div>

<div align="center">
This model repo contains checkpoints trained on the Cityscapes and ADE20K datasets using the methods proposed in <a href="https://yumengli007.github.io/ALDM/">ALDM</a>.
For usage instructions, please refer to our <a href="https://github.com/boschresearch/ALDM">GitHub</a> repository.
</div>

## Model information   
[ade20k_step9.ckpt](ade20k_step9.ckpt) and [cityscapes_step9.ckpt](cityscapes_step9.ckpt) are pretrained diffusion model weights for inference.

[encoder_epoch_50.pth](encoder_epoch_50.pth), [decoder_epoch_50_20cls.pth](decoder_epoch_50_20cls.pth), and [decoder_epoch_50_151cls.pth](decoder_epoch_50_151cls.pth)
are segmentation models used to initialize the discriminator during training.
They are adopted from the pretrained UperNet101 available [here](https://github.com/CSAILVision/semantic-segmentation-pytorch).