---
tags:
- text-to-image
- layout-to-image
- stable-diffusion
- controlnet
license: agpl-3.0
language:
- en
---

<h1 style="font-size:1.5em; " align="center"> Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive (ICLR 2024) </h1>

<div align="center"> 
  
  [**Project Page**](https://yumengli007.github.io/ALDM/) **|** [**ArXiv**](https://arxiv.org/abs/2401.08815) **|** [**Code**](https://github.com/boschresearch/ALDM)
</div>

<div align="center">
This model repo contains checkpoints trained on the Cityscapes and ADE20K datasets using the methods proposed in <a href="https://yumengli007.github.io/ALDM/">ALDM</a>.
For usage instructions, please refer to our <a href="https://github.com/boschresearch/ALDM">GitHub</a> repository.
</div>

## Model information   
[ade20k_step9.ckpt](ade20k_step9.ckpt) and [cityscapes_step9.ckpt](cityscapes_step9.ckpt) are pretrained diffusion model weights for inference.

[encoder_epoch_50.pth](encoder_epoch_50.pth), [decoder_epoch_50_20cls.pth](decoder_epoch_50_20cls.pth), and [decoder_epoch_50_151cls.pth](decoder_epoch_50_151cls.pth)
are segmentation models used to initialize the discriminator during training.
They are adopted from the pretrained UperNet101 available [here](https://github.com/CSAILVision/semantic-segmentation-pytorch).