File size: 1,186 Bytes
732f851 06bee35 732f851 06bee35 8ab04db 583db6c |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
---
license: mit
tags:
- stable-diffusion
- stable-diffusion-diffusers
inference: false
---
# SDXL-VAE-FP16-Fix
SDXL-VAE-FP16-Fix is the [SDXL VAE](https://huggingface.co/stabilityai/sdxl-vae), but modified to run in fp16 precision without generating NaNs.
| VAE | Decoding in `float32` / `bfloat16` precision | Decoding in `float16` precision |
| --------------------- | -------------------------------------------- | ------------------------------- |
| SDXL-VAE | ✅ ![](./images/orig-fp32.png) | ⚠️ ![](./images/orig-fp16.png) |
| SDXL-VAE-FP16-Fix | ✅ ![](./images/fix-fp32.png) | ✅ ![](./images/fix-fp16.png) |
## Details
SDXL-VAE generates NaNs in fp16 because the internal activation values are too big:
![](./images/activation-magnitudes.jpg)
SDXL-VAE-FP16-Fix was created by finetuning the SDXL-VAE to:
1. keep the final output the same, but
2. make the internal activation values smaller, by
3. scaling down weights and biases within the network
There are slight discrepancies between the output of SDXL-VAE-FP16-Fix and SDXL-VAE, but the decoded images should be close enough for most purposes. |