madebyollin/sdxl-vae-fp16-fix


madebyollin committed on Jul 11, 2023

Commit 583db6c • 1 Parent(s): 8a11549

Update README.md

Files changed (1)

README.md +17 -4


@@ -9,7 +9,20 @@ inference: false
 SDXL-VAE-FP16-Fix is the [SDXL VAE](https://huggingface.co/stabilityai/sdxl-vae), but modified to run in fp16 precision without generating NaNs.
-| VAE                   | Decoding in `float32` precision | Decoding in `float16` precision |
-| --------------------- | ------------------------------- | ------------------------------- |
-| SDXL-VAE              | ✅ ![](./images/orig-fp32.png)  | ⚠️ ![](./images/orig-fp16.png)  |
-| SDXL-VAE-FP16-Fix     | ✅ ![](./images/fix-fp32.png)   | ✅ ![](./images/fix-fp16.png)   |

+| VAE                   | Decoding in `float32` / `bfloat16` precision | Decoding in `float16` precision |
+| --------------------- | -------------------------------------------- | ------------------------------- |
+| SDXL-VAE              | ✅ ![](./images/orig-fp32.png)              | ⚠️ ![](./images/orig-fp16.png)  |
+| SDXL-VAE-FP16-Fix     | ✅ ![](./images/fix-fp32.png)               | ✅ ![](./images/fix-fp16.png)   |
+## Details
+SDXL-VAE generates NaNs in fp16 because the internal activation values are too big:
+![](./images/activation-magnitudes.jpg)
+SDXL-VAE-FP16-Fix was created by finetuning the SDXL-VAE to:
+1. keep the final output the same, but
+2. make the internal activation values smaller, by
+3. scaling down weights and biases within the network
+There are slight discrepancies between the output of SDXL-VAE-FP16-Fix and SDXL-VAE, but the decoded images should be close enough for most purposes.
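
The Details section added in this commit can be illustrated with a toy sketch. It is not the finetuning procedure behind SDXL-VAE-FP16-Fix and it does not touch the real VAE. It assumes a hypothetical two-layer scalar "network" (`toy_forward`, with made-up weights `w1` and `w2`) and simulates float16 rounding with Python's `struct` module. The point is only to show how an oversized internal activation overflows to infinity, how a later subtraction turns that into NaN, and how scaling one weight down while scaling the next one up can keep the output unchanged.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite float16 value


def to_fp16(x):
    """Round x to the nearest float16 value, overflowing to +/-inf."""
    try:
        return struct.unpack("e", struct.pack("e", x))[0]
    except OverflowError:
        # struct refuses values beyond the float16 range; hardware gives inf.
        return math.copysign(math.inf, x)


def toy_forward(x, w1, w2):
    """Hypothetical two-layer network; every intermediate is kept in float16."""
    h = [to_fp16(w1 * xi) for xi in x]            # internal activations
    mean = to_fp16(sum(h) / len(h))
    centered = [to_fp16(hi - mean) for hi in h]   # inf - inf produces NaN here
    y = [to_fp16(w2 * ci) for ci in centered]     # final output
    return h, y


x = [1.0, 2.0, 3.0]

# Original weights: activations reach 65536 and 98304, beyond FP16_MAX.
h_bad, y_bad = toy_forward(x, w1=2.0**15, w2=2.0**-15)

# Rescaled weights: w1 divided by 64, w2 multiplied by 64, same overall map.
h_good, y_good = toy_forward(x, w1=2.0**15 / 64, w2=2.0**-15 * 64)

print("original activations:", h_bad, "output:", y_bad)
print("rescaled activations:", h_good, "output:", y_good)
```

In this toy the rescaling is exact because the layers are linear and the scale factor is a power of two. The commit describes the real model as finetuned, with slight discrepancies in the output, so the sketch should be read as intuition for why smaller activations avoid NaNs rather than as a description of how the weights were obtained.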