metadata

license: mit
tags:
  - stable-diffusion
  - stable-diffusion-diffusers
inference: false

SDXL-VAE-FP16-Fix

SDXL-VAE-FP16-Fix is the SDXL VAE, but modified to run in fp16 precision without generating NaNs.

| VAE | Decoding in `float32` / `bfloat16` precision | Decoding in `float16` precision |
| --- | --- | --- |
| SDXL-VAE | ✅ | ⚠️ |
| SDXL-VAE-FP16-Fix | ✅ | ✅ |
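For readers who want to try fp16 decoding, the following is a minimal sketch of swapping the fixed VAE into an SDXL pipeline with the `diffusers` library. The helper name `load_sdxl_with_fp16_vae` is hypothetical, and the two Hub repository ids used as defaults are assumptions that do not appear in this card; substitute the ids you actually use.

```python
def load_sdxl_with_fp16_vae(
    vae_id="madebyollin/sdxl-vae-fp16-fix",              # assumed Hub id for this VAE
    base_id="stabilityai/stable-diffusion-xl-base-1.0",  # assumed SDXL base model id
    device="cuda",
):
    """Build an SDXL pipeline whose VAE runs in float16 (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require torch/diffusers.
    import torch
    from diffusers import AutoencoderKL, DiffusionPipeline

    # Load the VAE weights directly in half precision.
    vae = AutoencoderKL.from_pretrained(vae_id, torch_dtype=torch.float16)

    # Pass the VAE to the pipeline so it replaces the pipeline's default VAE.
    pipe = DiffusionPipeline.from_pretrained(
        base_id,
        vae=vae,
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    )
    return pipe.to(device)
```

With a CUDA device available, `pipe = load_sdxl_with_fp16_vae()` followed by `image = pipe("an astronaut riding a horse").images[0]` should generate and decode an image entirely in `float16`.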

Details

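SDXL-VAE generates NaNs in fp16 because its internal activation values are too large. fp16 cannot represent magnitudes above 65504, so larger values overflow to infinity, and later operations on those infinities can produce NaNs.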
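The overflow mechanism can be reproduced without any model. The sketch below emulates fp16 rounding with Python's standard library only; `to_fp16` is a hypothetical helper, and the numbers are made up for illustration rather than taken from the VAE.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x):
    """Round a Python float to the nearest fp16 value, overflowing to +/-inf."""
    try:
        # struct's "e" format is IEEE 754 binary16 (half precision).
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct raises on overflow; fp16 arithmetic would yield infinity instead.
        return math.copysign(math.inf, x)


activation = to_fp16(200.0)              # representable exactly in fp16
scaled = to_fp16(activation * 400.0)     # 80000 exceeds FP16_MAX, becomes inf
difference = scaled - scaled             # inf - inf is NaN

print(scaled, difference)
```

Once a single activation becomes infinite, ordinary operations such as subtracting a mean or normalizing can turn it into NaN, which then spreads through the remaining layers.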

SDXL-VAE-FP16-Fix was created by finetuning the SDXL-VAE to keep the final output the same while making the internal activation values smaller.
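As a toy illustration of why smaller internal activations need not change the final output, consider a two-layer scalar network in which the first layer is scaled down and the second layer compensates. This is only a sketch under simplifying assumptions: `two_layer` and all values are hypothetical, and it is not the finetuning procedure used for this model. A real VAE contains nonlinearities and normalization layers, so this exact identity need not hold there.

```python
FP16_MAX = 65504.0  # largest finite fp16 value


def two_layer(x, w1, b1, w2, b2):
    """Scalar toy network: returns (hidden activation, output)."""
    hidden = w1 * x + b1
    return hidden, w2 * hidden + b2


x, w1, b1, w2, b2 = 3.0, 50000.0, 2000.0, 0.001, 0.5
scale = 1024.0  # a power of two, so the rescaling is exact in binary floating point

# Original network: the hidden activation is too large for fp16.
h_big, y_big = two_layer(x, w1, b1, w2, b2)

# Rescaled network: first-layer weight and bias shrink, second-layer weight grows.
h_small, y_small = two_layer(x, w1 / scale, b1 / scale, w2 * scale, b2)

print(h_big, h_small)    # hidden activations differ by a factor of `scale`
print(y_big, y_small)    # outputs agree
```

Here the original hidden activation exceeds `FP16_MAX` while the rescaled one stays well inside the fp16 range, and both networks produce the same output.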

There are slight discrepancies between the output of SDXL-VAE-FP16-Fix and SDXL-VAE, but the decoded images should be close enough for most purposes.