Stable Diffusion 1.5 fine tuned VAE decoder for better pixel art generation by aliasing the output of the decoder.

comparison

Fine tuning was done by training 50 thousand images for 1 epoch effective batch size 12. I preprocessed the images to quantize each 8x8 tile to its average color. On a RTX3090, this took about 4 hours to fine-tune. Used only MSE loss at 1e-5 learning rate. The training data set was just generated from other stable diffusion models, mostly cartoon-like images.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference API
Unable to determine this model's library. Check the docs .