arxiv:2401.00110

Diffusion Model with Perceptual Loss

Published on Dec 30, 2023


Authors:

Shanchuan Lin,

Xiao Yang

Abstract

Diffusion models trained with mean squared error loss tend to generate unrealistic samples. Current state-of-the-art models rely on classifier-free guidance to improve sample quality, yet its surprising effectiveness is not fully understood. In this paper, we show that the effectiveness of classifier-free guidance partly originates from it being a form of implicit perceptual guidance. As a result, we can directly incorporate perceptual loss in diffusion training to improve sample quality. Since the score matching objective used in diffusion training strongly resembles the denoising autoencoder objective used in unsupervised training of perceptual networks, the diffusion model itself is a perceptual network and can be used to generate meaningful perceptual loss. We propose a novel self-perceptual objective that results in diffusion models capable of generating more realistic samples. For conditional generation, our method only improves sample quality without entanglement with the conditional input and therefore does not sacrifice sample diversity. Our method can also improve sample quality for unconditional generation, which was not possible with classifier-free guidance before.
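To make the two ideas in the abstract concrete, here is a minimal sketch in plain Python. It is an illustration only, not the paper's Algorithm 1. `cfg_combine` is the standard classifier-free guidance combination of unconditional and conditional predictions. `feature_space_loss` shows the general shape of a perceptual-style loss, which compares features instead of raw outputs. The `toy_features` function is a hypothetical stand-in for the hidden activations of a network; using a frozen copy of the diffusion model itself for that role is an assumption about how a self-perceptual objective could be set up.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cfg_combine(pred_uncond: Vector, pred_cond: Vector, scale: float) -> List[float]:
    """Classifier-free guidance: start from the unconditional prediction and
    move along the direction toward the conditional prediction.
    scale = 0 gives the unconditional output, scale = 1 the conditional one,
    and scale > 1 extrapolates past it."""
    return [u + scale * (c - u) for u, c in zip(pred_uncond, pred_cond)]


def mse(a: Vector, b: Vector) -> float:
    """Plain mean squared error in output space."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def feature_space_loss(
    features: Callable[[Vector], Vector], target: Vector, prediction: Vector
) -> float:
    """Perceptual-style loss: MSE between feature vectors, not raw outputs.
    `features` stands in for a frozen feature extractor (assumption)."""
    return mse(features(target), features(prediction))


def toy_features(x: Vector) -> List[float]:
    """Hypothetical two-dimensional feature map, used only for illustration."""
    return [x[0] + x[1], x[0] - x[1]]


if __name__ == "__main__":
    # Guidance scale 2 extrapolates beyond the conditional prediction.
    print(cfg_combine([0.0, 1.0], [1.0, 3.0], 2.0))
    # The same prediction error is weighted differently in feature space.
    print(mse([0.0, 0.0], [1.0, 1.0]))
    print(feature_space_loss(toy_features, [0.0, 0.0], [1.0, 1.0]))
```

The point of the contrast is that the same output-space error can receive a different penalty once it is measured through a feature map, which is what lets a perceptual loss emphasize some errors over others.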


Community

froilo

Jan 2

wen code?

PeterL1n

Paper author Jan 19

wen code?

Code is attached in the paper as Algorithm 1.
The model is now open-sourced on Hugging Face.

Snim

2 days ago

It would be nice to see qualitative results of self-perceptual training combined with CFG. Was there any attempt to use another, bigger model for the perceptual signal? Was combining the self-perceptual and MSE losses tried? Were any alternatives to CFG tried together with the self-perceptual objective?


Models citing this paper 2


Spaces citing this paper 1
