arxiv:2411.02397

Adaptive Caching for Faster Video Generation with Diffusion Transformers

Published on Nov 4 · Submitted by kumarak on Nov 5

Abstract

Generating temporally consistent, high-fidelity videos can be computationally expensive, especially over longer temporal spans. More recent Diffusion Transformers (DiTs), despite making significant headway in this context, have only heightened such challenges, as they rely on larger models and heavier attention mechanisms, resulting in slower inference. In this paper, we introduce a training-free method to accelerate video DiTs, termed Adaptive Caching (AdaCache), motivated by the observation that "not all videos are created equal": some videos require fewer denoising steps than others to attain a reasonable quality. Building on this, we not only cache computations through the diffusion process, but also devise a caching schedule tailored to each video generation, optimizing the quality-latency trade-off. We further introduce a Motion Regularization (MoReg) scheme to utilize video information within AdaCache, essentially controlling the compute allocation based on motion content. Altogether, our plug-and-play contributions grant significant inference speedups (e.g., up to 4.7x on Open-Sora 720p, 2s video generation) without sacrificing generation quality, across multiple video DiT baselines.
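
To make the idea of a content-dependent caching schedule concrete, here is a minimal toy sketch in Python. It is not the AdaCache implementation: the abstract does not state the change metric, the thresholds, or how MoReg enters the schedule, so `l1_distance`, `steps_to_reuse`, `run_with_adaptive_cache`, the threshold values, and the scalar `motion` argument are all hypothetical choices made only for illustration. The sketch reuses a cached block output for more steps when consecutive outputs barely change, and for fewer steps when the (assumed) motion score is high.

```python
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def l1_distance(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two equally sized vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def steps_to_reuse(change: float, motion: float = 0.0) -> int:
    """Map a change score to the number of upcoming steps that reuse the cache.

    The thresholds are hypothetical. The motion term inflates the score,
    so higher-motion content is recomputed more often (an assumed stand-in
    for motion-based compute allocation).
    """
    score = change * (1.0 + motion)
    if score < 0.01:
        return 4
    if score < 0.05:
        return 2
    if score < 0.10:
        return 1
    return 0


def run_with_adaptive_cache(
    block: Callable[[Vector, int], Vector],
    x: Vector,
    num_steps: int,
    motion: float = 0.0,
) -> Tuple[Vector, int]:
    """Toy denoising loop; returns the final state and the number of real block calls."""
    cached: Optional[Vector] = None
    skip_left = 0
    computed = 0
    for step in range(num_steps):
        if cached is not None and skip_left > 0:
            # Reuse the cached output instead of calling the block.
            residual = cached
            skip_left -= 1
        else:
            residual = block(x, step)
            computed += 1
            if cached is not None:
                # Decide the reuse horizon from how much the output changed.
                skip_left = steps_to_reuse(l1_distance(residual, cached), motion)
            cached = residual
        x = [xi + ri for xi, ri in zip(x, residual)]
    return x, computed


if __name__ == "__main__":
    # A block whose output never changes is cached aggressively.
    final, calls = run_with_adaptive_cache(lambda x, t: [0.1, 0.1], [0.0, 0.0], 10)
    print(f"block calls: {calls} of 10 steps")
```

In this toy setup, a block with a constant output is evaluated on only a few of the steps, while a block whose output changes sharply at every step is evaluated on all of them. This mirrors the abstract's premise that some generations need less computation than others.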

Community


We introduce Adaptive Caching for Faster Video Generation with Diffusion Transformers.
project-page: https://adacache-dit.github.io/ (works better on Chrome)
code: https://github.com/AdaCache-DiT/AdaCache
arxiv: https://arxiv.org/abs/2411.02397
