ByteDance’s AnimateDiff-Lightning Shines in State-of-the-Art Video Creation at Lightning Speed

Synced
Published in SyncedReview
3 min read · Mar 21, 2024


Video generative models have recently become a focal point of attention, unlocking a wealth of new creative possibilities. Yet the speed of these models remains a significant obstacle to broader adoption: state-of-the-art generative models, impressive as they are, are slow and computationally demanding because of their iterative diffusion processes.

To address this issue, a ByteDance research team presents AnimateDiff-Lightning in the new paper AnimateDiff-Lightning: Cross-Model Diffusion Distillation. The approach uses progressive adversarial diffusion distillation to bring video generation up to lightning-fast speeds while achieving new state-of-the-art results in few-step video generation.

Diffusion distillation has been explored extensively for image generation, where progressive adversarial diffusion distillation achieves state-of-the-art results in few-step generation. Research into video diffusion distillation, however, has remained relatively scarce until now.

In this work, the researchers apply progressive adversarial diffusion distillation to video models for the first time. Their method simultaneously and explicitly distills a shared motion module across different image base models, making the distilled AnimateDiff module more compatible with a wide range of base models in few-step inference.

Furthermore, the team devises a strategy of assigning a distinct distillation dataset to each image base model: when distilling realistic or anime models, for instance, they aggregate all generated data of the respective style to bolster diversity. A simplified sketch of this cross-model training loop appears below.
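To make the idea concrete, here is a deliberately toy PyTorch sketch of a cross-model adversarial distillation loop. Everything in it is a simplified assumption for illustration: tiny linear layers stand in for the image base models, the shared motion module, the frozen many-step teacher, and the discriminator, and the hinge-GAN-plus-MSE objective is only a schematic stand-in for the paper's actual losses.

```python
import torch
import torch.nn as nn

# Toy stand-ins (assumptions, not the paper's architectures): tiny linear
# layers replace the Stable Diffusion image bases, the shared AnimateDiff
# motion module, the frozen many-step teacher, and the discriminator.
base_models = [nn.Linear(16, 16).requires_grad_(False) for _ in range(2)]  # e.g. realistic, anime
motion_module = nn.Linear(16, 16)                  # the one shared module
teacher = nn.Linear(16, 16).requires_grad_(False)  # frozen many-step teacher
discriminator = nn.Linear(16, 1)                   # adversarial critic

# One dataset per style, mirroring the paper's per-base-model data strategy.
style_datasets = [torch.randn(64, 16), torch.randn(64, 16)]

opt_g = torch.optim.AdamW(motion_module.parameters(), lr=1e-4)
opt_d = torch.optim.AdamW(discriminator.parameters(), lr=1e-4)
mse = nn.MSELoss()

for it in range(200):
    # Round-robin over base models: the shared motion module is distilled
    # against every base simultaneously, each with data of its own style.
    i = it % len(base_models)
    x = style_datasets[i][torch.randint(0, 64, (8,))]

    with torch.no_grad():
        target = teacher(x)                         # many-step teacher output
    student = base_models[i](x) + motion_module(x)  # few-step student output

    # Discriminator step (hinge loss): separate teacher from student outputs.
    d_loss = torch.relu(1 - discriminator(target)).mean() \
           + torch.relu(1 + discriminator(student.detach())).mean()
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Student step: match the teacher (distillation term) while fooling the
    # discriminator (adversarial term).
    g_loss = mse(student, target) - discriminator(student).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

In the paper itself, distillation operates on diffusion model outputs and the student's step count is reduced progressively; the round-robin over base models with per-style data is the cross-model idea the sketch highlights.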

In empirical evaluations, AnimateDiff-Lightning is compared against the original AnimateDiff and against AnimateLCM, the previous video distillation method. The results are striking: AnimateDiff-Lightning produces higher-quality videos in fewer inference steps, outperforming AnimateLCM. And thanks to cross-model distillation, AnimateDiff-Lightning also preserves the original style of each base model.

In essence, this work demonstrates the applicability of progressive adversarial diffusion distillation to the video domain. With AnimateDiff-Lightning setting a new benchmark in few-step video generation, the potential for rapid and high-quality video creation is significantly expanded.

The model is available on HuggingFace. The paper AnimateDiff-Lightning: Cross-Model Diffusion Distillation is on arXiv.
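For readers who want to try the released checkpoints, the model card describes usage through the AnimateDiff pipeline in Hugging Face diffusers. A minimal sketch along those lines follows; the checkpoint filename pattern and the choice of base model (`emilianJR/epiCRealism`) follow the model card's example and should be verified against the repository.

```python
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter, EulerDiscreteScheduler
from diffusers.utils import export_to_gif
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

device, dtype = "cuda", torch.float16

step = 4  # distilled checkpoints exist for 1, 2, 4, and 8 steps
repo = "ByteDance/AnimateDiff-Lightning"
ckpt = f"animatediff_lightning_{step}step_diffusers.safetensors"
base = "emilianJR/epiCRealism"  # any compatible image base model

# Load the distilled motion module and plug it into an image base model.
adapter = MotionAdapter().to(device, dtype)
adapter.load_state_dict(load_file(hf_hub_download(repo, ckpt), device=device))
pipe = AnimateDiffPipeline.from_pretrained(
    base, motion_adapter=adapter, torch_dtype=dtype
).to(device)
pipe.scheduler = EulerDiscreteScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing", beta_schedule="linear"
)

# Few-step generation: note the low step count and guidance_scale=1.0.
output = pipe(prompt="A girl smiling", guidance_scale=1.0, num_inference_steps=step)
export_to_gif(output.frames[0], "animation.gif")
```

Swapping `base` for another realistic or anime checkpoint is exactly the cross-model compatibility the distillation is designed to preserve.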

Author: Hecate He | Editor: Chain Zhang

We know you don’t want to miss any news or research breakthroughs. Subscribe to our popular newsletter Synced Global AI Weekly to get weekly AI updates.

