Accelerating Parallel Sampling of Diffusion Models (2402.09970v2)

Published 15 Feb 2024 in cs.LG and stat.ML

Abstract: Diffusion models have emerged as state-of-the-art generative models for image generation. However, sampling from diffusion models is usually time-consuming due to the inherent autoregressive nature of their sampling process. In this work, we propose a novel approach that accelerates the sampling of diffusion models by parallelizing the autoregressive process. Specifically, we reformulate the sampling process as solving a system of triangular nonlinear equations through fixed-point iteration. With this innovative formulation, we explore several systematic techniques to further reduce the iteration steps required by the solving process. Applying these techniques, we introduce ParaTAA, a universal and training-free parallel sampling algorithm that can leverage extra computational and memory resources to increase the sampling speed. Our experiments demonstrate that ParaTAA can decrease the inference steps required by common sequential sampling algorithms such as DDIM and DDPM by a factor of 4$\sim$14 times. Notably, when applying ParaTAA with 100 steps DDIM for Stable Diffusion, a widely-used text-to-image diffusion model, it can produce the same images as the sequential sampling in only 7 inference steps. The code is available at https://github.com/TZW1998/ParaTAA-Diffusion.

Authors (5)
  1. Zhiwei Tang (9 papers)
  2. Jiasheng Tang (16 papers)
  3. Hao Luo (114 papers)
  4. Fan Wang (313 papers)
  5. Tsung-Hui Chang (87 papers)
Citations (6)

Summary

  • The paper presents a novel reformulation of the sampling process as solving triangular nonlinear equations via parallel fixed-point iteration.
  • It introduces Triangular Anderson Acceleration (TAA) that leverages previous iterations to substantially reduce convergence steps and computational overhead.
  • Empirical results on DiT and Stable Diffusion models show that the ParaTAA algorithm accelerates image generation while maintaining high quality.

Accelerating the Sampling Process in Diffusion Models through Parallelization

Introduction

Diffusion models have recently taken center stage in the field of generative AI due to their superior capability in producing high-quality images. However, the extensive computational demands of these models, particularly in their sampling process, have posed significant challenges. The conventional autoregressive nature of this process makes it time-consuming, which restricts the practical deployment and iterative exploration of generated outputs.

Accelerating Sampling via Parallelization

This work addresses these inefficiencies by reframing the sampling process as solving a system of triangular nonlinear equations through fixed-point iteration. The reformulation is pivotal: because each equation in the system involves only earlier points on the sampling trajectory, every timestep can be updated simultaneously within each fixed-point iteration, so parallel computation significantly reduces wall-clock sampling time without compromising the quality of the generated images.
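To make the reformulation concrete, the sketch below shows a plain parallel fixed-point (Picard-style) iteration over the whole sampling trajectory. The `step_fn` denoiser update, the indexing convention, and the stopping tolerance are illustrative assumptions rather than the paper's exact interface; because the system is triangular, the iteration is guaranteed to match the sequential sampler's output within `num_steps` iterations.

```python
import numpy as np

def parallel_fixed_point(step_fn, x_T, num_steps, max_iters=None, tol=1e-4):
    """Solve the triangular system x_{t-1} = step_fn(x_t, t) for every t in parallel.

    step_fn(x, t) stands for one sequential sampler update (e.g. a DDIM step);
    its signature here is a hypothetical stand-in for the real denoiser call.
    Returns the full trajectory [x_T, ..., x_0].
    """
    if max_iters is None:
        # The triangular structure guarantees exact convergence within num_steps iterations.
        max_iters = num_steps

    # Start every point on the trajectory from the initial noise.
    traj = [x_T.copy() for _ in range(num_steps + 1)]

    for _ in range(max_iters):
        # Every timestep is updated at once from the previous iterate, so the
        # num_steps denoiser evaluations can be issued as a single batched call.
        new_traj = [x_T] + [step_fn(traj[i], num_steps - i) for i in range(num_steps)]
        residual = max(np.linalg.norm(new - old) for new, old in zip(new_traj, traj))
        traj = new_traj
        if residual < tol:
            break
    return traj
```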

Triangular Anderson Acceleration (TAA)

A standout contribution of this paper is Triangular Anderson Acceleration (TAA). The technique adapts the classical Anderson Acceleration method to the triangular structure of the nonlinear equations underlying the sampling process. By using information from previous iterations more effectively, TAA substantially improves the efficiency of the fixed-point iteration, markedly reducing the number of iterations needed to converge to a high-quality image.
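For reference, the sketch below shows a plain (non-triangular) Anderson-acceleration update, the classical method that TAA builds on: the last few iterates and their fixed-point map outputs are combined with least-squares coefficients that sum to one. The paper's TAA additionally exploits the triangular structure of the equations; that refinement is not reproduced here, and the function below is an illustrative sketch only.

```python
import numpy as np

def anderson_update(x_hist, g_hist, ridge=1e-10):
    """Classical Anderson-acceleration step (illustrative, not the paper's TAA variant).

    x_hist: the last m iterates as flattened 1-D arrays.
    g_hist: the fixed-point map applied to each of those iterates, G(x_i).
    Returns the extrapolated next iterate.
    """
    # Residuals r_i = G(x_i) - x_i, stacked as columns.
    R = np.stack([g - x for x, g in zip(x_hist, g_hist)], axis=1)
    m = R.shape[1]
    # Minimize ||R @ alpha|| subject to sum(alpha) = 1. The constrained least-squares
    # solution is alpha proportional to (R^T R)^{-1} 1; a small ridge term keeps the
    # linear system well conditioned.
    A = R.T @ R + ridge * np.eye(m)
    alpha = np.linalg.solve(A, np.ones(m))
    alpha /= alpha.sum()
    # The new iterate mixes the map outputs with the computed weights.
    return sum(a * g for a, g in zip(alpha, g_hist))
```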

Practical Enhancements

The paper further explores two practical enhancements: early stopping and initialization from existing trajectories. Early stopping exploits the observation that high-quality images often emerge before the formal convergence criterion is met, allowing the iteration to terminate early and saving sampling steps. Initializing the iteration with a trajectory obtained from a similar input condition has been shown to yield even faster convergence, a valuable strategy for iterative image-generation workflows.
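A minimal way to combine both heuristics with the parallel iteration sketched earlier is shown below; the relative-change stopping rule on the final image and the `init_traj` warm start are assumptions chosen for illustration, not the paper's exact criteria.

```python
import numpy as np

def sample_with_warm_start(step_fn, x_T, num_steps, init_traj=None,
                           max_iters=20, img_tol=1e-3):
    """Parallel sampling with early stopping and warm-starting (illustrative sketch)."""
    # Warm start: reuse a trajectory generated for a similar input condition when
    # one is available; otherwise replicate the initial noise as in the plain solver.
    traj = list(init_traj) if init_traj is not None else [x_T.copy() for _ in range(num_steps + 1)]
    prev_img = traj[-1]
    for _ in range(max_iters):
        traj = [x_T] + [step_fn(traj[i], num_steps - i) for i in range(num_steps)]
        # Early stopping: monitor only the final image x_0, which typically
        # stabilizes well before the whole trajectory has converged.
        img = traj[-1]
        if np.linalg.norm(img - prev_img) <= img_tol * (np.linalg.norm(prev_img) + 1e-8):
            break
        prev_img = img
    return traj[-1]
```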

Empirical Validation

The efficacy of the proposed approach is empirically validated across multiple scenarios, including sampling with the DiT and Stable Diffusion models. The findings reveal that the ParaTAA algorithm, incorporating both TAA and the practical enhancements, substantially outperforms both the naive fixed-point iteration and its variant with an optimized order of nonlinear equations. This is quantitatively supported by metrics such as FID, Inception Score, and CLIP Score, along with qualitative examples illustrating the accelerated convergence towards high-quality images.

Implications and Future Directions

This research not only presents a significant advancement in the practical application of diffusion models but also opens up new avenues for exploration in parallel computing techniques within the field of generative AI. The foundational concept of accelerating sampling through parallelization holds potential for adaptation and further innovation, particularly in extending its application to other domains such as video generation and beyond.

In conclusion, the proposed parallel sampling strategy is a pivotal step towards overcoming the computational barriers associated with diffusion models, paving the way for more efficient and versatile generative AI applications. The substantial reduction in sampling time, coupled with the preservation of image quality, underscores the gains that parallel computation and algorithmic refinements can deliver in the ongoing evolution of generative modeling.