DeepPCR: Parallelizing Sequential Operations in Neural Networks (2309.16318v2)

Published 28 Sep 2023 in cs.LG

Abstract: Parallelization techniques have become ubiquitous for accelerating inference and training of deep neural networks. Despite this, several operations are still performed in a sequential manner. For instance, the forward and backward passes are executed layer-by-layer, and the output of diffusion models is produced by applying a sequence of denoising steps. This sequential approach results in a computational cost proportional to the number of steps involved, presenting a potential bottleneck as the number of steps increases. In this work, we introduce DeepPCR, a novel algorithm which parallelizes typically sequential operations in order to speed up inference and training of neural networks. DeepPCR is based on interpreting a sequence of $L$ steps as the solution of a specific system of equations, which we recover using the Parallel Cyclic Reduction algorithm. This reduces the complexity of computing the sequential operations from $\mathcal{O}(L)$ to $\mathcal{O}(\log_2L)$, thus yielding a speedup for large $L$. To verify the theoretical lower complexity of the algorithm, and to identify regimes for speedup, we test the effectiveness of DeepPCR in parallelizing the forward and backward pass in multi-layer perceptrons, and reach speedups of up to $30\times$ for the forward and $200\times$ for the backward pass. We additionally showcase the flexibility of DeepPCR by parallelizing training of ResNets with as many as 1024 layers, and generation in diffusion models, enabling up to $7\times$ faster training and $11\times$ faster generation, respectively, when compared to the sequential approach.

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Tweets

https://twitter.com/PentelEnergel/status/1834671501380829209

https://twitter.com/1647765276962816000/status/1733793071286935818

https://twitter.com/1267897288732614657/status/1736850424928297228

https://twitter.com/1373386751910281218/status/1738352906137440732

DeepPCR: Parallelizing Sequential Operations in Neural Networks (2309.16318v2)

Summary

Related Papers

Tweets