
Residual Flows for Invertible Generative Modeling (1906.02735v6)

Published 6 Jun 2019 in stat.ML and cs.LG

Abstract: Flow-based generative models parameterize probability distributions through an invertible transformation and can be trained by maximum likelihood. Invertible residual networks provide a flexible family of transformations where only Lipschitz conditions rather than strict architectural constraints are needed for enforcing invertibility. However, prior work trained invertible residual networks for density estimation by relying on biased log-density estimates whose bias increased with the network's expressiveness. We give a tractable unbiased estimate of the log density using a "Russian roulette" estimator, and reduce the memory required during training by using an alternative infinite series for the gradient. Furthermore, we improve invertible residual blocks by proposing the use of activation functions that avoid derivative saturation and generalizing the Lipschitz condition to induced mixed norms. The resulting approach, called Residual Flows, achieves state-of-the-art performance on density estimation amongst flow-based models, and outperforms networks that use coupling blocks at joint generative and discriminative modeling.

Authors (4)
  1. Ricky T. Q. Chen (53 papers)
  2. Jens Behrmann (14 papers)
  3. David Duvenaud (65 papers)
  4. Jörn-Henrik Jacobsen (24 papers)
Citations (357)

Summary

An Expert Review of "Residual Flows for Invertible Generative Modeling"

In the field of probabilistic generative models, flow-based models have been notably impactful because they parameterize probability distributions through invertible transformations with tractable likelihoods. The paper "Residual Flows for Invertible Generative Modeling" presents a significant advancement in this domain by developing a new flow-based model, the Residual Flow. The approach builds on invertible residual networks (i-ResNets) and addresses a key limitation of prior work: log-density estimates that were biased because the underlying power series was truncated at a fixed depth.

Key Contributions and Methods

The core contribution of this work is an unbiased "Russian roulette" estimator for the log density. The change of variables for a residual block expresses the log determinant of the Jacobian as an infinite power series; rather than truncating this series at a fixed depth, as in prior i-ResNet training, the estimator truncates at a random cutoff and reweights each term by the probability of its inclusion, yielding an unbiased estimate with finite compute. This makes the training objective an unbiased estimate of the exact maximum likelihood objective, and the same idea extends to a memory-efficient estimate of the gradient.
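The random-truncation idea can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name is hypothetical, it computes exact traces of powers of a small dense Jacobian, whereas the paper combines the series with Hutchinson's stochastic trace estimator on network Jacobians, and it uses a geometric truncation distribution purely for concreteness.

```python
import numpy as np

def russian_roulette_logdet(J, p=0.5, rng=None):
    """Single-sample unbiased estimate of log det(I + J) from the power
    series sum_{k>=1} (-1)^{k+1} tr(J^k) / k, valid when ||J|| < 1.
    The series is truncated at a random N ~ Geometric(p) and each term
    is reweighted by 1 / P(N >= k), which makes the estimate unbiased."""
    rng = np.random.default_rng() if rng is None else rng
    n = rng.geometric(p)                        # random cutoff, n >= 1
    est, Jk = 0.0, np.eye(J.shape[0])
    for k in range(1, n + 1):
        Jk = Jk @ J                             # J^k
        term = (-1) ** (k + 1) * np.trace(Jk) / k
        est += term / (1.0 - p) ** (k - 1)      # P(N >= k) = (1-p)^(k-1)
    return est
```

Averaging many such single-sample estimates converges to the exact log determinant, which is what "unbiased" buys over a fixed truncation.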

Furthermore, the paper refines the invertible residual blocks themselves: it proposes activation functions (such as LipSwish) that avoid derivative saturation, and it generalizes the Lipschitz condition from the spectral norm to induced mixed norms. Relaxing the constraint in this way broadens the flexibility and expressiveness of the transformations within the residual blocks.
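Two ingredients here are easy to sketch: the LipSwish activation from the paper, and the power-iteration estimate of a weight matrix's spectral norm that is commonly used to keep each residual branch contractive. The code below is an illustrative numpy sketch under those assumptions, not the authors' implementation.

```python
import numpy as np

def lipswish(x, beta=1.0):
    """LipSwish from the paper: Swish(x) = x * sigmoid(beta * x), divided
    by 1.1 so the activation is 1-Lipschitz (Swish's maximum slope is
    roughly 1.1) while its derivative does not saturate to a constant."""
    return x / (1.0 + np.exp(-beta * x)) / 1.1

def spectral_norm(W, n_iters=50):
    """Largest singular value of W via power iteration; rescaling W by
    this value (times a target constant < 1) keeps the residual branch
    g contractive, which is what guarantees x + g(x) is invertible."""
    u = np.random.default_rng(0).normal(size=W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    return float(u @ W @ v)
```

The saturation issue motivating LipSwish: for an activation like ELU, pushing the block toward its Lipschitz bound drives second derivatives to zero, starving the gradient of the log-determinant term; LipSwish keeps curvature away from zero while still respecting the unit-Lipschitz constraint.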

Experimental Evaluation and Results

The paper demonstrates the competitive performance of Residual Flows against state-of-the-art flow-based models on high-dimensional data through extensive experiments. On benchmarks such as CIFAR-10 and CelebA-HQ, Residual Flows achieve superior or comparable results in bits per dimension (BPD), highlighting the efficacy of unbiased estimation in density modeling. The experimental setup pairs the residual-block architecture with memory-efficient backpropagation, enabling scalable training without sacrificing performance.
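For reference, bits-per-dimension figures convert a model's negative log-likelihood (in nats) by the data dimensionality and a change of logarithm base. A small helper makes the convention explicit (the function name is illustrative):

```python
import math

def bits_per_dim(nll_nats, num_dims):
    """Convert a negative log-likelihood measured in nats into bits per
    dimension (BPD), the standard metric on image density benchmarks:
    BPD = NLL / (D * ln 2), where D is the number of data dimensions
    (e.g. D = 3 * 32 * 32 = 3072 for CIFAR-10)."""
    return nll_nats / (num_dims * math.log(2.0))
```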

Of particular note is the improvement in sample quality, as measured by the Fréchet Inception Distance (FID): Residual Flows deliver results better than or on par with architectures relying on coupling layers or autoregressive models. This suggests a refined capacity to learn complex data distributions while maintaining the model's invertibility constraints.

Implications and Future Directions

Residual Flows represent a significant enhancement in the field of flow-based generative modeling, combining the strengths of invertible architectures with unbiased computational techniques. The introduction of Lipschitz-constrained activation functions suggests a potentially promising direction for further augmenting neural network designs that prioritize both stability and expressiveness.

This work opens avenues for future research to explore the integration of Residual Flows within hybrid modeling scenarios, especially in semi-supervised learning and tasks necessitating joint generative and discriminative capabilities. Additionally, the notion of using generalized induced norms offers a rich framework for designing architectures tailored to specific data characteristics, adapting beyond the context of spectral norms.

In summary, by addressing the limitations associated with prior biased estimation methods and incorporating memory-efficient computational strategies, this paper positions Residual Flows at the forefront of flow-based generative modeling advancements. It paves the way for further explorations into scalable, unbiased, and expressive generative networks capable of learning and modeling intricate data distributions across diverse applications in both generative and discriminative domains.