Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse (1911.02469v1)

Published 6 Nov 2019 in cs.LG and stat.ML

Abstract: Posterior collapse in Variational Autoencoders (VAEs) arises when the variational posterior distribution closely matches the prior for a subset of latent variables. This paper presents a simple and intuitive explanation for posterior collapse through the analysis of linear VAEs and their direct correspondence with Probabilistic PCA (pPCA). We explain how posterior collapse may occur in pPCA due to local maxima in the log marginal likelihood. Unexpectedly, we prove that the ELBO objective for the linear VAE does not introduce additional spurious local maxima relative to log marginal likelihood. We show further that training a linear VAE with exact variational inference recovers an identifiable global maximum corresponding to the principal component directions. Empirically, we find that our linear analysis is predictive even for high-capacity, non-linear VAEs and helps explain the relationship between the observation noise, local maxima, and posterior collapse in deep Gaussian VAEs.

Summary

The paper challenges the common belief that the ELBO objective directly causes posterior collapse in VAEs, demonstrating through linear models that the issue originates from the underlying marginal likelihood optimization itself.
Analytical results from linear VAEs show they find global optima corresponding to pPCA directions, confirming the ELBO does not introduce spurious local maxima responsible for collapse.
Findings from linear VAEs predict behavior in deep non-linear VAEs, suggesting solutions to posterior collapse should focus on model architecture and observation noise rather than just ELBO modifications.

An Analytical Perspective on Posterior Collapse in Variational Autoencoders

The paper "Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse" by Lucas et al. addresses a critical issue in the optimization of variational autoencoders (VAEs) known as posterior collapse. This phenomenon generally occurs when the variational posterior distribution closely aligns with the prior for a subset of latent variables, thus reducing the model's effective capacity. Previous work has predominantly attributed this issue to the Kullback-Leibler (KL) divergence component in the Evidence Lower Bound (ELBO) objective. The authors challenge this prevalent view by presenting a comprehensive analysis of linear VAEs and their relationship to Probabilistic Principal Component Analysis (pPCA).
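A common way to detect the phenomenon described above is to measure, per latent dimension, the KL divergence between the variational posterior and the standard normal prior. The sketch below is illustrative only and assumes a diagonal Gaussian encoder; the function names, the threshold of 0.01, and the synthetic encoder outputs are hypothetical and do not come from the paper.

```python
import numpy as np

def kl_per_dimension(mu, log_var):
    """Batch-averaged KL( N(mu, diag(exp(log_var))) || N(0, I) ) for each latent dimension."""
    kl = 0.5 * (np.exp(log_var) + mu**2 - 1.0 - log_var)
    return kl.mean(axis=0)

def collapsed_dims(mu, log_var, threshold=0.01):
    """Indices of latent dimensions whose average KL falls below the (assumed) threshold."""
    return np.flatnonzero(kl_per_dimension(mu, log_var) < threshold)

# Hypothetical encoder outputs for a batch of 256 inputs and 4 latent dimensions.
rng = np.random.default_rng(0)
mu = rng.normal(size=(256, 4))
log_var = np.full((256, 4), -2.0)

# Force dimension 2 to match the prior exactly: zero mean, unit variance.
mu[:, 2] = 0.0
log_var[:, 2] = 0.0

print(collapsed_dims(mu, log_var))  # prints [2]
```

A dimension whose posterior equals the prior for every input carries no information about the data, which is why it no longer contributes to the model's effective capacity.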

Contributions and Methodology

The key contributions of the paper are:

Analytical Insights: The authors provide a straightforward explanation of posterior collapse using linear VAEs, demonstrating that the ELBO does not inherently introduce spurious local maxima beyond those present in the log marginal likelihood of pPCA.
Global Optimum Correspondence: The authors prove that training a linear VAE with exact variational inference recovers an identifiable global maximum corresponding to the principal component directions. The result holds when the ELBO, rather than the log marginal likelihood itself, is the optimization target.
Predictive Empirical Analysis: The theoretical findings from linear VAEs are shown to hold predictive value even for high-capacity, non-linear VAEs, indicating that the linear analysis provides valuable insights into observation noise, local maxima, and posterior collapse in deep Gaussian VAEs.
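The link between the linear VAE and pPCA can be checked numerically. The sketch below assumes the standard pPCA model, x | z ~ N(Wz + mu, sigma^2 I) with z ~ N(0, I), and a linear encoder q(z | x) = N(V(x - mu), D) with diagonal D. All function and variable names are illustrative. When the columns of W are orthogonal, the true posterior has diagonal covariance, so the encoder can match it exactly and the ELBO equals the log marginal likelihood.

```python
import numpy as np

def log_marginal(x, W, mu, sigma2):
    """log N(x; mu, W W^T + sigma2 * I), the pPCA log marginal likelihood of one point."""
    d = x.shape[0]
    C = W @ W.T + sigma2 * np.eye(d)
    diff = x - mu
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + diff @ np.linalg.solve(C, diff))

def elbo(x, W, mu, sigma2, V, D_diag):
    """ELBO of one point under the linear encoder q(z|x) = N(V (x - mu), diag(D_diag))."""
    d, k = W.shape
    m = V @ (x - mu)
    resid = x - mu - W @ m
    # E_q ||x - mu - W z||^2 = ||resid||^2 + tr(W D W^T)
    expected_sq = resid @ resid + np.sum((W**2) * D_diag)
    recon = -0.5 * d * np.log(2 * np.pi * sigma2) - expected_sq / (2 * sigma2)
    kl = 0.5 * (D_diag.sum() + m @ m - k - np.log(D_diag).sum())
    return recon - kl

rng = np.random.default_rng(0)
d, k, sigma2 = 5, 2, 0.3
Q, _ = np.linalg.qr(rng.normal(size=(d, k)))
W = Q * np.array([2.0, 0.5])          # decoder with orthogonal columns
mu = rng.normal(size=d)
x = rng.normal(size=d)

M_diag = np.sum(W**2, axis=0) + sigma2  # diagonal of M = W^T W + sigma2 * I
V_opt = (W / M_diag).T                  # M^{-1} W^T
D_opt = sigma2 / M_diag                 # sigma2 * M^{-1}

tight = elbo(x, W, mu, sigma2, V_opt, D_opt)
loose = elbo(x, W, mu, sigma2, V_opt + 0.1, D_opt)
exact = log_marginal(x, W, mu, sigma2)
print(np.isclose(tight, exact), loose < exact)
```

Perturbing the encoder away from the true posterior only lowers the bound, since the gap between the two quantities is the KL divergence from q(z | x) to the true posterior.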

Theoretical Framework

The work focuses on the fact that stationary points of the log marginal likelihood in pPCA can inherently cause posterior collapse, independent of the ELBO. The ELBO's role, traditionally perceived as the causative element for collapse due to its involvement with the KL divergence, is critically reevaluated. The authors establish that in the linear case, the ELBO does not exacerbate the problem beyond the inherent issues present in marginal likelihood optimization.
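For reference, the classical pPCA results that this argument builds on can be written as follows. The notation is an assumption made here for illustration: x has d dimensions, z has q dimensions, the sample covariance has eigenvalues \lambda_1 \ge \dots \ge \lambda_d with leading eigenvectors in U_q, \Lambda_q is the diagonal matrix of the top q eigenvalues, and R is an arbitrary rotation.

```latex
\begin{aligned}
p(z) &= \mathcal{N}(0, I), \qquad p(x \mid z) = \mathcal{N}(Wz + \mu,\ \sigma^2 I), \\
W_{\mathrm{MLE}} &= U_q \left(\Lambda_q - \sigma^2 I\right)^{1/2} R, \qquad
\sigma^2_{\mathrm{MLE}} = \frac{1}{d - q} \sum_{j=q+1}^{d} \lambda_j, \\
p(z \mid x) &= \mathcal{N}\!\left(M^{-1} W^\top (x - \mu),\ \sigma^2 M^{-1}\right), \qquad
M = W^\top W + \sigma^2 I.
\end{aligned}
```

If column j of W is zero and R = I, then M_{jj} = \sigma^2, so the posterior over z_j has mean 0 and variance \sigma^2 / \sigma^2 = 1 for every x. That is exactly the prior, which is posterior collapse arising from the marginal likelihood alone, with no variational approximation involved.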

Numerical Results and Implications

The paper's empirical evaluation covers two settings:

Linear VAE Experiments: The authors demonstrate that linear VAEs can recover the solutions of pPCA, emphasizing that the ELBO does not introduce additional local maxima. This provides a tractable benchmark that simplifies the theoretical study of posterior collapse.
Deep Gaussian VAEs: Experiments with non-linear VAEs reveal that the insights obtained from the linear case extend to more complex models. The findings show that the observation noise, when appropriately tuned or learned, strongly influences how much posterior collapse occurs.
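The effect of observation noise in the linear case can be illustrated with a small simulation. This is a sketch rather than a reproduction of the paper's experiments. It assumes that, for a fixed noise variance sigma^2, the optimal decoder takes the clipped closed form W = U_q max(Lambda_q - sigma^2 I, 0)^{1/2} with R = I, so that any direction whose eigenvalue falls below sigma^2 gets a zero column. The synthetic data and all names are hypothetical.

```python
import numpy as np

def ppca_weights_fixed_noise(X, q, sigma2):
    """Closed-form pPCA decoder for a fixed noise variance (clipped form, R = I assumed)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:q]            # indices of the top-q eigenvalues
    lam, U = eigvals[order], eigvecs[:, order]
    scale = np.sqrt(np.clip(lam - sigma2, 0.0, None))
    return U * scale

def posterior_kl_per_dim(W, X, sigma2):
    """Batch-averaged KL from the exact posterior to the prior, per latent dimension.

    Valid when the columns of W are orthogonal, so the posterior covariance is diagonal.
    """
    Xc = X - X.mean(axis=0)
    M_diag = np.sum(W**2, axis=0) + sigma2
    means = (Xc @ W) / M_diag
    var = sigma2 / M_diag
    kl = 0.5 * (var + means**2 - 1.0 - np.log(var))
    return kl.mean(axis=0)

# Synthetic data whose per-axis variances are roughly 9, 4, 1, 0.25, 0.1, 0.1.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6)) * np.sqrt([9.0, 4.0, 1.0, 0.25, 0.1, 0.1])

q = 4
sigmas = [0.1, 0.5, 2.0, 5.0]
active_counts = []
for sigma2 in sigmas:
    W = ppca_weights_fixed_noise(X, q, sigma2)
    kl = posterior_kl_per_dim(W, X, sigma2)
    collapsed = int(np.sum(kl < 1e-8))
    active_counts.append(q - collapsed)
    print(f"sigma2={sigma2}: active latent dims = {q - collapsed}")
```

As the fixed noise variance grows past each eigenvalue of the data covariance, one more latent dimension collapses to the prior, even though the exact posterior is being used throughout.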

Practical and Theoretical Implications

The work implies that solutions to posterior collapse should focus on addressing issues underlying the log marginal likelihood itself rather than solely attempting to modify the ELBO objective. This perspective shifts the emphasis towards investigating model architecture and observation noise to mitigate posterior collapse effectively.
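One concrete way to see why observation noise matters is the pPCA maximum likelihood estimate of the noise variance, which is the mean of the discarded eigenvalues. The sketch below uses hypothetical eigenvalues and an illustrative function name. It shows that the learned variance never exceeds the retained eigenvalues, so in the linear model none of the q retained dimensions is forced to collapse.

```python
import numpy as np

def sigma2_mle(eigvals, q):
    """pPCA maximum likelihood noise variance: mean of the d - q smallest eigenvalues."""
    lam = np.sort(eigvals)[::-1]
    return lam[q:].mean()

# Hypothetical eigenvalues of a data covariance matrix.
eigvals = np.array([9.0, 4.0, 1.0, 0.25, 0.1, 0.1])
q = 3

s2 = sigma2_mle(eigvals, q)                   # (0.25 + 0.1 + 0.1) / 3
active = np.sort(eigvals)[::-1][:q] > s2      # retained directions with nonzero weight
print(s2, active)
```

This holds for the linear model only. Whether learning the noise variance helps a deep VAE to the same degree is an empirical question that the paper examines separately.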

Future Directions

The results motivate further work on how these findings can improve VAE architectures and training procedures. Future research could explore model architectures that learn or control the observation variance directly and maintain an informative latent space without relying on ad hoc fixes.

In conclusion, the paper provides a nuanced and mathematically grounded view of a critical issue in VAE optimization. By dissecting the role of the ELBO and demonstrating the foundational influence of the marginal likelihood, it paves the way for more refined approaches to designing VAEs that are resilient to posterior collapse.