Likelihood-Free Variational Autoencoders (2504.17622v2)

Published 24 Apr 2025 in stat.ML and cs.LG

Abstract: Variational Autoencoders (VAEs) typically rely on a probabilistic decoder with a predefined likelihood, most commonly an isotropic Gaussian, to model the data conditional on latent variables. While convenient for optimization, this choice often leads to likelihood misspecification, resulting in blurry reconstructions and poor data fidelity, especially for high-dimensional data such as images. In this work, we propose EnVAE, a novel likelihood-free generative framework that has a deterministic decoder and employs the energy score--a proper scoring rule--to build the reconstruction loss. This enables likelihood-free inference without requiring explicit parametric density functions. To address the computational inefficiency of the energy score, we introduce a fast variant, FEnVAE, based on the local smoothness of the decoder and the sharpness of the posterior distribution of latent variables. This yields an efficient single-sample training objective that integrates seamlessly into existing VAE pipelines with minimal overhead. Empirical results on standard benchmarks demonstrate that EnVAE achieves superior reconstruction and generation quality compared to likelihood-based baselines. Our framework offers a general, scalable, and statistically principled alternative for flexible and nonparametric distribution learning in generative modeling.

Summary

The paper "Likelihood-Free Variational Autoencoders" by Chen Xu, Qiang Wang, and Lijun Sun replaces the likelihood-based probabilistic decoder of a Variational Autoencoder (VAE) with a deterministic one. The proposed EnVAE framework uses the energy score, a proper scoring rule, as the reconstruction loss in place of a parametric log-likelihood. This targets a known weakness of VAEs on high-dimensional data such as images: a misspecified likelihood, most commonly an isotropic Gaussian, tends to produce blurry, over-smoothed reconstructions.
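
For reference, the energy score is commonly defined in the scoring-rule literature as shown in the first line below. The symbols ($P$, $x$, $\beta$, $g_\theta$, $q_\phi$) are notation chosen here and may differ from the paper's. The second line is an assumed reading of how the score would act as a reconstruction loss, based only on the description in this summary.

```latex
\[
\mathrm{ES}_\beta(P, x)
  = \mathbb{E}_{X \sim P}\,\lVert X - x \rVert^{\beta}
  - \tfrac{1}{2}\,\mathbb{E}_{X, X' \sim P}\,\lVert X - X' \rVert^{\beta},
  \qquad \beta \in (0, 2),
\]
% Assumed reading for a VAE: P is the law of g_\theta(Z) with Z \sim q_\phi(z \mid x)
\[
\mathcal{L}_{\mathrm{rec}}(x)
  = \mathrm{ES}_\beta\bigl(P_{\theta,\phi}(\cdot \mid x),\, x\bigr),
  \qquad X = g_\theta(Z),\; Z \sim q_\phi(z \mid x).
\]
```

In this orientation lower values are better, and $X$, $X'$ are independent draws from $P$. The first term rewards samples that land close to the observation; the second rewards spread among the samples, which keeps the minimizer from collapsing to a single point estimate.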

Key Contributions

EnVAE Framework: The paper introduces EnVAE, a VAE whose deterministic decoder is trained with an energy-score reconstruction loss. Because no likelihood has to be specified, the model is not tied to a predefined density such as the isotropic Gaussian commonly assumed in VAE implementations. Using a proper scoring rule for the reconstruction loss avoids the pixel-wise independence and unimodality assumptions built into Gaussian decoders.
FEnVAE Variant: Estimating the energy score is computationally expensive, so the authors propose FEnVAE, a fast variant that exploits the local smoothness of the decoder and the sharpness of the posterior distribution over latent variables. The result is a single-sample training objective that avoids drawing multiple samples per input and, according to the abstract, integrates into existing VAE pipelines with minimal overhead.
Empirical Validation: The authors evaluate the models on standard benchmarks and report that EnVAE and FEnVAE outperform traditional likelihood-based VAE models in reconstruction fidelity, generation quality, and uncertainty quantification. These gains come without adversarial training, and so without the complexity and instability often seen in GAN-based frameworks.
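
The multi-sample reconstruction loss described in the contributions above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `energy_score` is the standard Monte Carlo estimator of the energy score, while `toy_decoder`, `reconstruction_loss`, and the diagonal-Gaussian posterior parameters `mu` and `sigma` are hypothetical names introduced here. The single-sample FEnVAE objective is not reproduced, because this summary does not state its formula.

```python
import math
import random


def energy_score(samples, x, beta=1.0):
    """Monte Carlo estimate of the energy score of `samples` against `x`.

    Lower is better. `samples` is a list of m >= 2 vectors drawn from the
    model, `x` is the observed vector, and `beta` should lie in (0, 2).
    """
    m = len(samples)
    if m < 2:
        raise ValueError("need at least two samples for the pairwise term")
    # Fit term: average distance from each sample to the observation.
    fit = sum(math.dist(s, x) ** beta for s in samples) / m
    # Spread term: average distance over ordered pairs of distinct samples.
    spread = sum(
        math.dist(samples[i], samples[j]) ** beta
        for i in range(m)
        for j in range(m)
        if i != j
    ) / (m * (m - 1))
    return fit - 0.5 * spread


def toy_decoder(z):
    """Hypothetical deterministic decoder mapping a 2-d latent to 2-d data."""
    return [math.tanh(z[0] + z[1]), math.tanh(z[0] - z[1])]


def reconstruction_loss(x, mu, sigma, m=8, rng=None):
    """Energy-score reconstruction loss for one data point (assumed setup).

    Draws m latents from a diagonal Gaussian posterior N(mu, sigma^2),
    pushes each through the deterministic decoder, and scores the decoded
    samples against x.
    """
    rng = rng or random.Random(0)
    decoded = []
    for _ in range(m):
        z = [mu_k + s_k * rng.gauss(0.0, 1.0) for mu_k, s_k in zip(mu, sigma)]
        decoded.append(toy_decoder(z))
    return energy_score(decoded, x)
```

The pairwise term costs O(m^2) distance evaluations per data point, which illustrates the kind of overhead a single-sample objective would avoid.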

Implications and Future Directions

The likelihood-free approach proposed by the authors introduces a flexible alternative for generative modeling, which can be especially valuable in tasks requiring high-quality image synthesis and complex data distributions. By eliminating explicit likelihood requirements, the framework opens up new possibilities for training VAEs in domains where traditional pixel-wise reconstruction losses are inadequate.

The development of EnVAE and FEnVAE suggests several promising directions for future research. The integration of proper scoring rule-based objectives into generative models may be expanded to other architectures beyond VAEs, including flow-based models and hybrid systems. Moreover, exploring further computational optimizations or alternative proper scoring rules could enhance the scalability and applicability of likelihood-free generative modeling to large-scale datasets.

In summary, the paper combines a deterministic decoder with a proper scoring rule to obtain a VAE training objective that requires no explicit likelihood. The approach is relevant both in practice, as a drop-in change to existing VAE pipelines, and in theory, as a statistically principled route to nonparametric distribution learning.
