
Denoising Diffusion Variational Inference: Diffusion Models as Expressive Variational Posteriors (2401.02739v4)

Published 5 Jan 2024 in cs.LG, q-bio.QM, and stat.ML

Abstract: We propose denoising diffusion variational inference (DDVI), a black-box variational inference algorithm for latent variable models which relies on diffusion models as flexible approximate posteriors. Specifically, our method introduces an expressive class of diffusion-based variational posteriors that perform iterative refinement in latent space; we train these posteriors with a novel regularized evidence lower bound (ELBO) on the marginal likelihood inspired by the wake-sleep algorithm. Our method is easy to implement (it fits a regularized extension of the ELBO), is compatible with black-box variational inference, and outperforms alternative classes of approximate posteriors based on normalizing flows or adversarial networks. We find that DDVI improves inference and learning in deep latent variable models across common benchmarks as well as on a motivating task in biology -- inferring latent ancestry from human genomes -- where it outperforms strong baselines on the Thousand Genomes dataset.

Summary

  • The paper introduces DDVI, which integrates a diffusion process into variational inference to create more expressive posterior distributions.
  • It employs a wake-sleep framework with a novel regularization approach, resulting in improved performance in clustering and semi-supervised learning tasks.
  • Empirical results on human genetic data highlight its potential to advance latent variable modeling in generative tasks and complex decision-making.

Introduction

Latent variable models (LVMs) represent complex data in a lower-dimensional latent space, enabling tasks such as dimensionality reduction, data visualization, and unsupervised learning. Variational inference (VI) is a prominent technique for approximating posterior distributions in these models, and its performance depends substantially on the expressivity of the approximate posterior. This paper introduces denoising diffusion variational inference (DDVI), a method that increases the expressivity of variational posteriors by using diffusion models, yielding a new class of algorithms that includes the denoising diffusion VAE (DD-VAE).

Variational Inference with Denoising Diffusion Models

DDVI incorporates a diffusion model within the variational posterior, transforming a simple latent representation into a complex one through iterative refinement in latent space. Inspired by the wake-sleep algorithm, DDVI combines a wake phase with a reconstruction loss and a sleep phase with a novel form of regularization that encourages the posterior to match a user-specified noising process (a sketch of this two-phase objective follows below). The empirical results suggest that the DD-VAE, an instantiation of this framework, can outperform alternative methods, particularly in tasks where capturing semantically meaningful structure is crucial.
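
To make the two-phase objective concrete, here is a minimal PyTorch-style sketch of one way the wake and sleep losses could be computed. The `decoder`, `denoiser`, and `alpha_bar` schedule are hypothetical placeholders, and the DDIM-style reverse step and the specific form of the sleep regularizer are illustrative assumptions rather than the paper's exact formulation.

```python
# Hedged sketch of a DDVI-style training step (not the authors' reference code).
import torch
import torch.nn.functional as F

def ddim_step(z_t, z0_hat, alpha_bar, t):
    """Deterministic (DDIM-style) reverse step from noise level t to t-1."""
    eps = (z_t - alpha_bar[t].sqrt() * z0_hat) / (1 - alpha_bar[t]).sqrt()
    a_prev = alpha_bar[t - 1] if t > 0 else torch.tensor(1.0)
    return a_prev.sqrt() * z0_hat + (1 - a_prev).sqrt() * eps

def ddvi_losses(x, decoder, denoiser, alpha_bar, latent_dim, num_steps):
    B = x.shape[0]

    # Wake phase: refine Gaussian noise into a posterior sample z ~ q(z|x)
    # by running the x-conditioned reverse chain, then reconstruct x.
    z = torch.randn(B, latent_dim)
    for t in reversed(range(num_steps)):
        t_batch = torch.full((B,), t)
        z0_hat = denoiser(z, t_batch, x)   # predicts the clean latent
        z = ddim_step(z, z0_hat, alpha_bar, t)
    wake_loss = F.mse_loss(decoder(z), x)  # stands in for -log p(x|z)

    # Sleep phase: regularize the posterior toward a user-specified noising
    # process. Draw z0 from the prior, corrupt it forward, and train the
    # denoiser to recover z0 -- the wake-sleep-inspired regularizer.
    z0 = torch.randn(B, latent_dim)        # prior sample (standard normal here)
    t = torch.randint(0, num_steps, (B,))
    noise = torch.randn_like(z0)
    a = alpha_bar[t].unsqueeze(-1)         # cumulative noise schedule
    z_t = a.sqrt() * z0 + (1 - a).sqrt() * noise
    sleep_loss = F.mse_loss(denoiser(z_t, t, x), z0)

    return wake_loss, sleep_loss
```

In practice the two losses would be weighted and summed into the regularized ELBO-style objective and backpropagated jointly through the decoder and the latent denoiser.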

Semi-Supervised Learning and Clustering Applications

In semi-supervised settings, DDVI extends the latent variable model to accommodate labels observed for part of the data, so the model can be fitted on a mixture of labeled and unlabeled samples. For clustering, DDVI offers two options: retain the model's original prior and introduce an additional cluster latent variable, or partition the prior into a mixture whose components correspond to clusters (see the sketch after this paragraph). The method's application to semi-supervised inference of human ancestry from genomes demonstrates its ability to capture semantically rich structure in the data.
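
For the mixture-prior variant, the following is a minimal sketch (assuming PyTorch) of how a clustered prior could be constructed and sampled; the component centers, equal weights, and unit scales are illustrative assumptions, not the paper's configuration.

```python
# Hedged sketch: a latent prior partitioned into K Gaussian components,
# one per cluster, whose samples could feed the sleep-phase regularizer.
import torch
import torch.distributions as D

def make_cluster_prior(num_clusters: int, latent_dim: int, spread: float = 3.0):
    """Equal-weight Gaussian mixture whose components act as cluster regions."""
    means = spread * torch.randn(num_clusters, latent_dim)  # fixed component centers
    mix = D.Categorical(torch.ones(num_clusters))           # uniform cluster weights
    comp = D.Independent(D.Normal(means, torch.ones_like(means)), 1)
    return D.MixtureSameFamily(mix, comp)

prior = make_cluster_prior(num_clusters=10, latent_dim=2)
z0 = prior.sample((64,))  # prior draws, e.g. for the sleep-phase regularizer
```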

Discussion and Conclusion

DDVI and DD-VAE are significant steps toward more expressive variational posteriors, offering clear advantages over more rigid families. Clustering and semi-supervised learning results on human genetic data underline the potential of diffusion-based encoders in latent variable models. The regularized learning objective and the ability to direct the posterior toward intricate distributions hold promise for a variety of applications. Nonetheless, because the objective is a regularized extension of the standard evidence lower bound, it requires careful attention to the choice of regularizer and the additional hyperparameters it introduces.

Advances in latent variable modeling such as DDVI continue to pave the way for applying generative models to complex decision-making and estimation tasks that rest on solid inference foundations. While this paper focuses on dimensionality reduction and visualization, future work may extend DD-VAE to other domains, supported by regularized and expressive variational inference techniques.