
Learning Multi-layer Latent Variable Model via Variational Optimization of Short Run MCMC for Approximate Inference (1912.01909v5)

Published 4 Dec 2019 in stat.ML and cs.LG

Abstract: This paper studies the fundamental problem of learning deep generative models that consist of multiple layers of latent variables organized in top-down architectures. Such models have high expressivity and allow for learning hierarchical representations. Learning such a generative model requires inferring the latent variables for each training example based on the posterior distribution of these latent variables. The inference typically requires Markov chain Monte Carlo (MCMC), which can be time-consuming. In this paper, we propose to use noise-initialized non-persistent short-run MCMC, such as finite-step Langevin dynamics initialized from the prior distribution of the latent variables, as an approximate inference engine, where the step size of the Langevin dynamics is variationally optimized by minimizing the Kullback-Leibler divergence between the distribution produced by the short-run MCMC and the posterior distribution. Our experiments show that the proposed method outperforms the variational auto-encoder (VAE) in terms of reconstruction error and synthesis quality. The advantage of the proposed method is that it is simple and automatic, without the need to design an inference model.

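To make the inference procedure concrete, below is a minimal PyTorch sketch of noise-initialized, non-persistent short-run Langevin dynamics targeting the posterior of a latent variable generator model. The `generator`, its `latent_dim` attribute, and the fixed `sigma` and `step_size` values are illustrative assumptions rather than the authors' code; in the paper, the step size is itself variationally optimized by minimizing the KL divergence between the short-run distribution and the true posterior.

```python
import torch

def short_run_langevin(x, generator, sigma=0.3, n_steps=20, step_size=0.4):
    """Noise-initialized short-run Langevin dynamics for approximate
    posterior inference in a model x = generator(z) + Gaussian noise.

    Hypothetical names: `generator` maps a latent z to a reconstruction
    of x and exposes a `latent_dim` attribute; `sigma` is an assumed
    observation noise level; `step_size` stands in for the variationally
    optimized Langevin step size described in the paper.
    """
    # Non-persistent chain: initialize from the prior p(z) = N(0, I)
    # afresh for every example, so no chain state is carried over.
    z = torch.randn(x.size(0), generator.latent_dim, device=x.device)
    for _ in range(n_steps):
        z = z.detach().requires_grad_(True)
        x_hat = generator(z)
        # log p(x, z) up to an additive constant:
        # Gaussian likelihood term plus standard normal prior term.
        log_joint = (-0.5 / sigma**2 * ((x_hat - x) ** 2).flatten(1).sum(1)
                     - 0.5 * (z ** 2).flatten(1).sum(1))
        grad = torch.autograd.grad(log_joint.sum(), z)[0]
        # Langevin update: gradient ascent step plus injected Gaussian noise.
        z = z + 0.5 * step_size**2 * grad + step_size * torch.randn_like(z)
    return z.detach()
```

Because each chain restarts from the prior and runs only a finite number of steps, the resulting sampler defines an approximate posterior without any separately designed inference network, which is the simplicity the abstract emphasizes over the VAE.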
Authors (6)
  1. Erik Nijkamp (22 papers)
  2. Bo Pang (77 papers)
  3. Tian Han (37 papers)
  4. Linqi Zhou (20 papers)
  5. Song-Chun Zhu (216 papers)
  6. Ying Nian Wu (138 papers)
Citations (2)
