Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs (2209.12590v2)

Published 26 Sep 2022 in cs.LG

Abstract: In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning. However, training sequence VAEs is challenging: autoregressive decoders can often explain the data without utilizing the latent space, a failure mode known as posterior collapse. To mitigate this, state-of-the-art models weaken the powerful decoder by applying uniformly random dropout to the decoder input. We show theoretically that this removes pointwise mutual information provided by the decoder input, which is then compensated for by utilizing the latent space. We then propose an adversarial training strategy to achieve information-based stochastic dropout. Compared to uniform dropout on standard text benchmark datasets, our targeted approach improves both sequence modeling performance and the amount of information captured in the latent space.
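
The uniform-dropout baseline that the paper improves on is easy to picture in code. Below is a minimal sketch, assuming a PyTorch-style LSTM decoder; the class and parameter names (`SequenceVAEDecoder`, `word_dropout`, `unk_idx`) are illustrative assumptions, not taken from the paper's implementation. During training, each ground-truth input token is independently replaced by an `<unk>` placeholder with some fixed probability, so the decoder must recover the missing information from the latent code z, which is what counteracts posterior collapse.

```python
# Sketch of uniform word dropout on a sequence VAE decoder's input.
# All names are illustrative, not from the paper's code.
import torch
import torch.nn as nn

class SequenceVAEDecoder(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, latent_dim,
                 unk_idx, word_dropout=0.4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.latent_to_hidden = nn.Linear(latent_dim, hidden_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)
        self.unk_idx = unk_idx            # token id used to mask dropped inputs
        self.word_dropout = word_dropout  # probability of dropping each token

    def forward(self, tokens, z):
        # tokens: (batch, seq_len) ground-truth inputs (teacher forcing)
        # z:      (batch, latent_dim) sample from the approximate posterior
        if self.training and self.word_dropout > 0:
            # Uniform dropout: each input token is independently replaced by
            # <unk> with probability p, removing the pointwise mutual
            # information it would otherwise supply to the decoder.
            mask = torch.rand_like(tokens, dtype=torch.float) < self.word_dropout
            tokens = tokens.masked_fill(mask, self.unk_idx)
        # Initialize the RNN state from the latent code so z can carry
        # the information the masked inputs no longer provide.
        h0 = torch.tanh(self.latent_to_hidden(z)).unsqueeze(0)
        c0 = torch.zeros_like(h0)
        output, _ = self.rnn(self.embed(tokens), (h0, c0))
        return self.out(output)  # per-step logits over the vocabulary
```

The paper's contribution replaces this uniform Bernoulli mask with an adversarially trained, information-based dropout distribution; the sketch above shows only the baseline it is compared against.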

Authors (7)
  1. Đorđe Miladinović (6 papers)
  2. Kumar Shridhar (25 papers)
  3. Kushal Jain (6 papers)
  4. Max B. Paulus (9 papers)
  5. Joachim M. Buhmann (47 papers)
  6. Mrinmaya Sachan (124 papers)
  7. Carl Allen (16 papers)
Citations (5)