Lagging Inference Networks and Posterior Collapse in Variational Autoencoders

Junxian He, Daniel Spokoyny, Graham Neubig, Taylor Berg-Kirkpatrick

Abstract

The variational autoencoder (VAE) is a popular combination of deep latent variable model and accompanying variational learning technique. By using a neural inference network to approximate the model's posterior on latent variables, VAEs efficiently parameterize a lower bound on marginal data likelihood that can be optimized directly via gradient methods. In practice, however, VAE training often results in a degenerate local optimum known as "posterior collapse" where the model learns to ignore the latent variable and the approximate posterior mimics the prior. In this paper, we investigate posterior collapse from the perspective of training dynamics. We find that during the initial stages of training the inference network fails to approximate the model's true posterior, which is a moving target. As a result, the model is encouraged to ignore the latent encoding and posterior collapse occurs. Based on this observation, we propose an extremely simple modification to VAE training to reduce inference lag: depending on the model's current mutual information between latent variable and observation, we aggressively optimize the inference network before performing each model update. Despite introducing neither new model components nor significant complexity over basic VAE, our approach is able to avoid the problem of collapse that has plagued a large amount of previous work. Empirically, our approach outperforms strong autoregressive baselines on text and image benchmarks in terms of held-out likelihood, and is competitive with more complex techniques for avoiding collapse while being substantially faster.
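The aggressive schedule sketched in the abstract can be made concrete in a few lines. Below is a minimal, hypothetical PyTorch-style sketch, not the authors' actual code: the names `model.elbo(x)`, `enc_opt`, `dec_opt`, and `data_iter` are assumed interfaces, where `model.elbo(x)` returns the usual evidence lower bound E_q(z|x)[log p(x|z)] - KL(q(z|x) || p(z)) for a batch x, and the two optimizers cover the inference network (encoder) and generative model (decoder) parameters respectively. See the official repository jxhe/vae-lagging-encoder for the real implementation.

```python
import torch

# Minimal sketch of the "aggressive" VAE training schedule, assuming a
# hypothetical interface: model.elbo(x) returns the evidence lower bound
# for a batch x; enc_opt / dec_opt are optimizers built over the encoder
# (inference network) and decoder (generative model) parameters only.

def aggressive_step(model, enc_opt, dec_opt, data_iter, max_inner=100):
    """One outer iteration: fit the encoder aggressively, then take a
    single decoder step, so the approximate posterior tracks the moving
    true posterior instead of lagging behind it."""
    best_elbo = float("-inf")
    for _ in range(max_inner):        # inner loop: encoder-only updates
        x = next(data_iter)
        loss = -model.elbo(x)         # maximize the ELBO w.r.t. encoder params
        enc_opt.zero_grad()
        loss.backward()
        enc_opt.step()
        elbo = -loss.item()
        if elbo <= best_elbo:         # crude per-batch convergence check
            break
        best_elbo = elbo

    x = next(data_iter)               # single update of the generative model
    loss = -model.elbo(x)
    dec_opt.zero_grad()
    loss.backward()
    dec_opt.step()
```

In the paper, this aggressive phase runs only during the early stage of training: once the estimated mutual information between the observation and the latent variable stops increasing, the algorithm reverts to standard joint VAE updates of both networks, so the remaining training cost matches that of a basic VAE.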

Code Repositories

jxhe/vae-lagging-encoder (official PyTorch implementation; mentioned in GitHub)
spokoyny/spokoyny.github.io (mentioned in GitHub)

Benchmarks

Benchmark                             Methodology      Metrics
text-generation-on-yahoo-questions    Aggressive VAE   KL: 5.7, NLL: 326.7, Perplexity: 59.7
