QUICK REVIEW

[Paper Review] Isolating Sources of Disentanglement in Variational Autoencoders

Ricky T. Q. Chen, Xuechen Li | arXiv (Cornell University) | Feb 14, 2018
Generative Adversarial Networks and Image Synthesis | 46 references | 145 citations
TL;DR

The paper decomposes the ELBO to isolate a total-correlation term, introduces beta-TCVAE as a plug-in improvement over beta-VAE with no extra hyperparameters, and proposes MIG, a classifier-free disentanglement metric. It empirically links total correlation to disentanglement across datasets.

ABSTRACT

We decompose the evidence lower bound to show the existence of a term measuring the total correlation between latent variables. We use this to motivate our $β$-TCVAE (Total Correlation Variational Autoencoder), a refinement of the state-of-the-art $β$-VAE objective for learning disentangled representations, requiring no additional hyperparameters during training. We further propose a principled classifier-free measure of disentanglement called the mutual information gap (MIG). We perform extensive quantitative and qualitative experiments, in both restricted and non-restricted settings, and show a strong relation between total correlation and disentanglement, when the latent variables model is trained using our framework.

Motivation & Objective

  • Motivate and quantify disentanglement in VAEs by decomposing the ELBO to identify the total correlation term.
  • Propose a training method that weights decomposition terms without introducing new hyperparameters.
  • Introduce beta-TCVAE as a plug-in replacement for beta-VAE with automatic disentanglement benefits.
  • Propose a classifier-free, information-theoretic metric (MIG) to evaluate disentanglement across latent distributions.
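
The decomposition named in the first objective can be written out as a worked equation. The notation is an assumption of this sketch rather than something stated in the review: $n$ indexes training points with $p(n)$ uniform, $q(z \mid n)$ is the encoder, $q(z) = \sum_n q(z \mid n)\,p(n)$ is the aggregate posterior, and $p(z) = \prod_j p(z_j)$ is a factorized prior.

```latex
% KL term of the ELBO, averaged over the data, split into three parts
\mathbb{E}_{p(n)}\!\left[\mathrm{KL}\big(q(z \mid n)\,\|\,p(z)\big)\right]
  = \underbrace{\mathrm{KL}\big(q(z, n)\,\|\,q(z)\,p(n)\big)}_{\text{index-code MI}}
  + \underbrace{\mathrm{KL}\Big(q(z)\,\Big\|\,\prod_j q(z_j)\Big)}_{\text{total correlation}}
  + \underbrace{\sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big)}_{\text{dimension-wise KL}}

% Weighting each part separately gives the general objective
\mathcal{L} = \mathbb{E}_{q(z \mid n)\,p(n)}\big[\log p(n \mid z)\big]
  - \alpha\, I_q(z; n)
  - \beta\, \mathrm{KL}\Big(q(z)\,\Big\|\,\prod_j q(z_j)\Big)
  - \gamma \sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big)
```

With $\alpha = \beta = \gamma$ the three penalties recombine into the single weighted KL term of beta-VAE, which is why a larger beta in beta-VAE also penalizes the index-code mutual information along with the total correlation.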

Proposed method

  • Derive an ELBO decomposition revealing index-code MI, total correlation, and dimension-wise KL terms.
  • Propose minibatch-weighted sampling to estimate decomposition terms without extra hyperparameters.
  • Define beta-TCVAE as the special case of the weighted decomposition with alpha = gamma = 1, so that beta alone controls the TC penalty.
  • Provide an alternative training approach to estimate TC without a discriminator.
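
The minibatch-weighted sampling step above can be sketched in a few lines. This is a minimal illustration, assuming a diagonal-Gaussian encoder and using only the Python standard library for readability; the function names (`log_normal`, `logsumexp`, `total_correlation_mws`) are hypothetical, and a real training loop would compute the same quantities with a tensor library so they can be differentiated. The estimate of $\log q(z)$ for each sample is a log-sum-exp over the minibatch minus $\log(NM)$, where $N$ is the dataset size and $M$ the minibatch size; this kind of estimator is biased, so treat the numbers as a training signal rather than an exact TC value.

```python
import math

def log_normal(z, mu, logvar):
    """Log density of a 1-D Gaussian N(mu, exp(logvar)) evaluated at z."""
    return -0.5 * (math.log(2.0 * math.pi) + logvar + (z - mu) ** 2 / math.exp(logvar))

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def total_correlation_mws(z, mu, logvar, dataset_size):
    """Estimate KL(q(z) || prod_k q(z_k)) from one minibatch.

    z, mu, logvar: lists of M rows with D floats each. Row i holds a sample
    z_i drawn from q(z | n_i) and the encoder's Gaussian parameters for n_i.
    dataset_size: N, the total number of training points.
    """
    M, D = len(z), len(z[0])
    log_nm = math.log(dataset_size * M)
    total = 0.0
    for i in range(M):
        # cond[j][k] = log q(z_i[k] | n_j) for every minibatch element j
        cond = [[log_normal(z[i][k], mu[j][k], logvar[j][k]) for k in range(D)]
                for j in range(M)]
        # Joint: log q(z_i) ~ logsumexp_j sum_k cond[j][k] - log(N * M)
        log_qz = logsumexp([sum(row) for row in cond]) - log_nm
        # Product of marginals: each log q(z_i[k]) estimated the same way
        log_prod_qzk = sum(
            logsumexp([cond[j][k] for j in range(M)]) - log_nm for k in range(D)
        )
        total += log_qz - log_prod_qzk
    return total / M
```

In a beta-TCVAE style loss, this estimate would be multiplied by beta and added to the other penalty terms; no discriminator network is needed because every quantity comes from the encoder outputs of the current minibatch.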

Experimental results

Research questions

  • RQ1: Does penalizing the total correlation term in the ELBO promote disentanglement in VAEs?
  • RQ2: Can beta-TCVAE achieve better disentanglement than beta-VAE without adding training complexity?
  • RQ3: Is there a robust, classifier-free metric to quantify disentanglement across latent distributions?
  • RQ4: How does total correlation correlate with disentanglement across datasets and sampling biases?

Key findings

  • beta-TCVAE yields more interpretable, better-disentangled representations than beta-VAE on several datasets.
  • Total correlation correlates negatively with disentanglement under beta-TCVAE, supporting the TC penalty's role.
  • MIG provides a classifier-free, axis-aligned, generalizable disentanglement measure applicable to various latent distributions.
  • The proposed minibatch weighting allows training with TC weighting without additional hyperparameters.
  • FactorVAE, which has a similar objective but estimates TC with the density-ratio trick, can be outperformed when its discriminator is hard to train, highlighting the robustness of beta-TCVAE.
  • beta-TCVAE remains effective even under non-uniform or dependent factor sampling, improving interpretability over baselines.
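
To make the MIG finding concrete, here is a minimal sketch of the metric: for each ground-truth factor, take the gap between the largest and second-largest mutual information with any single latent dimension, normalize by the factor's entropy, and average over factors. It assumes the latent codes have already been discretized into bins and uses plain empirical counts, which is a simplification chosen for this illustration and not necessarily how the paper estimates mutual information; the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(xs):
    """Empirical entropy (in nats) of a sequence of hashable values."""
    n = len(xs)
    return -sum(c / n * math.log(c / n) for c in Counter(xs).values())

def mutual_information(xs, ys):
    """Empirical I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def mig(latents, factors):
    """Mutual information gap.

    latents: list of J sequences (J >= 2), one per latent dimension,
             holding discretized codes (assumed pre-binned).
    factors: list of K sequences, one per ground-truth factor, each with
             non-zero entropy.
    """
    gaps = []
    for v in factors:
        # Mutual information of this factor with every latent, largest first
        mi = sorted((mutual_information(z, v) for z in latents), reverse=True)
        # Gap between the top two latents, normalized to lie in [0, 1]
        gaps.append((mi[0] - mi[1]) / entropy(v))
    return sum(gaps) / len(gaps)
```

A score near 1 means each factor is captured by exactly one latent dimension, while a score near 0 means the information is either absent or spread over several dimensions, which is the axis-aligned behaviour the metric is meant to reward.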

This review was created by AI and reviewed by human editors.