QUICK REVIEW

[Paper Review] InfoVAE: Information Maximizing Variational Autoencoders

Shengjia Zhao, Jiaming Song|arXiv (Cornell University)|Jun 7, 2017

Generative Adversarial Networks and Image Synthesis | References: 37 | Citations: 371

One-sentence summary

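InfoVAE generalizes the VAE objective by adding a scaled KL term and a mutual information term, which enables better amortized inference and better use of the latent variables. An MMD-based divergence gives strong empirical performance.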

ABSTRACT

A key advance in learning generative models is the use of amortized inference distributions that are jointly trained with the models. We find that existing training objectives for variational autoencoders can lead to inaccurate amortized inference distributions and, in some cases, improving the objective provably degrades the inference quality. In addition, it has been observed that variational autoencoders tend to ignore the latent variables when combined with a decoding distribution that is too flexible. We again identify the cause in existing training criteria and propose a new class of objectives (InfoVAE) that mitigate these problems. We show that our model can significantly improve the quality of the variational posterior and can make effective use of the latent features regardless of the flexibility of the decoding distribution. Through extensive qualitative and quantitative analyses, we demonstrate that our models outperform competing approaches on multiple performance metrics.

Motivation and objectives

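Motivate and diagnose the failures of the standard ELBO in VAE training and inference.
Propose a generalized objective that makes the trade-off between data reconstruction, latent regularization, and information usage explicit.
Provide practical instantiations and guidance for balancing the X-space and Z-space losses across model families.
Show that the proposed InfoVAE framework improves amortized inference and latent usage across datasets and decoders.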

Proposed method

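Introduce the InfoVAE objective, which puts a scaling factor λ on D_KL(q(z)||p(z)).
Add a mutual information term I_q(x;z), weighted by α, to encourage meaningful latent representations.
Rewrite the objective in an equivalent form that is amenable to optimization, consisting of a reconstruction term, a weighted KL(q(z|x)||p(z)), and a weighted KL(q(z)||p(z)).
Allow D_KL(q(z)||p(z)) to be replaced by any strict divergence D(q(z)||p(z)) (e.g., MMD, Stein, adversarial) while preserving optimality under certain conditions.
Show the relationship to beta-VAE and Adversarial Autoencoders (AAE), which arise as special cases.
Evaluate the divergences (adversarial, Stein, MMD) and report that MMD-regularized InfoVAE often performs best across metrics.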
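
For reference, the objective summarized in the list above can be written out as follows. This is a reconstruction in the notation of the figure captions (q_φ, p_θ), with p_D(x) introduced here as an assumed symbol for the data distribution; the second line holds up to an additive constant that does not depend on the model parameters.

```latex
\begin{align}
\mathcal{L}_{\text{InfoVAE}}
  &= -\lambda\, D_{\mathrm{KL}}\big(q_\phi(z)\,\|\,p(z)\big)
     - \mathbb{E}_{q_\phi(z)}\Big[ D_{\mathrm{KL}}\big(q_\phi(x|z)\,\|\,p_\theta(x|z)\big) \Big]
     + \alpha\, I_q(x;z) \\
  &\equiv \mathbb{E}_{p_{\mathcal{D}}(x)}\,\mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big]
     - (1-\alpha)\,\mathbb{E}_{p_{\mathcal{D}}(x)}\Big[ D_{\mathrm{KL}}\big(q_\phi(z|x)\,\|\,p(z)\big) \Big]
     - (\alpha+\lambda-1)\, D_{\mathrm{KL}}\big(q_\phi(z)\,\|\,p(z)\big)
\end{align}
```

The second line follows from the identity I_q(x;z) = E_x[D_KL(q(z|x)||p(z))] - D_KL(q(z)||p(z)). Read this way, α = 0 and λ = 1 recovers the ELBO, α + λ - 1 = 0 gives a beta-VAE with β = 1 - α, and α = 1 removes the per-example KL term so that only the reconstruction term and the divergence on q(z) remain, which is the setting where that divergence can be swapped for MMD.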

Figure 1 : Verification of Proposition 1 where the dataset only contains two examples $\{-1,1\}$ . Top: density of the distributions $q_{\phi}(z|x)$ when $x=1$ (red) and $x=-1$ (green) compared with the true prior $p(z)$ (purple). Bottom: The “reconstruction” $p_{\theta}(x|z)$ when $z$ is sampled fr

Experimental results

Research questions

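RQ1: Can InfoVAE mitigate the amortized inference failures observed with the standard ELBO?
RQ2: Do explicit control of information flow (I_q(x;z)) and balancing of the X/Z losses improve latent usage and generalization?
RQ3: Which divergence family (MMD, Stein, adversarial) best supports the InfoVAE objective in practice?
RQ4: How do InfoVAE variants perform on reconstruction, likelihood, and semi-supervised tasks compared with ELBO-based VAEs, beta-VAE, and AAE?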

Key findings

Model	Log likelihood estimate
ELBO	82.75
MMD-VAE	80.76
Stein-VAE	81.47
Adversarial VAE	82.21

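Optimizing the ELBO can lead to inaccurate amortized inference and overfitting; InfoVAE mitigates this by balancing the X and Z losses and encouraging latent usage.
InfoVAE with MMD regularization (large λ, α ≈ 1, and in some settings α = 1) achieves better or comparable log likelihood and sample quality across metrics.
InfoVAE maintains meaningful latent representations even with highly flexible decoders, avoiding the information preference problem.
Empirical results on MNIST show that InfoVAE with MMD provides stable training, a good posterior approximation, and strong semi-supervised performance; the ELBO tends to over-estimate the variance of q(z).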
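Table 1 reports log likelihood estimates: ELBO 82.75, MMD-VAE 80.76, Stein-VAE 81.47, Adversarial VAE 82.21. These values appear to be negative log likelihoods, so lower is better on this metric and MMD-VAE scores best.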
InfoVAE variants generally outperform competing methods on multiple metrics, including log likelihood, sample quality, and semi-supervised performance.
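
The MMD regularizer referred to in the findings above can be estimated directly from samples of q(z) and p(z). The following is a minimal NumPy sketch, assuming a Gaussian (RBF) kernel with a fixed bandwidth and the simple biased estimator; the function names, the bandwidth, and the sample sizes are illustrative choices and are not taken from the paper.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Pairwise RBF kernel k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd_squared(z_q, z_p, bandwidth=1.0):
    """Biased estimate of MMD^2 between two sample sets of shape (n, d)."""
    k_qq = gaussian_kernel(z_q, z_q, bandwidth).mean()
    k_pp = gaussian_kernel(z_p, z_p, bandwidth).mean()
    k_qp = gaussian_kernel(z_q, z_p, bandwidth).mean()
    return k_qq + k_pp - 2.0 * k_qp

rng = np.random.default_rng(0)
prior_samples = rng.standard_normal((500, 2))   # stand-in for z ~ p(z)
matched = rng.standard_normal((500, 2))         # encoder codes that match the prior
shifted = rng.standard_normal((500, 2)) + 2.0   # encoder codes that miss the prior

mmd_matched = mmd_squared(matched, prior_samples)
mmd_shifted = mmd_squared(shifted, prior_samples)
print(f"matched: {mmd_matched:.4f}  shifted: {mmd_shifted:.4f}")
```

In an MMD-regularized model, a term of this kind would be computed on a minibatch of encoder outputs and prior samples, multiplied by λ, and added to the reconstruction loss. The estimate is close to zero when the aggregate codes match the prior and grows as they drift away from it.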

Figure 2 : $\log\det(\mathrm{Cov}[q_{\phi}(z)])$ for ELBO vs. MMD-VAE under different training set sizes. The correct prior $p(z)$ has value $0$ on this metric, and values above or below $0$ correspond to over-estimation and under-estimation of the variance respectively. ELBO (blue curve) shows cons
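
The metric in the Figure 2 caption can be computed from a batch of latent codes. The sketch below assumes the codes are available as an (n, d) NumPy array and that p(z) is a standard normal; `log_det_cov` is a hypothetical helper name, and the synthetic samples only stand in for encoder outputs.

```python
import numpy as np

def log_det_cov(z_samples):
    """Log-determinant of the sample covariance of latent codes, shape (n, d)."""
    cov = np.cov(z_samples, rowvar=False)
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("sample covariance is not positive definite")
    return logdet

rng = np.random.default_rng(0)
unit = rng.standard_normal((5000, 2))    # codes distributed like the prior
score_unit = log_det_cov(unit)           # close to 0
score_wide = log_det_cov(2.0 * unit)     # variance too large, score above 0
score_narrow = log_det_cov(0.5 * unit)   # variance too small, score below 0
print(score_unit, score_wide, score_narrow)
```

Using `slogdet` rather than taking the log of `det` avoids underflow when the latent dimension is large.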

This review was generated by AI and checked by a human editor.