QUICK REVIEW

[Paper Review] Inference Suboptimality in Variational Autoencoders

Chris Cremer, Xuechen Li | arXiv (Cornell University) | Jan 10, 2018

Generative Adversarial Networks and Image Synthesis | References: 29 | Citations: 85

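One-Sentence Summary

This paper decomposes inference suboptimality in VAEs into an approximation gap and an amortization gap. It shows that the amortization gap often dominates, that more expressive approximations reduce amortization error and help generalization, and that the generative model adapts to the chosen approximation.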

ABSTRACT

Amortized inference allows latent-variable models trained via variational learning to scale to large datasets. The quality of approximate inference is determined by two factors: a) the capacity of the variational distribution to match the true posterior and b) the ability of the recognition network to produce good variational parameters for each datapoint. We examine approximate inference in variational autoencoders in terms of these factors. We find that divergence from the true posterior is often due to imperfect recognition networks, rather than the limited complexity of the approximating distribution. We show that this is due partly to the generator learning to accommodate the choice of approximation. Furthermore, we show that the parameters used to increase the expressiveness of the approximation play a role in generalizing inference rather than simply improving the complexity of the approximation.
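The two factors named in the abstract map onto the two gaps the review discusses. As a sketch, writing \mathcal{L}[q] for the ELBO, \mathcal{Q} for the variational family, and q^{*} for the best member of that family for a given datapoint (this notation is assumed here for illustration), the decomposition can be written as:

```latex
\begin{aligned}
\mathcal{L}[q] &= \mathbb{E}_{q(z\mid x)}\big[\log p(x,z) - \log q(z\mid x)\big], \qquad
q^{*} = \arg\max_{q \in \mathcal{Q}} \mathcal{L}[q], \\
\underbrace{\log p(x) - \mathcal{L}[q]}_{\text{inference gap}}
&= \underbrace{\log p(x) - \mathcal{L}[q^{*}]}_{\text{approximation gap}}
 + \underbrace{\mathcal{L}[q^{*}] - \mathcal{L}[q]}_{\text{amortization gap}} \\
&= \mathrm{KL}\big(q^{*}(z\mid x)\,\|\,p(z\mid x)\big)
 + \Big[\mathrm{KL}\big(q(z\mid x)\,\|\,p(z\mid x)\big) - \mathrm{KL}\big(q^{*}(z\mid x)\,\|\,p(z\mid x)\big)\Big].
\end{aligned}
```

The first term reflects factor (a), the capacity of the family to match the true posterior. The second reflects factor (b), how far the recognition network's output falls short of the best member of that family.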

Motivation and Objectives

Investigate the sources of mismatch between true and approximate posteriors in VAEs, focusing on approximation vs. amortization gaps.
Quantify how encoder choice, posterior expressiveness, decoder capacity, and optimization influence inference suboptimality.
Demonstrate that the learned generator adapts to the chosen inference approximation.
Evaluate how expressive approximations (flows, auxiliary variables) affect inference and generalization on standard datasets.

Proposed Method

Define and decompose the inference gap into approximation and amortization components for VAEs.
Experiment with expressive approximate posteriors including normalizing flows and auxiliary variables (q_Flow, q_AF).
Train and evaluate models on MNIST, Fashion-MNIST, and CIFAR-10, using ELBO, IWAE, and AIS bounds to estimate log p(x) and gaps.
Locally optimize q for each datapoint to obtain q*, and compare it with the amortized q.
Use entropy/temperature annealing to study effects on posterior utilization and inference gaps.
Assess the impact of encoder/decoder capacity and optimization choices on the gaps.
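The gap computation described above can be illustrated on a toy model where log p(x) is known exactly. The following is a minimal sketch, not the paper's experimental setup: it assumes a scalar linear-Gaussian model (z ~ N(0, 1), x | z ~ N(w z, sigma^2)), a Gaussian q whose variance is fixed (so the family cannot match the true posterior), and a deliberately poor linear "encoder". All function names and parameter values are hypothetical. In the paper, log p(x) is intractable and is estimated with IWAE and AIS bounds instead.

```python
import math

def log_marginal(x, w, sigma):
    """Exact log p(x) for z ~ N(0, 1), x | z ~ N(w z, sigma^2)."""
    var = w**2 + sigma**2
    return -0.5 * math.log(2 * math.pi * var) - x**2 / (2 * var)

def elbo(x, m, s, w, sigma):
    """Closed-form ELBO for q(z) = N(m, s^2) under the toy model."""
    exp_loglik = (-0.5 * math.log(2 * math.pi * sigma**2)
                  - ((x - w * m)**2 + (w * s)**2) / (2 * sigma**2))
    exp_logprior = -0.5 * math.log(2 * math.pi) - (m**2 + s**2) / 2
    entropy = 0.5 * math.log(2 * math.pi * math.e * s**2)
    return exp_loglik + exp_logprior + entropy

def optimize_mean_locally(x, m_init, w, sigma, lr=0.05, steps=200):
    """Per-datapoint gradient ascent on the mean of q (variance stays fixed)."""
    m = m_init
    for _ in range(steps):
        grad = w * (x - w * m) / sigma**2 - m  # d ELBO / d m
        m = m + lr * grad
    return m

def inference_gaps(x, w=2.0, sigma=0.5, s_fixed=0.5, encoder_slope=0.3):
    """Return the approximation, amortization, and total inference gaps."""
    # Amortized q: the hypothetical encoder maps x to a mean with a fixed slope.
    m_amortized = encoder_slope * x
    # q*: best member of the fixed-variance family for this datapoint.
    m_star = optimize_mean_locally(x, m_amortized, w, sigma)

    log_px = log_marginal(x, w, sigma)
    elbo_q = elbo(x, m_amortized, s_fixed, w, sigma)
    elbo_q_star = elbo(x, m_star, s_fixed, w, sigma)
    return {
        "approximation": log_px - elbo_q_star,
        "amortization": elbo_q_star - elbo_q,
        "inference": log_px - elbo_q,
    }

gaps = inference_gaps(x=1.0)
for name, value in gaps.items():
    print(f"{name} gap: {value:.4f}")
```

Here the approximation gap comes entirely from the fixed variance, and the amortization gap comes entirely from the encoder's suboptimal slope. Changing `s_fixed` or `encoder_slope` moves one gap without touching the other, which is what makes the decomposition useful as a diagnostic.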

Experimental Results

Research Questions

RQ1: What factors cause the gap between the marginal log-likelihood and the ELBO in VAEs?
RQ2: How do the approximation gap and the amortization gap contribute to overall inference suboptimality across datasets?
RQ3: Does increasing approximate posterior expressiveness reduce the amortization gap or mainly the approximation gap?
RQ4: How do encoder capacity and variational expressiveness influence generalization to held-out data?

Key Findings

The amortization gap is often the larger component of the total inference gap across datasets, in some cases exceeding the approximation gap.
The generator can adapt to the chosen approximation, reducing the approximation gap as expressiveness increases.
Expressive approximations (flows) reduce the amortization gap as well as the approximation gap, because their additional parameters effectively increase encoder capacity.
Larger encoder capacity reduces the amortization error, but can risk encoder overfitting and reduced generalization.
Entropy annealing during training helps the model utilize the flexibility of expressive posterior approximations.
Increasing decoder capacity reduces the approximation gap, suggesting generator flexibility can ease inference for less expressive encoders.

This review was generated by AI and checked by a human editor.