QUICK REVIEW

[Paper Review] InfoVAE: Information Maximizing Variational Autoencoders

Shengjia Zhao, Jiaming Song|arXiv (Cornell University)|Jun 7, 2017

Generative Adversarial Networks and Image Synthesis|References: 37|Cited by: 371

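One-sentence summary

InfoVAE generalizes the VAE objective by adding a scaled KL term and a mutual information term, which improves both amortized inference and the use of the latent representation; MMD-based divergences perform strongly in the experiments.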

ABSTRACT

A key advance in learning generative models is the use of amortized inference distributions that are jointly trained with the models. We find that existing training objectives for variational autoencoders can lead to inaccurate amortized inference distributions and, in some cases, improving the objective provably degrades the inference quality. In addition, it has been observed that variational autoencoders tend to ignore the latent variables when combined with a decoding distribution that is too flexible. We again identify the cause in existing training criteria and propose a new class of objectives (InfoVAE) that mitigate these problems. We show that our model can significantly improve the quality of the variational posterior and can make effective use of the latent features regardless of the flexibility of the decoding distribution. Through extensive qualitative and quantitative analyses, we demonstrate that our models outperform competing approaches on multiple performance metrics.

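Motivation and objectives

Diagnose how the standard ELBO fails in VAE learning and inference.
Propose a generalized objective that explicitly trades off data reconstruction, latent regularization, and information usage.
Provide practical implementations and guidance for balancing X-space and Z-space losses across model families.
Show that the proposed InfoVAE framework improves amortized inference and latent representation usage across datasets and decoders.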

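Proposed method

Introduce the InfoVAE objective, which adds a scaling factor λ to D_KL(q(z)||p(z)).
Add a mutual information term I_q(x;z), weighted by α, to encourage informative latent representations.
Rewrite the objective in an equivalent, optimization-friendly form consisting of a reconstruction term, a weighted KL(q(z|x)||p(z)), and a weighted KL(q(z)||p(z)).
Allow D_KL(q(z)||p(z)) to be replaced by any strict divergence D(q(z)||p(z)) (for example MMD, Stein, or adversarial) while preserving optimality under certain conditions.
Show connections to beta-VAE and adversarial autoencoders (AAE), which arise as special cases.
Evaluate the adversarial, Stein, and MMD divergences, and report that MMD-regularized InfoVAE tends to perform best on most metrics.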
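
For reference, the objective and its rewritten form described in the list above can be sketched as follows. The subscripts φ and θ (encoder and decoder parameters) and the symbol p_D (data distribution) are notation assumed here and do not appear in this summary; the equivalence holds up to an additive constant that does not depend on the parameters.

```latex
\begin{align*}
\mathcal{L}_{\text{InfoVAE}}
  &= -\lambda\, D_{\mathrm{KL}}\big(q_\phi(z)\,\|\,p(z)\big)
     - \mathbb{E}_{q_\phi(z)}\Big[D_{\mathrm{KL}}\big(q_\phi(x\mid z)\,\|\,p_\theta(x\mid z)\big)\Big]
     + \alpha\, I_q(x;z) \\
  &\equiv \mathbb{E}_{p_{\mathcal{D}}(x)}\,\mathbb{E}_{q_\phi(z\mid x)}\big[\log p_\theta(x\mid z)\big]
     - (1-\alpha)\,\mathbb{E}_{p_{\mathcal{D}}(x)}\Big[D_{\mathrm{KL}}\big(q_\phi(z\mid x)\,\|\,p(z)\big)\Big]
     - (\alpha+\lambda-1)\, D_{\mathrm{KL}}\big(q_\phi(z)\,\|\,p(z)\big)
\end{align*}
```

The second line follows from the identity I_q(x;z) = E[D_KL(q(z|x)||p(z))] - D_KL(q(z)||p(z)). Setting α = 0 and λ = 1 recovers the standard ELBO, and choosing α + λ - 1 = 0 leaves a single KL(q(z|x)||p(z)) term with weight λ, which has the form of the beta-VAE objective.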

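Experimental results

Research questions

RQ1: Can InfoVAE mitigate the amortized inference failures observed with the standard ELBO?
RQ2: Do explicit control of the information flow I_q(x;z) and balanced X/Z losses improve latent representation usage and generalization?
RQ3: In practice, which divergence family (MMD, Stein, adversarial) is best suited to the InfoVAE objective?
RQ4: How do InfoVAE variants compare with ELBO-based VAE, beta-VAE, and AAE on reconstruction, likelihood, and semi-supervised tasks?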

Key findings

Model	Log-likelihood estimate
ELBO	82.75
MMD-VAE	80.76
Stein-VAE	81.47
Adversarial VAE	82.21
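
To make the comparison in the table concrete, the short script below computes each model's gap to the ELBO baseline. It assumes the reported numbers are negative log-likelihoods, so that a smaller value is better; that reading is an assumption and is not stated in the table itself. The variable names are illustrative only.

```python
# Values copied from the table above.
# ASSUMPTION: they are negative log-likelihoods, so lower is better.
nll = {
    "ELBO": 82.75,
    "MMD-VAE": 80.76,
    "Stein-VAE": 81.47,
    "Adversarial VAE": 82.21,
}

# Improvement of each model over the ELBO baseline.
gap_to_elbo = {name: round(nll["ELBO"] - value, 2) for name, value in nll.items()}

# Model with the smallest value, which is the best one under the NLL assumption.
best_model = min(nll, key=nll.get)

for name, gap in gap_to_elbo.items():
    print(f"{name}: {gap:+.2f} relative to ELBO")
print("Best under the NLL assumption:", best_model)
```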

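Optimizing the ELBO can lead to inaccurate amortized inference and overfitting; InfoVAE mitigates this by balancing the X and Z losses and encouraging use of the latent code.
InfoVAE with MMD regularization (large λ and α close to 1, with α = 1 in some settings) achieves better or comparable log-likelihood and sample quality.
InfoVAE keeps the latent representation meaningful even when the decoder is highly flexible, avoiding the information preference problem.
Empirical results on MNIST indicate that InfoVAE with MMD offers stable training, a good posterior approximation, and strong semi-supervised performance, whereas the ELBO tends to overestimate the variance of q(z).
Table 1 reports log-likelihood estimates: ELBO 82.75, MMD-VAE 80.76, Stein-VAE 81.47, Adversarial VAE 82.21. The values appear to be negative log-likelihoods, so lower is better and MMD-VAE scores best.
Across log-likelihood, sample quality, and semi-supervised performance, InfoVAE variants generally outperform competing methods.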
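
Since the MMD-regularized variant is the one these findings highlight, a minimal NumPy sketch of an MMD penalty between samples of q(z) and the prior p(z) may help. This is not the authors' implementation: the Gaussian RBF kernel, the `bandwidth` parameter, the biased estimator, the function names, and the default values `alpha=1.0` and `lam=1000.0` are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    """Kernel matrix k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives caused by rounding
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd_squared(q_samples, p_samples, bandwidth=1.0):
    """Biased estimate of squared MMD between two sample sets of shape (n, d)."""
    k_qq = rbf_kernel(q_samples, q_samples, bandwidth)
    k_pp = rbf_kernel(p_samples, p_samples, bandwidth)
    k_qp = rbf_kernel(q_samples, p_samples, bandwidth)
    return k_qq.mean() + k_pp.mean() - 2.0 * k_qp.mean()

def infovae_loss(recon_nll, kl_posterior_prior, divergence_qz_pz,
                 alpha=1.0, lam=1000.0):
    """Loss to minimize: reconstruction NLL plus the two weighted latent terms.

    The defaults are illustrative assumptions, not values confirmed by this summary.
    """
    return (recon_nll
            + (1.0 - alpha) * kl_posterior_prior
            + (alpha + lam - 1.0) * divergence_qz_pz)

# Toy check: latent codes that match the prior versus codes that are shifted away.
rng = np.random.default_rng(0)
prior = rng.standard_normal((500, 2))
matched = rng.standard_normal((500, 2))
shifted = rng.standard_normal((500, 2)) + 3.0

mmd_matched = mmd_squared(matched, prior)
mmd_shifted = mmd_squared(shifted, prior)
print("MMD^2, codes matching the prior:", mmd_matched)
print("MMD^2, codes shifted by 3:", mmd_shifted)
```

With α = 1 the per-sample KL(q(z|x)||p(z)) term drops out and only the reconstruction term and the sample-based divergence remain, which is why this kind of penalty needs nothing more than samples from the encoder and from the prior.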

This review was generated by AI and reviewed by human editors.