QUICK REVIEW

[Paper Review] Importance Weighted Autoencoders

Yuri Burda|arXiv (Cornell University)|Sep 1, 2015

Generative Adversarial Networks and Image Synthesis | Cited by 249

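One-Sentence Summary

This paper proposes the importance weighted autoencoder (IWAE), a generative model that improves on the variational autoencoder (VAE) by using importance weighting to tighten the log-likelihood lower bound. By drawing multiple latent codes from the approximate posterior and weighting each by its importance weight p(x, h)/q(h|x), the IWAE learns richer, more expressive latent representations and significantly improves test log-likelihood with the same architecture.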

ABSTRACT

The variational autoencoder (VAE; Kingma, Welling (2014)) is a recently proposed generative model pairing a top-down generative network with a bottom-up recognition network which approximates posterior inference. It typically makes strong assumptions about posterior inference, for instance that the posterior distribution is approximately factorial, and that its parameters can be approximated with nonlinear regression from the observations. As we show empirically, the VAE objective can lead to overly simplified representations which fail to use the network's entire modeling capacity. We present the importance weighted autoencoder (IWAE), a generative model with the same architecture as the VAE, but which uses a strictly tighter log-likelihood lower bound derived from importance weighting. In the IWAE, the recognition network uses multiple samples to approximate the posterior, giving it increased flexibility to model complex posteriors which do not fit the VAE modeling assumptions. We show empirically that IWAEs learn richer latent space representations than VAEs, leading to improved test log-likelihood on density estimation benchmarks.

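Motivation and Objectives

Address the limitations of VAEs in learning rich, high-capacity latent representations, which stem from overly restrictive posterior assumptions.
Increase the expressiveness of variational inference in deep generative models by tightening the evidence lower bound (ELBO).
Show that multi-sample importance weighting models complex posteriors better than the standard VAE.
Verify empirically that the IWAE learns more active latent dimensions and achieves higher log-likelihood on density estimation benchmarks.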

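Proposed Method

The IWAE uses the recognition (encoder) network to draw K independent latent samples from the approximate posterior q(h|x).
Each sample h_i receives an importance weight w_i = p(x, h_i)/q(h_i|x), the ratio of the joint density to the proposal density, and the K weights are averaged.
The objective is the expectation of the log of this K-sample average. It is a lower bound on log p(x) that is tighter than the standard VAE ELBO, which corresponds to the K = 1 case.
The model is trained end to end by backpropagation, with the importance weighted objective replacing the standard VAE loss.
The method keeps the same network architecture as the VAE, but the use of multiple samples lets the recognition network model complex, non-factorial posteriors.
As K grows, the IWAE bound converges to the true log-likelihood (provided the importance weights are bounded), giving a more accurate approximation than the VAE objective.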
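
The bound described above can be checked numerically on a toy problem. The following is a minimal sketch, not the paper's implementation. It assumes a one-dimensional model with prior h ~ N(0, 1) and likelihood x | h ~ N(h, 1), so that log p(x) is known in closed form (x ~ N(0, 2)), and it deliberately uses the prior as a mismatched proposal q(h|x). All function names and parameter values are illustrative.

```python
import math
import random


def log_normal_pdf(value, mean, std):
    """Log density of a univariate Gaussian."""
    z = (value - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values) / len(values))


def iwae_bound_sample(x, k, q_mean, q_std, rng):
    """One Monte Carlo draw of log((1/k) * sum_i w_i), w_i = p(x, h_i) / q(h_i | x)."""
    log_weights = []
    for _ in range(k):
        h = rng.gauss(q_mean, q_std)  # h_i ~ q(h | x)
        # Toy model (assumption): log p(x, h) = log p(h) + log p(x | h)
        log_joint = log_normal_pdf(h, 0.0, 1.0) + log_normal_pdf(x, h, 1.0)
        log_q = log_normal_pdf(h, q_mean, q_std)
        log_weights.append(log_joint - log_q)
    return log_mean_exp(log_weights)


def average_bound(x, k, q_mean, q_std, repeats=5000, seed=0):
    """Average many draws to approximate the expectation that defines the bound."""
    rng = random.Random(seed)
    total = sum(iwae_bound_sample(x, k, q_mean, q_std, rng) for _ in range(repeats))
    return total / repeats


x_obs = 1.5
# Closed-form marginal for the toy model: x ~ N(0, 2)
true_log_px = log_normal_pdf(x_obs, 0.0, math.sqrt(2.0))
# Proposal set to the prior N(0, 1) on purpose, so the K = 1 bound is loose
bounds = {k: average_bound(x_obs, k, q_mean=0.0, q_std=1.0) for k in (1, 5, 50)}

for k, value in bounds.items():
    print(f"K={k:>2}: bound ~ {value:.3f}   (log p(x) = {true_log_px:.3f})")
```

Under these assumptions the estimates in `bounds` should increase with K and approach `true_log_px` from below, with K = 1 recovering the standard ELBO. In an actual IWAE the same log-mean-exp of log-weights would be computed per minibatch and differentiated with respect to the network parameters.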

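Experimental Results

Research Questions

RQ1: Does a tighter log-likelihood lower bound improve the expressiveness of the latent representations in variational autoencoders?
RQ2: Does using multiple samples in the posterior approximation model complex, non-factorial posteriors better than the standard VAE?
RQ3: To what extent does the IWAE objective reduce the number of inactive latent dimensions compared with the VAE?
RQ4: How does the generative performance of the IWAE compare with that of the VAE on density estimation benchmarks, particularly in test log-likelihood?
RQ5: Are inactive latent dimensions in VAEs caused by optimization problems or by an inherent limitation of the objective?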

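Key Findings

The IWAE achieves significantly better test log-likelihood than the VAE, with gains of up to 2.5 nats on the MNIST and Omniglot benchmarks.
On MNIST, the IWAE with K = 50 reaches a test negative log-likelihood of 84.88 nats, versus 86.76 nats for the best VAE (lower is better), indicating better generative performance.
The number of active latent dimensions is consistently higher for the IWAE than for the VAE: the best IWAE model has 25 active units, while the best VAE model has only 19.
When models were retrained with the opposite objective, the VAE retrained with the IWAE objective gained both active dimensions and log-likelihood, whereas the IWAE retrained with the VAE objective degraded on both measures.
In the two-layer models, the second layer consistently had fewer than 10 active dimensions, indicating limited capacity utilization in deeper architectures.
Removing the inactive dimensions had a negligible effect on test log-likelihood (less than 0.06 nats), confirming that they contribute very little to generative performance.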
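
The findings above count "active" latent dimensions without stating how activity is measured. The sketch below assumes one common criterion: a dimension is active if the variance, across inputs, of its posterior mean exceeds a threshold of 0.01. Both the criterion and the threshold are assumptions here, and `fake_means` is synthetic data standing in for the output of a hypothetical trained encoder.

```python
import random
from statistics import pvariance


def count_active_units(posterior_means, threshold=1e-2):
    """Count latent dimensions whose posterior mean varies across inputs.

    posterior_means: list of rows, one per input x, each row holding the
        posterior mean of every latent dimension (hypothetical encoder output).
    threshold: assumed activity cutoff on the variance across inputs.
    """
    n_dims = len(posterior_means[0])
    activity = [
        pvariance([row[u] for row in posterior_means]) for u in range(n_dims)
    ]
    n_active = sum(a > threshold for a in activity)
    return n_active, activity


# Synthetic stand-in for encoder outputs: three dimensions respond to the
# input, two stay pinned near the prior mean regardless of the input.
rng = random.Random(0)
fake_means = [
    [rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1),
     rng.gauss(0, 0.01), rng.gauss(0, 0.01)]
    for _ in range(1000)
]

n_active, activity = count_active_units(fake_means)
print(n_active, [round(a, 4) for a in activity])
```

A dimension whose posterior mean barely moves with the input carries little information about x, which is consistent with the finding that removing such dimensions changes test log-likelihood very little.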

This review was generated by AI and reviewed by a human editor.