QUICK REVIEW

[Paper Review] Improving Variational Inference with Inverse Autoregressive Flow

Diederik P. Kingma, Tim Salimans | arXiv (Cornell University) | Jun 15, 2016

Topic: Generative Adversarial Networks and Image Synthesis | References: 33 | Cited by: 185

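One-sentence summary

Introduces inverse autoregressive flow (IAF), a scalable normalizing flow for high-dimensional latent spaces that significantly improves the variational posterior in variational autoencoders and reaches competitive log-likelihood on CIFAR-10 with much faster sampling.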

ABSTRACT

The framework of normalizing flows provides a general strategy for flexible variational inference of posteriors over latent variables. We propose a new type of normalizing flow, inverse autoregressive flow (IAF), that, in contrast to earlier published flows, scales well to high-dimensional latent spaces. The proposed flow consists of a chain of invertible transformations, where each transformation is based on an autoregressive neural network. In experiments, we show that IAF significantly improves upon diagonal Gaussian approximate posteriors. In addition, we demonstrate that a novel type of variational autoencoder, coupled with IAF, is competitive with neural autoregressive models in terms of attained log-likelihood on natural images, while allowing significantly faster synthesis.

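Motivation and objectives

Address the limitations of simple factorized posteriors in variational inference.
Introduce a scalable normalizing flow suited to high-dimensional latent spaces.
Show that a more flexible posterior yields a tighter variational bound.
Demonstrate performance gains for deep VAE architectures on real image datasets.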

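Proposed method

Proposes inverse autoregressive flow (IAF): z_0 is drawn from a simple distribution and transformed through a chain of invertible autoregressive steps.
Each step computes z_t = μ_t + σ_t ⊙ z_{t−1}, where an autoregressive network produces μ_t and σ_t from z_{t−1}; this makes the Jacobian triangular and its log-determinant tractable.
Provides a numerically stable variant: a gated update inspired by the LSTM, z_t = σ_t ⊙ z_{t−1} + (1 − σ_t) ⊙ m_t with σ_t = sigmoid(s_t), initialized with a positive forget-gate bias.
Uses masked autoregressive networks (MADE, PixelCNN) to handle high-dimensional latent variables.
Allows the variable ordering to be reversed between steps, which is itself a volume-preserving operation; derives a closed-form log-determinant: each step's Jacobian has log-determinant Σ_i log σ_{t,i}, so the step contributes −Σ_i log σ_{t,i} to the log-density of the posterior sample.
Evaluates IAF as an expressive posterior for deep VAEs on MNIST and CIFAR-10, comparing it with diagonal Gaussian posteriors and other flow models.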
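
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the strictly lower-triangular linear map in `autoregressive_params` is a hypothetical stand-in for a MADE or PixelCNN network, the encoder context is omitted, and the chain starts from z_0 = ε rather than from an encoder-produced μ_0 and σ_0. All function and parameter names are assumptions made for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def autoregressive_params(z, weights_m, weights_s, bias_s=2.0):
    """Toy autoregressive map (hypothetical stand-in for MADE/PixelCNN).

    Output i uses only z[0..i-1], so each step's Jacobian is triangular.
    bias_s plays the role of the positive forget-gate bias.
    """
    d = len(z)
    m = [sum(weights_m[i][j] * z[j] for j in range(i)) for i in range(d)]
    s = [bias_s + sum(weights_s[i][j] * z[j] for j in range(i)) for i in range(d)]
    return m, s


def iaf_step(z, weights_m, weights_s):
    """One gated IAF step: z_t = sigma * z_{t-1} + (1 - sigma) * m."""
    m, s = autoregressive_params(z, weights_m, weights_s)
    sigma = [sigmoid(si) for si in s]
    z_new = [sg * zi + (1.0 - sg) * mi for sg, zi, mi in zip(sigma, z, m)]
    # The Jacobian diagonal is sigma, so its log-determinant is sum(log sigma).
    log_det = sum(math.log(sg) for sg in sigma)
    return z_new, log_det


def iaf_sample(eps, steps):
    """Push eps ~ N(0, I) through the chain; return z_T and log q(z_T)."""
    log_q = -sum(0.5 * e * e + 0.5 * math.log(2.0 * math.pi) for e in eps)
    z = list(eps)
    for weights_m, weights_s in steps:
        z, log_det = iaf_step(z, weights_m, weights_s)
        log_q -= log_det  # change of variables
        z = z[::-1]       # reverse the ordering; a permutation has |det| = 1
    return z, log_q


if __name__ == "__main__":
    random.seed(0)
    d = 4

    def random_weights():
        # Only entries with j < i are ever read by autoregressive_params.
        return [[random.gauss(0.0, 0.1) for _ in range(d)] for _ in range(d)]

    steps = [(random_weights(), random_weights()) for _ in range(2)]
    eps = [random.gauss(0.0, 1.0) for _ in range(d)]
    z, log_q = iaf_sample(eps, steps)
    print("z_T =", z)
    print("log q(z_T) =", log_q)
```

Because every output dimension is computed from the previous z in a single pass, sampling from the posterior needs no sequential loop over dimensions; only the density of an arbitrary external point would require inverting the steps one dimension at a time.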

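Experimental results

Research questions

RQ1: Can inverse autoregressive flow provide scalable, flexible posterior approximations for high-dimensional latent spaces?
RQ2: How does IAF affect the tightness of the variational lower bound and the attained log-likelihood on standard image datasets?
RQ3: How does the sampling speed of a VAE with IAF compare with autoregressive generative models such as PixelCNN?
RQ4: How does stacking multiple IAF transformations built on autoregressive networks affect performance on MNIST and CIFAR-10?
RQ5: Can IAF achieve competitive log-likelihood while keeping sampling efficient?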

Main findings

Model	VLB	log p(x)
Diagonal covariance	-84.08 (± 0.10)	-81.08 (± 0.08)
IAF (Depth = 2, Width = 320)	-82.02 (± 0.08)	-79.77 (± 0.06)
IAF (Depth = 2, Width = 1920)	-81.17 (± 0.08)	-79.30 (± 0.08)
IAF (Depth = 4, Width = 1920)	-80.93 (± 0.09)	-79.17 (± 0.08)
IAF (Depth = 8, Width = 1920)	-80.80 (± 0.07)	-79.10 (± 0.07)
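
One way to read the table: since the variational lower bound equals log p(x) minus the KL divergence from the approximate to the true posterior, the difference between the two columns indicates how far each posterior is from the true one, to the extent that the log p(x) column is an accurate estimate. The short sketch below computes that gap from the figures in the table; the dictionary layout and variable names are assumptions made for this example.

```python
# (VLB, log p(x)) per model, copied from the table above.
results = {
    "Diagonal covariance": (-84.08, -81.08),
    "IAF (Depth = 2, Width = 320)": (-82.02, -79.77),
    "IAF (Depth = 2, Width = 1920)": (-81.17, -79.30),
    "IAF (Depth = 4, Width = 1920)": (-80.93, -79.17),
    "IAF (Depth = 8, Width = 1920)": (-80.80, -79.10),
}

# Gap between the log-likelihood estimate and the bound for each model.
gaps = {name: log_px - vlb for name, (vlb, log_px) in results.items()}

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:.2f}")
```

The gap narrows from 3.00 for the diagonal posterior to 1.70 for the deepest IAF, consistent with the claim that a more flexible posterior tightens the bound.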

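IAF significantly outperforms diagonal Gaussian posteriors in variational autoencoders.
Deeper and wider IAF posteriors yield a tighter variational bound and better log-likelihood on MNIST.
On CIFAR-10, a ResNet VAE with IAF reaches 3.11 bits per dimension, competitive with state-of-the-art latent-variable models, while sampling far faster than PixelCNN-based models.
On a Titan X, sampling with ResNet VAE + IAF takes about 0.05 seconds per image, versus 52 seconds for PixelCNN-based sampling.
Compared with a fixed diagonal posterior, a multi-layer autoregressive posterior markedly tightens the bound and improves generative modeling performance.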
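
For a rough sense of scale, the sampling times quoted above can be turned into a speedup factor. The sketch below also counts the sub-pixels in a CIFAR-10 image, on the assumption (not stated in this summary) that a PixelCNN-style sampler draws one sub-pixel per network evaluation, which would explain much of the difference. Variable names are illustrative.

```python
# Sampling times quoted above (seconds per image on a Titan X).
iaf_seconds = 0.05
pixelcnn_seconds = 52.0
speedup = pixelcnn_seconds / iaf_seconds

# CIFAR-10 images are 32 x 32 pixels with 3 colour channels.
# Assumption: an autoregressive sampler generates these one at a time.
sequential_steps = 32 * 32 * 3

print(f"speedup: about {speedup:.0f}x")
print(f"sub-pixels per CIFAR-10 image: {sequential_steps}")
```

The ratio comes out at roughly three orders of magnitude, the same order as the number of sub-pixels a sequential sampler would have to visit.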


This review was generated by AI and checked by a human editor.