QUICK REVIEW

[Paper Review] VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models

Zhisheng Xiao, Karsten Kreis|arXiv (Cornell University)|Oct 1, 2020


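One-Sentence Summary

VAEBM combines a VAE generator with an energy-based model defined in data space and trains them in two stages to achieve high-quality image generation, fast sampling, and good mode coverage.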

ABSTRACT

Energy-based models (EBMs) have recently been successful in representing complex distributions of small images. However, sampling from them requires expensive Markov chain Monte Carlo (MCMC) iterations that mix slowly in high dimensional pixel space. Unlike EBMs, variational autoencoders (VAEs) generate samples quickly and are equipped with a latent space that enables fast traversal of the data manifold. However, VAEs tend to assign high probability density to regions in data space outside the actual data distribution and often fail at generating sharp images. In this paper, we propose VAEBM, a symbiotic composition of a VAE and an EBM that offers the best of both worlds. VAEBM captures the overall mode structure of the data distribution using a state-of-the-art VAE and it relies on its EBM component to explicitly exclude non-data-like regions from the model and refine the image samples. Moreover, the VAE component in VAEBM allows us to speed up MCMC updates by reparameterizing them in the VAE's latent space. Our experimental results show that VAEBM outperforms state-of-the-art VAEs and EBMs in generative quality on several benchmark image datasets by a large margin. It can generate high-quality images as large as 256$\times$256 pixels with short MCMC chains. We also demonstrate that VAEBM provides complete mode coverage and performs well in out-of-distribution detection. The source code is available at https://github.com/NVlabs/VAEBM

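Research Motivation and Goals

Combine VAEs and EBMs to exploit their complementary strengths.
Use the VAE to capture the mode structure of the data distribution, and the EBM to refine details and exclude non-data regions.
Speed up sampling by reparameterizing MCMC in the VAE's latent space.
Provide a two-stage training procedure that improves practicality and stability.
Demonstrate improved generative quality and mode coverage on several image benchmarks.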

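Proposed Method

Define the generative model as h_{ψ,θ}(x,z) = (1/Z_{ψ,θ}) p_{θ}(x,z) e^{-E_{ψ}(x)}, where p_{θ}(x,z) is the VAE generator and E_{ψ}(x) is an energy function in pixel space.
Train by maximizing the marginal log-likelihood, whose objective decomposes into two terms, L_VAE and L_EBM; this permits two-stage optimization (train the VAE first, then fix θ and train the EBM parameters ψ).
Use reparameterization to sample in the joint (x,z) space through the noise variables ε = (ε_x, ε_z), which makes Langevin dynamics efficient during sampling.
In the negative phase, run MCMC with reparameterized sampling in the joint (z,x) space to speed up mixing.
Show that running MCMC jointly in the VAE's latent space and, through the extended model, in data space accelerates sampling.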
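The decomposition and the reparameterized sampling target described above can be written out as follows. This is a sketch in standard VAE notation: the encoder q_φ(z|x), the prior p(z), and the shorthand x(ε_x, ε_z) for the decoder output are assumptions that do not appear in this summary, and the lower bound is the usual ELBO rather than a quotation from the paper.

```latex
% Marginal model over x, obtained by integrating z out of h_{\psi,\theta}(x,z):
h_{\psi,\theta}(x) = \frac{1}{Z_{\psi,\theta}}\, p_\theta(x)\, e^{-E_\psi(x)},
\qquad
Z_{\psi,\theta} = \int p_\theta(x)\, e^{-E_\psi(x)}\, dx

% Log-likelihood and its lower bound (q_\phi is the assumed VAE encoder):
\log h_{\psi,\theta}(x)
  = \log p_\theta(x) - E_\psi(x) - \log Z_{\psi,\theta}
  \;\ge\;
  \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
              - \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)}_{\mathcal{L}_{\mathrm{VAE}}}
  \;\underbrace{-\,E_\psi(x) - \log Z_{\psi,\theta}}_{\mathcal{L}_{\mathrm{EBM}}}

% Stage 2 (\theta fixed): gradient of the EBM term with respect to \psi.
% The expectation is the negative phase and requires samples from the model.
\nabla_\psi \mathcal{L}_{\mathrm{EBM}}(x)
  = -\nabla_\psi E_\psi(x)
    + \mathbb{E}_{x' \sim h_{\psi,\theta}}\!\left[\nabla_\psi E_\psi(x')\right]

% Reparameterized target for the negative phase, with \epsilon = (\epsilon_x, \epsilon_z):
h_{\psi,\theta}(\epsilon_x, \epsilon_z)
  \;\propto\; e^{-E_\psi\left(x(\epsilon_x, \epsilon_z)\right)}\, p(\epsilon_x, \epsilon_z)
```

Because L_VAE does not depend on ψ and L_EBM is optimized with θ held fixed, the two terms can be trained one after the other, which is what makes the two-stage schedule possible.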

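Experimental Results

Research Questions

RQ1 Does integrating a VAE with an EBM improve sample quality over pure VAE or pure EBM approaches?
RQ2 Does two-stage training (VAE first, then EBM) yield stable optimization and practical sampling for VAEBM?
RQ3 Does VAEBM achieve complete mode coverage and robust out-of-distribution detection on image datasets?
RQ4 How does VAEBM compare with state-of-the-art likelihood-based methods and with GAN and score-based methods on standard benchmarks?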

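Key Findings

Among likelihood-based models, VAEBM outperforms previous EBMs and state-of-the-art VAEs in generative quality on CIFAR-10 and other benchmarks.
Short MCMC chains initialized from the pretrained VAE produce high-quality samples and make sampling faster.
VAEBM is competitive with, or better than, GANs and score-based models while retaining the advantages of likelihood-based training.
The model shows complete mode coverage and strong out-of-distribution detection (AUROC above several baselines).
On CelebA 64, CelebA HQ 256, and LSUN Church 64, VAEBM markedly improves FID over NVAE and related baselines.
In a 2D toy experiment (25-Gaussians), VAEBM improves on the VAE's likelihood and matches the true distribution more closely.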
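The short-chain sampling mentioned in the findings above can be illustrated with a minimal NumPy sketch. Everything here is a hypothetical stand-in chosen so the gradients are available in closed form: a fixed linear "decoder" (`W`, `SIGMA`) replaces the VAE, and a quadratic energy centered at `M` replaces the learned EBM. None of it is taken from the VAEBM code base. The chain runs Langevin dynamics on the noise variables ε = (ε_x, ε_z), targeting log h(ε) = -E(x(ε)) - ½‖ε‖² up to a constant, and starts from prior noise, that is, from plain VAE samples.

```python
import numpy as np

# Hypothetical toy stand-ins (assumptions, not the paper's networks):
# decoder x = W z + SIGMA * eps_x with z = eps_z, and energy E(x) = 0.5 * ||x - M||^2.
W = np.array([[1.0, 0.5], [0.0, 1.0]])
SIGMA = 0.1
M = np.array([2.0, 2.0])


def decode(eps_x, eps_z):
    """Map noise (eps_x, eps_z) to data points x, one per row."""
    return eps_z @ W.T + SIGMA * eps_x


def energy(x):
    """Toy energy: low near M, growing quadratically away from it."""
    return 0.5 * np.sum((x - M) ** 2, axis=-1)


def grad_log_h(eps_x, eps_z):
    """Gradient of log h(eps) = -E(x(eps)) - 0.5 * ||eps||^2 with respect to eps."""
    dE_dx = decode(eps_x, eps_z) - M   # closed-form gradient of the quadratic energy
    g_x = -SIGMA * dE_dx - eps_x       # chain rule through the SIGMA * eps_x term
    g_z = -dE_dx @ W - eps_z           # chain rule through the W eps_z term
    return g_x, g_z


def short_run_langevin(n_chains=2000, n_steps=100, step=0.1, seed=0):
    """Run short Langevin chains in eps space, initialized from prior noise."""
    rng = np.random.default_rng(seed)
    eps_x = rng.standard_normal((n_chains, 2))
    eps_z = rng.standard_normal((n_chains, 2))
    x_init = decode(eps_x, eps_z)      # samples of the toy "VAE" alone
    for _ in range(n_steps):
        g_x, g_z = grad_log_h(eps_x, eps_z)
        eps_x = eps_x + 0.5 * step * g_x + np.sqrt(step) * rng.standard_normal(eps_x.shape)
        eps_z = eps_z + 0.5 * step * g_z + np.sqrt(step) * rng.standard_normal(eps_z.shape)
    return x_init, decode(eps_x, eps_z)


if __name__ == "__main__":
    x_init, x_final = short_run_langevin()
    print("mean energy before:", energy(x_init).mean())
    print("mean energy after: ", energy(x_final).mean())
```

In this toy setting the mean energy of the refined samples should drop below that of the initial samples, since the chain moves them toward the low-energy region while the ½‖ε‖² term keeps the noise close to the prior. With real networks the two closed-form gradients would be obtained by automatic differentiation instead.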


This review was generated by AI and checked by human editors.