QUICK REVIEW

[Paper Review] Avoiding Latent Variable Collapse With Generative Skip Models

Adji Bousso Dieng, Yoon Kim|arXiv (Cornell University)|Jul 12, 2018

Topic: Generative Adversarial Networks and Image Synthesis | References: 32 | Cited by: 35

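One-Sentence Summary

This paper proposes Skip-VAEs, which add skip connections to the generative model to prevent latent variable collapse in variational autoencoders (VAEs). By enforcing a strong dependence between the latent variables and the observations, Skip-VAEs increase mutual information and yield more semantically meaningful representations. On MNIST, Omniglot, and the Yahoo text dataset, their representations are better than those of a standard VAE, while likelihood performance remains similar.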

ABSTRACT

Variational autoencoders learn distributions of high-dimensional data. They model data with a deep latent-variable model and then fit the model by maximizing a lower bound of the log marginal likelihood. VAEs can capture complex distributions, but they can also suffer from an issue known as "latent variable collapse," especially if the likelihood model is powerful. Specifically, the lower bound involves an approximate posterior of the latent variables; this posterior "collapses" when it is set equal to the prior, i.e., when the approximate posterior is independent of the data. While VAEs learn good generative models, latent variable collapse prevents them from learning useful representations. In this paper, we propose a simple new way to avoid latent variable collapse by including skip connections in our generative model; these connections enforce strong links between the latent variables and the likelihood function. We study generative skip models both theoretically and empirically. Theoretically, we prove that skip models increase the mutual information between the observations and the inferred latent variables. Empirically, we study images (MNIST and Omniglot) and text (Yahoo). Compared to existing VAE architectures, we show that generative skip models maintain similar predictive performance but lead to less collapse and provide more meaningful representations of the data.

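Motivation and Objectives

To address latent variable collapse in VAEs, where the approximate posterior collapses to the prior and fails to capture a meaningful representation of the data.
To improve the representational capacity of VAEs by strengthening the link between the latent variables and the observations.
To prove that adding skip connections to the likelihood model increases the mutual information between the observations and the inferred latent variables.
To show that Skip-VAEs reduce collapse while maintaining high likelihood performance, with the clearest gains in deeper models and higher-dimensional latent spaces.
To evaluate the synergy between skip connections and advanced training methods such as semi-amortized inference (sa-VAE).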
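As a reference for the objectives above, the lower bound and the collapse condition described in the abstract can be written out as follows. The notation is an assumption made for this summary: the paper's own symbols may differ, and the mutual information line uses the standard decomposition under the aggregate posterior rather than a formula quoted from the paper.

```latex
\begin{align}
\mathrm{ELBO}(\theta,\phi;x)
  &= \mathbb{E}_{q_\phi(z\mid x)}\big[\log p_\theta(x\mid z)\big]
   - \mathrm{KL}\big(q_\phi(z\mid x)\,\|\,p(z)\big) \\
\text{collapse:}\quad
  & q_\phi(z\mid x) \approx p(z)\ \text{for all } x
   \;\Longrightarrow\;
   \mathrm{KL}\big(q_\phi(z\mid x)\,\|\,p(z)\big) \approx 0 \\
I_q(x;z)
  &= \mathbb{E}_{p_{\mathrm{data}}(x)}\Big[\mathrm{KL}\big(q_\phi(z\mid x)\,\|\,p(z)\big)\Big]
   - \mathrm{KL}\big(q_\phi(z)\,\|\,p(z)\big),
   \qquad q_\phi(z) = \mathbb{E}_{p_{\mathrm{data}}(x)}\big[q_\phi(z\mid x)\big]
\end{align}
```

Since the second KL term is non-negative, the average KL term of the ELBO upper-bounds this mutual information. When the posterior collapses, that average KL goes to zero and so does the mutual information, which is why a collapsed model carries no information about x in z.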

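Proposed Method

Introduce skip connections that concatenate the latent variable z with the features of multiple hidden layers of the generative network.
Build a generative skip model in which the likelihood pθ(x|z) is parameterized by a deep network with residual-style skip connections from z to the intermediate layers.
Train with amortized variational inference, optimizing the evidence lower bound (ELBO) to learn the generative parameters θ and the inference network parameters φ jointly.
Use a spherical Gaussian prior p(z) = N(0, I), and parameterize both the likelihood and the approximate posterior with deep neural networks.
Combine skip connections with semi-amortized inference (sa-VAE) to further improve posterior quality and reduce collapse.
Evaluate performance using mutual information, KL divergence, active-unit analysis, and downstream classification accuracy on MNIST and the Yahoo text data.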
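The skip connections described in the method list can be illustrated with a minimal NumPy sketch of a decoder in which z is fed to every layer, not only the first. This is not the authors' implementation: the function names, the tanh activation, the layer sizes, and the Bernoulli output are all assumptions chosen for illustration. Concatenating [h, z] before a linear layer is equivalent to adding a separate linear map of z to that layer's pre-activation.

```python
import numpy as np


def init_skip_decoder(rng, latent_dim, hidden_dim, out_dim, num_layers):
    """Create random weights for an MLP decoder where z feeds every layer.

    Hypothetical helper: names and initialization scale are illustrative.
    """
    hidden_params = []
    in_dim = latent_dim  # the first layer sees only z
    for _ in range(num_layers):
        W = rng.normal(0.0, 0.1, size=(in_dim, hidden_dim))
        b = np.zeros(hidden_dim)
        hidden_params.append((W, b))
        in_dim = hidden_dim + latent_dim  # later layers see [h, z]
    W_out = rng.normal(0.0, 0.1, size=(in_dim, out_dim))
    b_out = np.zeros(out_dim)
    return hidden_params, (W_out, b_out)


def skip_decoder_forward(z, hidden_params, out_params):
    """Map latents z of shape (batch, latent_dim) to Bernoulli means."""
    h = z
    for i, (W, b) in enumerate(hidden_params):
        # Skip connection: re-inject z by concatenation at every layer
        # after the first, so no layer can lose access to the latent.
        layer_in = h if i == 0 else np.concatenate([h, z], axis=-1)
        h = np.tanh(layer_in @ W + b)
    W_out, b_out = out_params
    logits = np.concatenate([h, z], axis=-1) @ W_out + b_out
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid -> pixel probabilities


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hidden_params, out_params = init_skip_decoder(
        rng, latent_dim=4, hidden_dim=8, out_dim=10, num_layers=3
    )
    z = rng.normal(size=(5, 4))
    x_mean = skip_decoder_forward(z, hidden_params, out_params)
    print(x_mean.shape)
```

A standard decoder would pass only h between layers; the sketch differs solely in the concatenation step, which is the change the method list describes.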

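Experimental Results

Research Questions

RQ1: Does adding skip connections to the generative model reduce latent variable collapse in VAEs?
RQ2: To what extent do skip connections increase the mutual information between the observations and the inferred latent variables?
RQ3: How do Skip-VAEs compare with standard VAEs and sa-VAE in representation quality and likelihood performance?
RQ4: Do the benefits of skip connections grow with model depth or latent dimensionality?
RQ5: Can skip connections mitigate collapse in autoregressive VAEs on text generation tasks?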

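Key Findings

On MNIST, using the posterior mean as the feature vector, Skip-VAE reaches 98.10% classification accuracy versus 97.19% for the standard VAE.
With a weaker model (MLP-based encoder and decoder), Skip-VAE reaches 98.25% accuracy versus 97.70% for the standard VAE.
On the Yahoo text dataset, skip-sa-VAE makes effective use of all 64 latent dimensions, whereas the standard sa-VAE shows lower mutual information and fewer active units at high dimensionality.
As the latent dimensionality increases, Skip-VAEs maintain or increase mutual information while reducing the collapse metrics, whereas the standard VAE degrades.
Compared with sa-VAE alone, skip-sa-VAE achieves higher mutual information and stronger collapse mitigation, indicating a synergy between skip connections and semi-amortized inference.
t-SNE visualizations show that Skip-VAE latent representations form more structured, more class-discriminative clusters than those of the standard VAE.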
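Two of the collapse metrics referred to in these findings, the KL term and the number of active units, can be computed roughly as follows. This is a sketch under stated assumptions, not the paper's evaluation code: it assumes a diagonal Gaussian posterior, and it counts a latent dimension as active when the variance of its posterior mean across the data exceeds 0.01, a commonly used convention. The function names and the threshold are illustrative.

```python
import numpy as np


def gaussian_kl_to_standard_normal(mu, log_var):
    """Per-example KL( N(mu, diag(exp(log_var))) || N(0, I) ).

    mu and log_var have shape (num_examples, latent_dim).
    """
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)


def count_active_units(posterior_means, threshold=0.01):
    """Count latent dimensions whose posterior mean varies across the data.

    A collapsed dimension has (almost) the same posterior mean for every
    input, so its variance over the dataset is close to zero.
    """
    variance_per_dim = np.var(posterior_means, axis=0)
    return int(np.sum(variance_per_dim > threshold))


if __name__ == "__main__":
    # Toy posterior: dimension 0 depends on the input, dimension 1 is
    # collapsed to the prior (mean 0, unit variance for every example).
    mu = np.array([[1.0, 0.0], [-1.0, 0.0]])
    log_var = np.zeros_like(mu)
    print(gaussian_kl_to_standard_normal(mu, log_var))
    print(count_active_units(mu))
```

In the toy example only the first dimension contributes to the KL term and counts as active, which mirrors the picture of a partly collapsed latent space.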


This review was generated by AI and reviewed by human editors.