QUICK REVIEW

[Paper Review] Disentangling Disentanglement in Variational Autoencoders

Émile Mathieu, Tom Rainforth|arXiv (Cornell University)|Dec 6, 2018

Adversarial Robustness in Machine Learning | References: 55 | Cited by: 52

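One-Sentence Summary

This paper generalizes disentanglement in VAEs to a decomposition of the latent space governed by two factors, the overlap between latent encodings and the conformity of the aggregate encoding to a structured prior, and shows how the choice of prior and a new objective with alpha and beta terms yield richer, customizable representations that go beyond simple independence.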

ABSTRACT

We develop a generalisation of disentanglement in VAEs---decomposition of the latent representation---characterising it as the fulfilment of two factors: a) the latent encodings of the data having an appropriate level of overlap, and b) the aggregate encoding of the data conforming to a desired structure, represented through the prior. Decomposition permits disentanglement, i.e. explicit independence between latents, as a special case, but also allows for a much richer class of properties to be imposed on the learnt representation, such as sparsity, clustering, independent subspaces, or even intricate hierarchical dependency relationships. We show that the $β$-VAE varies from the standard VAE predominantly in its control of latent overlap and that for the standard choice of an isotropic Gaussian prior, its objective is invariant to rotations of the latent representation. Viewed from the decomposition perspective, breaking this invariance with simple manipulations of the prior can yield better disentanglement with little or no detriment to reconstructions. We further demonstrate how other choices of prior can assist in producing different decompositions and introduce an alternative training objective that allows the control of both decomposition factors in a principled manner.

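Motivation and Objectives

Motivate a general notion of disentanglement through two factors of latent-space decomposition: overlap and alignment with the prior structure.
Show the limitations of the standard definition of disentanglement on complex data, and present a more flexible decomposition framework.
Analyze the beta-VAE to understand how it controls latent overlap and how the choice of prior affects disentanglement.
Propose an alternative objective that explicitly regularizes both decomposition factors to obtain structured representations (e.g. sparsity, clustering).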

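Proposed Method

Define decomposition in VAEs as the fulfilment of two factors: an appropriate level of latent overlap, and conformity of the aggregate encoding to the prior.
Relate the beta-VAE to an adjusted ELBO with an annealed prior and a decoder-side reconstruction term, plus a maximum-entropy regularizer on the encoder.
Give theoretical results showing that, in the Gaussian case, the beta-VAE is equivalent to the standard ELBO with a rescaled latent space and an annealed prior.
Introduce the objective L_{alpha,beta}, which adds a divergence term between q(z) and p(z) to control the second decomposition factor.
Run experiments with anisotropic and non-Gaussian priors to study axis-aligned disentanglement, clustering, and sparsity.
Present a sparsity-inducing prior and evaluate it with a sparsity metric and reconstruction performance.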
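The L_{alpha,beta} objective described above can be written as E_q[log p(x|z)] - beta * KL(q(z|x) || p(z)) - alpha * D(q(z), p(z)). The following is a minimal sketch of that structure, not the authors' implementation. It assumes a diagonal Gaussian encoder, a standard normal prior, and a squared MMD with an RBF kernel as one possible choice for the divergence D; all function names and the `bandwidth` parameter are hypothetical and chosen for illustration.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)) for a single data point."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def rbf_kernel_mean(xs, ys, bandwidth):
    """Average RBF kernel value over all pairs (x, y) from xs and ys."""
    total = 0.0
    for x in xs:
        for y in ys:
            sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(xs) * len(ys))


def rbf_mmd2(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sample sets.

    Assumption of this sketch: MMD stands in for the divergence D(q(z), p(z)).
    """
    return (rbf_kernel_mean(xs, xs, bandwidth)
            + rbf_kernel_mean(ys, ys, bandwidth)
            - 2.0 * rbf_kernel_mean(xs, ys, bandwidth))


def alpha_beta_objective(log_liks, mus, log_vars, z_samples, prior_samples,
                         alpha, beta):
    """Batch estimate of the objective (to be maximized).

    log_liks      : per-point estimates of E_q[log p(x|z)]
    mus, log_vars : per-point encoder means and log-variances
    z_samples     : samples from the encoder, treated as draws from q(z)
    prior_samples : samples drawn from the prior p(z)
    """
    per_point = [
        ll - beta * gaussian_kl_to_standard_normal(mu, lv)
        for ll, mu, lv in zip(log_liks, mus, log_vars)
    ]
    return sum(per_point) / len(per_point) - alpha * rbf_mmd2(z_samples, prior_samples)
```

In this sketch beta scales only the per-point KL term (the overlap factor), while alpha scales only the batch-level divergence between encoder samples and prior samples (the prior-alignment factor), so the two can be tuned separately. Setting alpha to 0 recovers a beta-VAE style objective.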

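Experimental Results

Research Questions

RQ1: How can disentanglement be generalized beyond independence to accommodate complex data-generating processes?
RQ2: What roles do latent overlap (I(x; z)) and the match between the aggregate latent encoding and the prior play in achieving a useful latent decomposition?
RQ3: Can changing the structure of the prior and adding explicit regularization on q(z) improve disentanglement and enable alternative decompositions such as sparsity or clustering?
RQ4: How does the beta-VAE relate to this two-factor decomposition, and can the objective be rewritten so that the two factors are controlled independently?
RQ5: Do non-isotropic priors, or priors designed for sparsity or clustering, bring practical gains in disentanglement without sacrificing reconstruction?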

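Key Findings

The beta-VAE controls latent overlap mainly through a maximum-entropy effect on the encoder; its benefits are limited by the rotational invariance that comes with an isotropic Gaussian prior.
With a Gaussian prior and encoder, the beta-VAE amounts to optimizing the standard ELBO under a rescaling of the latent space, up to constant terms.
The rotational invariance of the isotropic Gaussian prior can hinder disentanglement; breaking it with a structured prior improves disentanglement at a fixed reconstruction quality.
Explicitly regularizing the aggregate posterior toward a structured prior via alpha improves the alignment between q(z) and p(z) and enables alternative decompositions such as clustering or sparsity.
Experiments show that non-isotropic priors (e.g. anisotropic Gaussians or Student-t mixtures) reach better disentanglement scores at similar reconstruction performance; priors that promote sparsity or clustering achieve those decompositions on Fashion-MNIST and synthetic datasets.
An objective containing both decomposition factors (beta controls overlap, alpha controls prior alignment) makes it possible to learn sparse and clustered latent representations without degrading reconstruction too severely.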
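The rotation-invariance finding above can be checked numerically on the KL term alone. The sketch below is an illustration under stated assumptions, not code from the paper: it uses a 2-D Gaussian encoding, the closed-form KL to a zero-mean diagonal Gaussian prior, and hypothetical helper names `rotate` and `kl_to_diag_prior`. Rotating the encoding leaves the KL unchanged when the prior is isotropic and changes it when the prior is anisotropic.

```python
import math


def rotate(mu, cov, theta):
    """Rotate a 2-D Gaussian N(mu, cov) by angle theta: mu -> R mu, cov -> R cov R^T."""
    c, s = math.cos(theta), math.sin(theta)
    (a, b), (_, d) = cov  # symmetric covariance [[a, b], [b, d]]
    mu_r = (c * mu[0] - s * mu[1], s * mu[0] + c * mu[1])
    a_r = c * c * a - 2.0 * c * s * b + s * s * d
    d_r = s * s * a + 2.0 * c * s * b + c * c * d
    b_r = c * s * (a - d) + (c * c - s * s) * b
    return mu_r, ((a_r, b_r), (b_r, d_r))


def kl_to_diag_prior(mu, cov, prior_var):
    """KL(N(mu, cov) || N(0, diag(prior_var))) in two dimensions."""
    (a, b), (_, d) = cov
    s1, s2 = prior_var
    trace_term = a / s1 + d / s2
    quad_term = mu[0] ** 2 / s1 + mu[1] ** 2 / s2
    log_det_ratio = math.log(s1 * s2) - math.log(a * d - b * b)
    return 0.5 * (trace_term + quad_term - 2.0 + log_det_ratio)


# Illustrative encoding (values chosen arbitrarily for this sketch).
mu = (1.0, 0.0)
cov = ((0.5, 0.0), (0.0, 0.1))
mu_rot, cov_rot = rotate(mu, cov, math.pi / 4)

isotropic = (1.0, 1.0)
anisotropic = (4.0, 0.25)

kl_iso_before = kl_to_diag_prior(mu, cov, isotropic)
kl_iso_after = kl_to_diag_prior(mu_rot, cov_rot, isotropic)
kl_aniso_before = kl_to_diag_prior(mu, cov, anisotropic)
kl_aniso_after = kl_to_diag_prior(mu_rot, cov_rot, anisotropic)

print("isotropic prior:  ", kl_iso_before, kl_iso_after)
print("anisotropic prior:", kl_aniso_before, kl_aniso_after)
```

With the isotropic prior the trace, squared norm, and determinant entering the KL are all preserved by the rotation, so the regularizer cannot prefer axis-aligned latents over rotated ones. The anisotropic prior weights the two axes differently, which is what lets it favour one orientation.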

This review was generated by AI and reviewed by human editors.