QUICK REVIEW

[Paper Review] Autoencoding Variational Inference For Topic Models

Akash Srivastava, Charles Sutton|arXiv (Cornell University)|Mar 4, 2017

Topic Modeling | Cited by 144

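One-Sentence Summary

This paper introduces AVITM, an effective autoencoding variational Bayes (AEVB) inference method for latent Dirichlet allocation (LDA) that uses a neural inference network to approximate the posterior and addresses the problems caused by the Dirichlet prior and component collapsing. It also proposes ProdLDA, a product-of-experts topic model that yields more coherent topics.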

ABSTRACT

Topic models are one of the most popular methods for learning representations of text, but a major challenge is that any change to the topic model requires mathematically deriving a new inference algorithm. A promising approach to address this problem is autoencoding variational Bayes (AEVB), but it has proven difficult to apply to topic models in practice. We present what is to our knowledge the first effective AEVB based inference method for latent Dirichlet allocation (LDA), which we call Autoencoded Variational Inference For Topic Model (AVITM). This model tackles the problems caused for AEVB by the Dirichlet prior and by component collapsing. We find that AVITM matches traditional methods in accuracy with much better inference time. Indeed, because of the inference network, we find that it is unnecessary to pay the computational cost of running variational optimization on test data. Because AVITM is black box, it is readily applied to new topic models. As a dramatic illustration of this, we present a new topic model called ProdLDA, that replaces the mixture model in LDA with a product of experts. By changing only one line of code from LDA, we find that ProdLDA yields much more interpretable topics, even if LDA is trained via collapsed Gibbs sampling.

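Motivation and Objectives

Enable fast, black-box inference for topic models without hand-deriving model-specific updates.
Overcome the challenges that the Dirichlet prior and component collapsing pose for AEVB in LDA.
Show that an inference network matches the quality of traditional inference while substantially speeding up test-time inference.
Introduce ProdLDA, a product-of-experts topic model that outperforms LDA on topic coherence.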

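Proposed Method

Develop AVITM: parameterize q(θ,z|γ,φ) with an inference network and optimize the ELBO via the reparameterization trick.
Apply a Laplace approximation to the Dirichlet prior in the softmax basis, so that θ can be reparameterized as a transformed Gaussian sample.
Use the collapsed representation of LDA, summing out z, so that only θ needs to be sampled.
Mitigate component collapsing with high-momentum Adam optimization, batch normalization, dropout, and KL-term annealing.
Train ProdLDA by replacing the mixture word model with a product of experts, i.e. p(w_n|θ,β) ∝ ∏_k p(w_n|z_n=k,β)^{θ_k}.
Gain training-time and test-time efficiency by using the neural inference network to map documents directly to topic proportions.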
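
The steps above can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' implementation: it uses only the Python standard library, takes the variational mean and log-variance as given instead of producing them with an inference network, and all function names (laplace_prior, sample_theta, word_probs, kl_to_prior, neg_elbo) are hypothetical. The Laplace-approximation formulas are the standard softmax-basis ones for a Dirichlet with parameter alpha: mu_k = log alpha_k - (1/K) sum_i log alpha_i, and var_k = (1 - 2/K)/alpha_k + (1/K^2) sum_i 1/alpha_i.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def laplace_prior(alpha):
    """Gaussian (mean, diagonal variance) approximating Dirichlet(alpha) in the softmax basis."""
    K = len(alpha)
    log_a = [math.log(a) for a in alpha]
    mean_log = sum(log_a) / K
    inv_sum = sum(1.0 / a for a in alpha)
    mu = [la - mean_log for la in log_a]
    var = [(1.0 - 2.0 / K) / a + inv_sum / K**2 for a in alpha]
    return mu, var

def sample_theta(mu, log_var, rng):
    """Reparameterized sample: theta = softmax(mu + sigma * eps), eps ~ N(0, I)."""
    h = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]
    return softmax(h)

def word_probs(theta, beta, prod_lda=False):
    """Word distribution with z summed out. beta is a K x V list of unnormalized topic weights."""
    V = len(beta[0])
    if prod_lda:
        # ProdLDA: mix the unnormalized weights, then normalize once (product of experts).
        return softmax([sum(t * row[v] for t, row in zip(theta, beta)) for v in range(V)])
    # LDA: normalize each topic first, then take the theta-weighted mixture.
    topics = [softmax(row) for row in beta]
    return [sum(t * row[v] for t, row in zip(theta, topics)) for v in range(V)]

def kl_to_prior(mu_q, log_var_q, mu_p, var_p):
    """KL( N(mu_q, diag(exp(log_var_q))) || N(mu_p, diag(var_p)) )."""
    return 0.5 * sum(
        math.exp(lv) / vp + (mp - mq) ** 2 / vp - 1.0 + math.log(vp) - lv
        for mq, lv, mp, vp in zip(mu_q, log_var_q, mu_p, var_p)
    )

def neg_elbo(counts, mu_q, log_var_q, beta, alpha, rng, prod_lda=False):
    """Single-sample estimate of the negative ELBO for one bag-of-words document."""
    mu_p, var_p = laplace_prior(alpha)
    theta = sample_theta(mu_q, log_var_q, rng)
    probs = word_probs(theta, beta, prod_lda)
    log_lik = sum(c * math.log(p + 1e-10) for c, p in zip(counts, probs))
    return kl_to_prior(mu_q, log_var_q, mu_p, var_p) - log_lik

# Toy usage: 3 topics, 5 vocabulary words, made-up numbers.
rng = random.Random(0)
beta = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(3)]
counts = [2, 0, 1, 0, 3]
loss_lda = neg_elbo(counts, [0.0] * 3, [0.0] * 3, beta, [1.0] * 3, rng)
loss_prod = neg_elbo(counts, [0.0] * 3, [0.0] * 3, beta, [1.0] * 3, rng, prod_lda=True)
```

In this sketch the LDA and ProdLDA decoders differ only in where the normalization happens inside word_probs, which mirrors the small code change the paper describes. A real model would compute mu_q and log_var_q from the document with a trained network and would use an automatic-differentiation library for the gradients.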

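Experimental Results

Research Questions

RQ1: Can AEVB be applied effectively to LDA once the problems of the Dirichlet prior and component collapsing are addressed?
RQ2: Can the inference network provide fast, accurate posterior inference for new documents without test-time optimization?
RQ3: Does ProdLDA yield better topic coherence than standard LDA, and under which training conditions?
RQ4: How does AVITM compare with online mean-field inference and collapsed Gibbs sampling in topic quality and speed?
RQ5: Can AVITM serve as a black-box inference method that applies directly to new topic models?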

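Key Findings

AVITM produces topics of comparable quality to standard mean-field inference while being substantially faster at both training and test time.
The inference network estimates topic proportions for new documents without running variational optimization, with perplexity comparable to optimization-based methods.
ProdLDA consistently achieves better topic coherence than standard LDA, even when LDA is trained with collapsed Gibbs sampling.
AVITM trains on a large corpus (about 1 million documents) in under 80 minutes on a single GPU, and switching from LDA to ProdLDA takes a one-line code change.
The Laplace-approximated Dirichlet prior, together with high-momentum training and batch normalization, helps mitigate component collapsing and improves topic sparsity and coherence.
ProdLDA's higher topic sparsity correlates with its improved topic coherence, which supports the benefit of Dirichlet-like priors in neural topic models.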
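
Since several findings rest on topic coherence, a short sketch of how such a score can be computed may help. This summary does not say which coherence measure was used, so the choice of normalized pointwise mutual information (NPMI) over document-level co-occurrence is an assumption made for illustration, and the function name npmi_coherence is hypothetical.

```python
import math
from itertools import combinations

def npmi_coherence(top_words, documents):
    """Average NPMI over all pairs of a topic's top words (needs at least two words).

    documents is a list of token lists; co-occurrence is counted per document.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def doc_freq(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w_i, w_j in combinations(top_words, 2):
        p_i, p_j, p_ij = doc_freq(w_i), doc_freq(w_j), doc_freq(w_i, w_j)
        if p_ij == 0.0:
            scores.append(-1.0)  # the pair never co-occurs: NPMI lower bound
        elif p_ij == 1.0:
            scores.append(0.0)   # convention chosen here: the normalizer is zero
        else:
            pmi = math.log(p_ij / (p_i * p_j))
            scores.append(pmi / -math.log(p_ij))
    return sum(scores) / len(scores)

# Toy corpus with two clearly separated word groups.
docs = [["gpu", "cuda"], ["gpu", "cuda"], ["court", "judge"], ["court", "judge"]]
coherent = npmi_coherence(["gpu", "cuda"], docs)
incoherent = npmi_coherence(["gpu", "judge"], docs)
```

Scores lie in [-1, 1], and higher values mean the top words of a topic tend to appear in the same documents. Published evaluations typically estimate the co-occurrence statistics on a large reference corpus, not on a toy list like this one.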

This review was generated by AI and reviewed by a human editor.