QUICK REVIEW

[Paper Review] Discriminative Embeddings of Latent Variable Models for Structured Data

Hanjun Dai, Bo Dai|arXiv (Cornell University)|Mar 17, 2016

Machine Learning and Data Classification | References: 53 | Cited by: 373

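One-Sentence Summary

This paper proposes structure2vec, a scalable discriminative representation for structured data that embeds latent variable graphical models into feature spaces. Using updates inspired by mean field and loopy belief propagation, it learns representations end to end for classification and regression on sequences, trees, and graphs.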

ABSTRACT

Kernel classifiers and regressors designed for structured data, such as sequences, trees and graphs, have significantly advanced a number of interdisciplinary areas such as computational biology and drug design. Typically, kernels are designed beforehand for a data type which either exploit statistics of the structures or make use of probabilistic generative models, and then a discriminative classifier is learned based on the kernels via convex optimization. However, such an elegant two-stage approach also limited kernel methods from scaling up to millions of data points, and exploiting discriminative information to learn feature representations. We propose, structure2vec, an effective and scalable approach for structured data representation based on the idea of embedding latent variable models into feature spaces, and learning such feature spaces using discriminative information. Interestingly, structure2vec extracts features by performing a sequence of function mappings in a way similar to graphical model inference procedures, such as mean field and belief propagation. In applications involving millions of data points, we showed that structure2vec runs 2 times faster, produces models which are $10,000$ times smaller, while at the same time achieving the state-of-the-art predictive performance.

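Motivation and Objectives

Motivate scalable learning on structured data, since traditional bag-of-structures (BOS) kernels scale poorly to millions of data points.
Propose a discriminative embedding framework (structure2vec) that embeds latent variable posteriors into a finite-dimensional feature space.
Develop embedding updates, inspired by mean field and loopy belief propagation, that can be learned end to end under supervision.
Show that the method yields compact models and reaches accuracy competitive with the state of the art on both medium-scale and very large structured datasets.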

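Proposed Method

Model each structured data point as a latent variable graphical model with observed node attributes and hidden variables.
Embed the posterior marginals p(H_i | data) into a finite-dimensional feature space via a feature map phi, yielding embeddings mu_i.
Express the embedding updates as neural-network-style nonlinear maps inspired by mean field or loopy belief propagation updates (e.g., mu_i = sigma(W1 x_i + W2 ∑_{j in N(i)} mu_j)).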
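Learn the embedding transformation T and the final predictor end to end by minimizing a supervised loss (squared loss for regression, softmax cross-entropy for classification).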
For scalability, train with stochastic gradient descent and use a small explicit feature map, which avoids large kernel matrices.
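The mean-field-style update and the two losses listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: sigma is taken to be a ReLU, embeddings start at zero, updates are synchronous, and the graph-level representation is a sum over node embeddings. The helper names (`mean_field_embed`, `graph_embedding`, and so on) are hypothetical.

```python
import math
import random

def relu(v):
    return [max(0.0, a) for a in v]

def matvec(W, v):
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def vadd(u, v):
    return [a + b for a, b in zip(u, v)]

def mean_field_embed(x, neighbors, W1, W2, num_iters=3):
    """Iterate mu_i = sigma(W1 x_i + W2 * sum_{j in N(i)} mu_j).

    x: list of node feature vectors; neighbors: adjacency list.
    Assumptions: sigma is ReLU, zero initialisation, synchronous updates.
    """
    d = len(W1)
    mu = [[0.0] * d for _ in x]
    for _ in range(num_iters):
        new_mu = []
        for i, xi in enumerate(x):
            agg = [0.0] * d
            for j in neighbors[i]:
                agg = vadd(agg, mu[j])
            new_mu.append(relu(vadd(matvec(W1, xi), matvec(W2, agg))))
        mu = new_mu
    return mu

def graph_embedding(mu):
    """Sum-pool node embeddings into one vector (an assumed pooling choice)."""
    out = [0.0] * len(mu[0])
    for m in mu:
        out = vadd(out, m)
    return out

def softmax_cross_entropy(logits, label):
    """Classification loss: -log softmax(logits)[label], computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]

def squared_loss(pred, target):
    """Regression loss."""
    return (pred - target) ** 2

# Toy usage: a 3-node path graph 0 - 1 - 2 with 2-dimensional node features.
random.seed(0)
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
neighbors = [[1], [0, 2], [1]]
d = 4
W1 = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(d)]
W2 = [[random.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(d)]
V = [[random.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(2)]  # hypothetical 2-class readout

mu = mean_field_embed(x, neighbors, W1, W2)
logits = matvec(V, graph_embedding(mu))
loss = softmax_cross_entropy(logits, label=1)
```

In an end-to-end setup, the gradient of `loss` would be propagated back through the pooled embedding into `W1` and `W2` and applied with stochastic gradient descent. That step is omitted here because it would normally be delegated to an automatic differentiation library.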

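Experimental Results

Research Questions

RQ1: Can latent variable graphical models of structured data be embedded into a discriminative, trainable feature space that scales to millions of instances?
RQ2: Do embeddings inspired by mean field and loopy belief propagation deliver predictive performance competitive with fixed bag-of-structures (BOS) and graphical-model (GM) kernels on structured data tasks?
RQ3: Does training the embeddings and the final predictor end to end yield smaller models with comparable or higher accuracy on medium-scale and large-scale datasets?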

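Key Findings

The structure2vec variants (DE-MF and DE-LBP) outperform the prefix-kernel baselines in AUC on the string benchmarks.
On the string datasets, DE-MF achieves an AUC of 0.7713 on FC_RES and 0.9068 on SCOP, and DE-LBP reaches an AUC of 0.9167 on SCOP.
On the graph benchmarks, the structure2vec variants show accuracy competitive with several graph kernels (e.g., subtree, random walk, and WL kernels).
The method handles very large datasets efficiently, as demonstrated on the Harvard Clean Energy Project dataset (millions of samples), where it trains faster and produces smaller models while keeping accuracy competitive.
The mean field and loopy belief propagation embedding updates are implemented as neural-network-style modules that support end-to-end discriminative training.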
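For the loopy belief propagation variant, one plausible form of the module keeps an embedded message for every directed edge and lets each message exclude what arrived from its own recipient. The sketch below follows that pattern, with messages nu_ij = sigma(W1 x_i + W2 ∑_{k in N(i)\j} nu_ki) and node embeddings mu_i = sigma(W3 x_i + W4 ∑_{k in N(i)} nu_ki). The exact parameterisation, the ReLU choice for sigma, the zero initialisation, and the function name `lbp_embed` are assumptions made for illustration.

```python
def relu(v):
    return [max(0.0, a) for a in v]

def matvec(W, v):
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def vadd(u, v):
    return [a + b for a, b in zip(u, v)]

def lbp_embed(x, neighbors, W1, W2, W3, W4, num_iters=2):
    """Loopy-BP-style embedding on an undirected graph.

    neighbors must be symmetric: j in neighbors[i] iff i in neighbors[j].
    """
    d = len(W1)
    # One embedded message per directed edge (i -> j), initialised at zero.
    nu = {(i, j): [0.0] * d for i in range(len(x)) for j in neighbors[i]}
    for _ in range(num_iters):
        new_nu = {}
        for (i, j) in nu:
            agg = [0.0] * d
            for k in neighbors[i]:
                if k != j:  # exclude the message coming from the recipient
                    agg = vadd(agg, nu[(k, i)])
            new_nu[(i, j)] = relu(vadd(matvec(W1, x[i]), matvec(W2, agg)))
        nu = new_nu  # synchronous message update
    # Node embeddings aggregate all incoming messages.
    mu = []
    for i, xi in enumerate(x):
        agg = [0.0] * d
        for k in neighbors[i]:
            agg = vadd(agg, nu[(k, i)])
        mu.append(relu(vadd(matvec(W3, xi), matvec(W4, agg))))
    return mu

# Toy usage: 3-node path graph 0 - 1 - 2, identity weights for readability.
I2 = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
neighbors = [[1], [0, 2], [1]]
mu_lbp = lbp_embed(x, neighbors, I2, I2, I2, I2, num_iters=2)
```

Compared with the mean-field-style update, this variant stores one vector per directed edge rather than one per node, so its memory cost grows with the number of edges.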

This review was generated by AI and checked by a human editor.