QUICK REVIEW

[Paper Review] subgraph2vec: Learning Distributed Representations of Rooted Sub-graphs from Large Graphs

A. Sankara Narayanan, Mahinthan Chandramohan | arXiv (Cornell University) | Jun 29, 2016

Advanced Graph Neural Networks | References: 6 | Cited by: 139

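One-sentence summary

subgraph2vec learns distributed representations of rooted subgraphs from large graphs in an unsupervised fashion and combines them with classifiers or deep models to improve graph-related tasks, outperforming traditional graph kernels.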

ABSTRACT

In this paper, we present subgraph2vec, a novel approach for learning latent representations of rooted subgraphs from large graphs inspired by recent advancements in Deep Learning and Graph Kernels. These latent representations encode semantic substructure dependencies in a continuous vector space, which is easily exploited by statistical models for tasks such as graph classification, clustering, link prediction and community detection. subgraph2vec leverages on local information obtained from neighbourhoods of nodes to learn their latent representations in an unsupervised fashion. We demonstrate that subgraph vectors learnt by our approach could be used in conjunction with classifiers such as CNNs, SVMs and relational data clustering algorithms to achieve significantly superior accuracies. Also, we show that the subgraph vectors could be used for building a deep learning variant of Weisfeiler-Lehman graph kernel. Our experiments on several benchmark and large-scale real-world datasets reveal that subgraph2vec achieves significant improvements in accuracies over existing graph kernels on both supervised and unsupervised learning tasks. Specifically, on two real-world program analysis tasks, namely, code clone and malware detection, subgraph2vec outperforms state-of-the-art kernels by more than 17% and 4%, respectively.

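Research motivation and objectives

Establish the need for latent subgraph representations that capture semantic substructure dependencies in large graphs.
Develop an unsupervised method that uses local neighbourhood information to learn latent representations of rooted subgraphs.
Show how subgraph vectors improve downstream tasks such as graph classification, clustering, link prediction and community detection.
Show how subgraph2vec underpins a deep learning variant of the Weisfeiler-Lehman graph kernel.
Validate the approach on benchmark and large-scale real-world datasets, including code clone and malware detection tasks.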

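Proposed method

Learn latent representations of rooted subgraphs in an unsupervised fashion from the local neighbourhood information around each node.
Represent rooted subgraphs as continuous vectors suitable as input to convolutional neural networks (CNNs), support vector machines (SVMs) and relational data clustering algorithms.
Integrate the subgraph vectors into a deep learning variant of the Weisfeiler-Lehman graph kernel.
Evaluate the learned representations on diverse tasks to demonstrate accuracy gains over existing graph kernels.
Demonstrate applicability to large-scale graphs and real-world datasets.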
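
To make "local neighbourhood information" concrete, here is a minimal sketch of how rooted subgraphs might be enumerated and paired with their contexts before any embedding is trained. It assumes Weisfeiler-Lehman style relabelling to name the rooted subgraph of each degree around a node, and it pairs each rooted subgraph with the same-degree subgraphs of neighbouring nodes. The function names `wl_rooted_subgraphs` and `context_pairs`, the string encoding of subgraphs and the toy graph are illustrative assumptions; the summary above does not specify these details.

```python
def wl_rooted_subgraphs(adj, labels, max_degree):
    """Return {node: [id_degree_0, ..., id_degree_max]}.

    Each string identifies the rooted subgraph of that degree around the
    node, built by Weisfeiler-Lehman style relabelling (an assumption).
    """
    current = {v: str(labels[v]) for v in adj}
    history = {v: [current[v]] for v in adj}
    for _ in range(max_degree):
        relabelled = {}
        for v in adj:
            # Sort neighbour labels so the identifier ignores neighbour order.
            neighbours = sorted(current[u] for u in adj[v])
            relabelled[v] = current[v] + "(" + ",".join(neighbours) + ")"
        current = relabelled
        for v in adj:
            history[v].append(current[v])
    return history


def context_pairs(adj, history):
    """Pair each rooted subgraph with same-degree subgraphs of its neighbours."""
    pairs = []
    for v in adj:
        for degree, target in enumerate(history[v]):
            for u in adj[v]:
                pairs.append((target, history[u][degree]))
    return pairs


# Hypothetical labelled path graph: a - b - c
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
labels = {"a": "X", "b": "Y", "c": "X"}

history = wl_rooted_subgraphs(adj, labels, max_degree=1)
pairs = context_pairs(adj, history)
```

The resulting (target, context) pairs are the kind of input a skip-gram style model could consume to produce one vector per rooted subgraph. Training that model is left out of this sketch.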

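Experimental results

Research questions

RQ1: Can latent representations of rooted subgraphs be learned from large graphs in an unsupervised fashion?
RQ2: Do subgraph vectors improve graph classification, clustering, link prediction and community detection compared with existing graph kernels?
RQ3: Can the subgraph representations be used to build a deep learning variant of the Weisfeiler-Lehman graph kernel?
RQ4: Do experiments on benchmark and real-world datasets show significant accuracy improvements for subgraph2vec?
RQ5: Specifically, how do subgraph2vec-based methods perform on code clone and malware detection tasks?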

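Key findings

Subgraph vectors can be combined with classifiers such as CNNs and SVMs to achieve higher accuracy than traditional kernels.
The approach yields a deep learning variant of the Weisfeiler-Lehman graph kernel.
Experiments on benchmark and large-scale real-world datasets show significant accuracy improvements over existing graph kernels on both supervised and unsupervised tasks.
On the code clone detection task, subgraph2vec outperforms state-of-the-art kernels by more than 17%.
On the malware detection task, subgraph2vec outperforms state-of-the-art kernels by more than 4%.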
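
The summary does not give the formula for the deep learning variant of the Weisfeiler-Lehman kernel. One plausible formulation, assumed here for illustration, replaces the exact-match count of shared rooted subgraphs with a similarity-weighted count: k(G, H) = sum over i, j of count_G[i] * M[i][j] * count_H[j], where M[i][j] is the cosine similarity between the vectors of subgraphs i and j. The embeddings below are made-up values, and `deep_kernel` and `plain_kernel` are hypothetical names.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def plain_kernel(subgraphs_g, subgraphs_h):
    """Exact-match kernel: only identical rooted subgraphs contribute."""
    count_g, count_h = Counter(subgraphs_g), Counter(subgraphs_h)
    return sum(n * count_h[s] for s, n in count_g.items())


def deep_kernel(subgraphs_g, subgraphs_h, embeddings):
    """Similarity-weighted kernel: similar subgraphs also contribute."""
    count_g, count_h = Counter(subgraphs_g), Counter(subgraphs_h)
    total = 0.0
    for s_i, n_i in count_g.items():
        for s_j, n_j in count_h.items():
            total += n_i * n_j * cosine(embeddings[s_i], embeddings[s_j])
    return total


# Made-up 2-d vectors: "A" and "C" are distinct subgraphs with similar vectors.
embeddings = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 0.0]}
graph_g = ["A", "A", "B"]  # rooted subgraphs found in graph G
graph_h = ["C", "B"]       # rooted subgraphs found in graph H

k_plain = plain_kernel(graph_g, graph_h)
k_deep = deep_kernel(graph_g, graph_h, embeddings)
```

In this toy case the exact-match kernel credits only the shared subgraph "B", while the weighted version also credits the "A"/"C" pairs because their vectors coincide. That is the intuition behind letting learned subgraph vectors inform the kernel.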

This review was generated by AI and reviewed by a human editor.