QUICK REVIEW

[Paper Review] High Order Stochastic Graphlet Embedding for Graph-Based Pattern Recognition.

Anjan Dutta, Hichem Sahbi|arXiv (Cornell University)|Feb 1, 2017

Graph Theory and Algorithms|References: 39|Cited by: 18

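One-Sentence Summary

This paper proposes high order stochastic graphlet embedding (SGE), a method that explicitly maps a graph into a high-dimensional vector space by randomly sampling large, high order graphlets and hashing them into sets of isomorphic graphlets, so as to model local structures and their interactions. Combined with support vector machines (SVMs), SGE improves graph classification accuracy on standard benchmark datasets.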

ABSTRACT

Graph-based methods are known to be successful for pattern description and comparison purpose. However, a lot of mathematical tools are unavailable in graph domain, thus restricting the generic graph-based techniques to be applicable within the machine learning framework. A way to tackle this problem is graph embedding into high dimensional space in either an explicit or implicit manner. In this paper, we propose high order stochastic graphlet embedding (SGE) that explicitly embed a graph into a real vector space. Our main contribution includes a new stochastic search procedure that allows one to efficiently parse a given graph and extract or sample unlimitedly high order graphlets. We consider these graphlets with increasing size in order to model local features, as well as, their complex interactions. We also introduce or design graph hash functions with very low probability of collision to hash those sampled graphlets for partitioning them into sets of isomorphic ones and measure their distribution in large graph collections, which results in accurate graph descriptions. When combined with support vector machines, these high order graphlet-based descriptions have positive impact on the performance of graph-based pattern comparison and classification as corroborated through experiments on different standard benchmark databases.

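Research Motivation and Objectives

Address the lack of mathematical tools in the graph domain, which hinders the application of machine learning techniques to graph-based pattern recognition.
Overcome the limitations of existing graph embedding methods by providing an explicit high-dimensional representation of graphs.
Develop an efficient stochastic procedure for sampling graphlets of arbitrarily high order from large graphs.
Design low-collision graph hash functions that group isomorphic graphlets and capture their distribution across large graph collections.
Improve graph classification performance by using high order graphlets to model complex local features and their interactions.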

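Proposed Method

Propose an efficient stochastic search procedure that samples high order graphlets (subgraphs of increasing size) from a given graph.
Use graph hash functions with a very low collision probability to identify and group isomorphic graphlets for distribution analysis.
Build a vector representation of each graph from the frequency distribution of isomorphic graphlets at each order.
Combine the resulting high order graphlet features with a support vector machine (SVM) for classification.
Rely on the explicit vector-space embedding so that graph-structured data can be fed into standard machine learning pipelines.
Maintain scalability and accuracy by controlling the sampling process and by using effective hashing to minimize information loss.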
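The first three steps above (sampling, hashing, and histogram construction) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `sample_graphlet`, `degree_hash`, and `embed`, the adjacency-dictionary input format, and the sampling schedule (a fixed number of samples per order) are all assumptions made for illustration. The sorted degree sequence is used here only as a simple isomorphism-invariant stand-in hash; it can collide for larger graphlets, and this summary does not specify the hash functions the paper actually designs.

```python
import random
from collections import Counter


def sample_graphlet(adj, num_edges, rng):
    """Grow one connected graphlet by taking num_edges random edge steps.

    adj maps each node to the set of its neighbours (undirected, no
    self-loops). A step may re-pick an edge already in the graphlet,
    so the result can contain fewer than num_edges distinct edges.
    """
    start = rng.choice(sorted(adj))
    nodes = {start}
    edges = set()
    for _ in range(num_edges):
        u = rng.choice(sorted(nodes))      # random node already in the graphlet
        if not adj[u]:                     # isolated node: nothing to add
            break
        v = rng.choice(sorted(adj[u]))     # random neighbour of that node
        nodes.add(v)
        edges.add(frozenset((u, v)))
    return edges


def degree_hash(edges):
    """Illustrative hash: the sorted degree sequence of the graphlet.

    Isomorphic graphlets always receive the same value; non-isomorphic
    ones may collide, so this is only a stand-in for a real graph hash.
    """
    degree = Counter()
    for edge in edges:
        for node in edge:
            degree[node] += 1
    return tuple(sorted(degree.values()))


def embed(adj, max_edges, num_samples, seed=0):
    """Return a normalised histogram over (edge count, hash) bins."""
    rng = random.Random(seed)
    counts = Counter()
    for order in range(1, max_edges + 1):
        for _ in range(num_samples):
            edges = sample_graphlet(adj, order, rng)
            counts[(len(edges), degree_hash(edges))] += 1
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}
```

To compare several graphs, the histograms would need to be aligned on a shared set of bins (for example, the union of keys observed across the collection, with missing bins set to zero) before being passed to a classifier such as an SVM.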

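Experimental Results

Research Questions

RQ1: Can stochastic sampling of high order graphlets yield a more expressive and scalable graph embedding?
RQ2: How effective are low-collision graph hash functions at grouping isomorphic graphlets to build a distribution-based representation?
RQ3: To what extent do high order graphlet features improve graph classification performance compared with low order or non-embedding methods?
RQ4: Does the proposed method generalize across diverse graph collections and benchmark datasets for pattern recognition?
RQ5: To what extent do the complex interactions between local graph structures, as captured by high order graphlets, contribute to the gain in classification performance?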

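Key Findings

The proposed high order stochastic graphlet embedding (SGE) outperforms baseline methods on graph classification tasks.
The stochastic sampling strategy keeps the extraction of high order graphlets computationally affordable, without significant overhead.
Low-collision graph hash functions group isomorphic graphlets effectively, preserving structural information and enabling accurate distribution modeling.
Combined with an SVM, the method has a positive impact on graph-based pattern comparison and classification.
Experiments on standard benchmark databases support the effectiveness and robustness of SGE across diverse graph types.
High order graphlets capture complex local interactions and therefore yield more discriminative graph representations than low order graphlets.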


This review was generated by AI and reviewed by human editors.