QUICK REVIEW

[Paper Review] Heterogeneous Deep Graph Infomax

Yuxiang Ren, Bo Liu | arXiv (Cornell University) | Nov 19, 2019

Advanced Graph Neural Networks | References: 44 | Cited by: 66

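One-Sentence Summary

HDGI is an unsupervised graph neural network for heterogeneous graphs. It maximizes local-global mutual information over the semantics induced by meta-paths and fuses multiple meta-paths with semantic-level attention. Without using labels, it achieves state-of-the-art results among unsupervised methods on node classification and clustering.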

ABSTRACT

Graph representation learning is to learn universal node representations that preserve both node attributes and structural information. The derived node representations can be used to serve various downstream tasks, such as node classification and node clustering. When a graph is heterogeneous, the problem becomes more challenging than the homogeneous graph node learning problem. Inspired by the emerging information theoretic-based learning algorithm, in this paper we propose an unsupervised graph neural network Heterogeneous Deep Graph Infomax (HDGI) for heterogeneous graph representation learning. We use the meta-path structure to analyze the connections involving semantics in heterogeneous graphs and utilize graph convolution module and semantic-level attention mechanism to capture local representations. By maximizing local-global mutual information, HDGI effectively learns high-level node representations that can be utilized in downstream graph-related tasks. Experiment results show that HDGI remarkably outperforms state-of-the-art unsupervised graph representation learning methods on both classification and clustering tasks. By feeding the learned representations into a parametric model, such as logistic regression, we even achieve comparable performance in node classification tasks when comparing with state-of-the-art supervised end-to-end GNN models.

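Motivation and Goals

Motivate and address unsupervised representation learning on heterogeneous graphs.
Model how nodes and edges of multiple types convey rich semantics through meta-paths.
Propose a mutual information (MI) based objective for learning node representations without labels.
Use meta-path-specific encoders and semantic-level attention to fuse semantics.
Demonstrate effectiveness against baselines on node classification and clustering tasks.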

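Proposed Method

Define the heterogeneous graph and one adjacency matrix per meta-path, each capturing a distinct semantic relation.
Compute meta-path-specific node representations with a GCN or GAT on each resulting homogeneous subgraph.
Aggregate across meta-paths with semantic-level attention to obtain the fused node representation H.
Derive the graph-level summary vector s with a global encoder (averaging, pooling, or Set2vec).
Maximize mutual information between the local node representations H and the global summary s, using a discriminator D trained with negative sampling.
Generate negative samples by shuffling node features while keeping the meta-path adjacency matrices unchanged; train D with a binary cross-entropy (log) loss on positive and negative pairs, which serves as a lower-bound objective for MI.
Train end to end by backpropagation, so that representations are learned without labels.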
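
The steps listed above can be sketched end to end on a toy graph. What follows is a minimal, dependency-free Python sketch under stated assumptions, not the authors' implementation. All function names (`metapath_adjacency`, `encode`, `semantic_attention`, `embed`, `infomax_loss`), the toy paper-author and paper-subject incidence matrices, the single-layer ReLU(AXW) encoder, the simplified attention score, the mean readout, and the bilinear discriminator are illustrative choices. It computes one forward pass of the loss only; actual training would minimize this loss with an autodiff framework.

```python
import math
import random


def matmul(a, b):
    """Plain nested-list matrix product (kept dependency-free on purpose)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def metapath_adjacency(incidence):
    """Row-normalized adjacency (with self-loops) for a symmetric meta-path.

    incidence[i][j] = 1 if target node i (e.g. a paper) touches intermediate
    node j (e.g. an author), so P-A-P connectivity is incidence @ incidence^T.
    """
    transposed = [list(col) for col in zip(*incidence)]
    counts = matmul(incidence, transposed)
    n = len(counts)
    adj = [[1.0 if counts[i][j] > 0 or i == j else 0.0 for j in range(n)]
           for i in range(n)]
    return [[v / sum(row) for v in row] for row in adj]


def encode(adj, feats, weight):
    """One GCN-like layer for a single meta-path: ReLU(A X W)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(adj, feats), weight)]


def semantic_attention(per_path, q):
    """Fuse per-meta-path embeddings with softmax weights.

    Simplified scoring (assumption): mean over nodes of q . tanh(h),
    without the extra linear layer a full model would likely include.
    """
    scores = [sum(sum(qk * math.tanh(hk) for qk, hk in zip(q, h)) for h in emb) / len(emb)
              for emb in per_path]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    betas = [e / sum(exps) for e in exps]
    fused = [[sum(b * emb[i][k] for b, emb in zip(betas, per_path))
              for k in range(len(q))]
             for i in range(len(per_path[0]))]
    return fused, betas


def embed(adjs, feats, params):
    """Meta-path-specific encoders followed by semantic-level fusion."""
    per_path = [encode(a, feats, w) for a, w in zip(adjs, params["enc"])]
    return semantic_attention(per_path, params["q"])


def infomax_loss(adjs, feats, params, rng):
    """Binary cross-entropy over (node, summary) pairs, in the style of DGI."""
    pos, _ = embed(adjs, feats, params)
    shuffled = feats[:]
    rng.shuffle(shuffled)  # negatives: shuffle feature rows, keep adjacency
    neg, _ = embed(adjs, shuffled, params)
    dim = len(pos[0])
    summary = [sigmoid(sum(h[k] for h in pos) / len(pos)) for k in range(dim)]
    ws = [sum(w * s for w, s in zip(row, summary)) for row in params["disc"]]  # W s

    def score(h):  # D(h, s) = sigmoid(h^T W s)
        return sigmoid(sum(hk * wk for hk, wk in zip(h, ws)))

    eps = 1e-12
    total = sum(math.log(score(hp) + eps) + math.log(1.0 - score(hn) + eps)
                for hp, hn in zip(pos, neg))
    return -total / (2 * len(pos))


# Toy data (hypothetical): 4 papers, 3 authors, 2 subjects, 3 input features.
rng = random.Random(0)
paper_author = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1]]
paper_subject = [[1, 0], [1, 0], [0, 1], [0, 1]]
adjs = [metapath_adjacency(paper_author), metapath_adjacency(paper_subject)]  # PAP, PSP
feats = [[1.0, 0.0, 0.5], [0.9, 0.1, 0.4], [0.0, 1.0, 0.2], [0.1, 0.8, 0.9]]
hidden = 2
params = {
    "enc": [[[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(3)] for _ in adjs],
    "q": [rng.uniform(-1, 1) for _ in range(hidden)],
    "disc": [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(hidden)],
}
loss = infomax_loss(adjs, feats, params, rng)
print(f"toy HDGI-style loss: {loss:.4f}")
```

Note that only the feature rows are shuffled for the negative pass; the same PAP and PSP adjacency matrices are reused, which mirrors the negative-sampling step described in the list.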

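Experimental Results

Research Questions

RQ1: Can the MI-based objective be extended effectively from homogeneous to heterogeneous graphs?
RQ2: Do meta-paths combined with semantic-level attention capture the diverse semantics of a heterogeneous graph and yield robust representations?
RQ3: How does HDGI perform on unsupervised node classification and clustering compared with supervised GNNs and other unsupervised methods?
RQ4: How do different global encoders (averaging, pooling, Set2vec) affect the learned representations?
RQ5: Does negative-sample quality affect mutual information maximization in the heterogeneous setting?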

Key Findings

Dataset	Train	Metric	Raw	M2V	DW	GCN	RGCN	GAT	HAN	DW+F	DGI	HDGI-A	HDGI-C
ACM	20%	Micro-F1	0.8590	0.6125	0.5503	0.9250	0.5766	0.9178	0.9267	0.8785	0.9104	0.9178	0.9227
ACM	20%	Macro-F1	0.8585	0.6158	0.5582	0.9248	0.5801	0.9172	0.9268	0.8789	0.9104	0.9170	0.9232
DBLP	20%	Micro-F1	0.7552	0.6985	0.2805	0.8192	0.1932	0.8244	0.8992	0.7163	0.8975	0.9062	0.9175
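
The table reports both Micro-F1 and Macro-F1. As a reminder of how the two differ, here is a small sketch of the standard definitions for single-label multi-class predictions. The function name `f1_scores` and the toy labels are illustrative assumptions, not the paper's evaluation code.

```python
def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = {c: 0 for c in labels}
    fp = {c: 0 for c in labels}
    fn = {c: 0 for c in labels}
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[t] += 1  # the true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool counts over all classes. Macro: average the per-class F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Imbalanced toy example: the majority class is always right,
# while the two rare classes are swapped.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 2, 1]
micro, macro = f1_scores(y_true, y_pred)
print(f"micro={micro:.4f} macro={macro:.4f}")  # micro=0.6667 macro=0.3333
```

Micro-F1 pools counts over all classes, so in this single-label setting it equals overall accuracy, while Macro-F1 averages per-class scores and drops when rare classes are misclassified.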

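HDGI outperforms state-of-the-art unsupervised methods on node classification and clustering across several heterogeneous datasets.
HDGI-C and HDGI-A achieve strong node classification results. In the table above, at 20% training, both exceed the supervised HAN baseline on DBLP (Micro-F1 0.9175 and 0.9062 versus 0.8992) and come within 0.01 of HAN on ACM (0.9227 and 0.9178 versus 0.9267).
Meta-path-based attention effectively integrates semantics from meta-paths such as PAP, PSP, MAM, MDM, and MKM, improving representation quality.
The MI objective, together with the learned discriminator, encourages representations that preserve global graph information while incorporating local attributes.
Paired with a simple downstream classifier such as logistic regression, HDGI's unsupervised representations are competitive with end-to-end supervised GNN models and in some cases better.
Experiments on the ACM, DBLP, and IMDB datasets indicate that HDGI is robust across different heterogeneous graph structures and metadata.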


This review was generated by AI and reviewed by human editors.