QUICK REVIEW

[Paper Review] Neo-GNNs: Neighborhood Overlap-aware Graph Neural Networks for Link Prediction

Seongjun Yun, Seoyoon Kim|arXiv (Cornell University)|Jun 9, 2022

Advanced Graph Neural Networks | Cited by 47

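One-sentence summary

Neo-GNNs learn structural features from a graph's adjacency matrix to estimate overlapped neighborhoods for link prediction and, when adaptively combined with feature-based GNNs, achieve state-of-the-art results on Open Graph Benchmark datasets.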

ABSTRACT

Graph Neural Networks (GNNs) have been widely applied to various fields for learning over graph-structured data. They have shown significant improvements over traditional heuristic methods in various tasks such as node classification and graph classification. However, since GNNs heavily rely on smoothed node features rather than graph structure, they often show poorer performance than simple heuristic methods in link prediction where the structural information, e.g., overlapped neighborhoods, degrees, and shortest paths, is crucial. To address this limitation, we propose Neighborhood Overlap-aware Graph Neural Networks (Neo-GNNs) that learn useful structural features from an adjacency matrix and estimate overlapped neighborhoods for link prediction. Our Neo-GNNs generalize neighborhood overlap-based heuristic methods and handle overlapped multi-hop neighborhoods. Our extensive experiments on Open Graph Benchmark datasets (OGB) demonstrate that Neo-GNNs consistently achieve state-of-the-art performance in link prediction. Our code is publicly available at https://github.com/seongjunyun/Neo_GNNs.

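Research motivation and objectives

Motivation: link prediction needs structure-aware models that go beyond GNNs relying only on smoothed node features.
Introduce Neo-GNNs, which learn structural features from the adjacency matrix and model overlapped neighborhoods.
Develop an overlap-aware aggregation mechanism for multi-hop neighborhoods.
Adaptively fuse the structural (Neo-GNN) score with a conventional feature-based GNN score in a single end-to-end framework.
Demonstrate state-of-the-art performance on four OGB link prediction datasets.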

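Proposed method

The structural feature generator F_theta learns a structural feature for each node from the adjacency matrix, using two multilayer perceptrons (MLPs), one applied to edges and one applied to nodes (Eq. 6).
Neighborhood overlap-aware aggregation builds a diagonal matrix X_struct from x_struct and computes Z = A X_struct, so that the inner product z_i^T z_j collects contributions only from neighbors shared by i and j (Eqs. 7 and 8).
The aggregation extends to multi-hop neighborhoods as Z = g_Phi(∑_{l=1}^L beta^{l-1} A^l X_struct) (Eq. 9).
A trainable alpha combines the structural score with the feature-based GNN score: y_hat_{ij} = alpha * sigma(z_i^T z_j) + (1 - alpha) * sigma(s(h_i, h_j)) (Eq. 11).
The model is trained end to end with three BCE losses, one each on the combined Neo-GNN prediction, the structural score, and the feature-based GNN score (Eq. 12).
Computation scales through sparse matrix representations and precomputed A^l terms (see the complexity discussion in Section 3.3).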
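
The steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is stored as a dict-of-dicts sparse adjacency, f_edge and f_node are arbitrary scalar callables standing in for the two MLPs, g_Phi is taken to be the identity, the feature-based score s(h_i, h_j) is passed in as a precomputed logit, and every function name is hypothetical.

```python
import math
from collections import defaultdict

def structural_features(adj, f_edge, f_node):
    """Eq. 6 sketch: x_i = f_node(sum over neighbors j of f_edge(A_ij)).
    adj maps node -> {neighbor: edge weight}; f_edge and f_node are
    stand-ins (assumptions) for the two MLPs."""
    return {i: f_node(sum(f_edge(w) for w in nbrs.values()))
            for i, nbrs in adj.items()}

def sparse_matmul(P, adj):
    """Return P @ A for dict-of-dicts sparse matrices."""
    out = {}
    for i, row in P.items():
        acc = defaultdict(float)
        for m, p_im in row.items():
            for k, a_mk in adj[m].items():
                acc[k] += p_im * a_mk
        out[i] = dict(acc)
    return out

def overlap_embeddings(adj, x_struct, num_hops=1, beta=0.5):
    """Eq. 9 sketch with g_Phi assumed to be the identity:
    Z = sum_{l=1..L} beta^(l-1) * A^l * diag(x_struct)."""
    Z = {i: defaultdict(float) for i in adj}
    power = {i: dict(nbrs) for i, nbrs in adj.items()}  # A^1
    for hop in range(1, num_hops + 1):
        for i, row in power.items():
            for k, a_ik in row.items():
                Z[i][k] += beta ** (hop - 1) * a_ik * x_struct[k]
        if hop < num_hops:
            power = sparse_matmul(power, adj)  # A^(hop+1)
    return Z

def dot(u, v):
    """Sparse dot product of two embedding rows."""
    if len(u) > len(v):
        u, v = v, u
    return sum(val * v[k] for k, val in u.items() if k in v)

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def combined_score(z_i, z_j, feature_logit, alpha):
    """Eq. 11: alpha * sigma(z_i^T z_j) + (1 - alpha) * sigma(s(h_i, h_j)).
    feature_logit plays the role of s(h_i, h_j) from any feature-based GNN."""
    struct = sigmoid(dot(z_i, z_j))
    feat = sigmoid(feature_logit)
    return alpha * struct + (1.0 - alpha) * feat, struct, feat

def bce(p, y, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def total_loss(y_hat, struct, feat, y):
    """Eq. 12 as summarized above: one BCE term per score."""
    return bce(y_hat, y) + bce(struct, y) + bce(feat, y)

# Toy graph: nodes 0 and 1 are not linked but share neighbors 2 and 3.
adj = {0: {2: 1.0, 3: 1.0}, 1: {2: 1.0, 3: 1.0},
       2: {0: 1.0, 1: 1.0, 3: 1.0}, 3: {0: 1.0, 1: 1.0, 2: 1.0}}
x_struct = {0: 0.5, 1: 1.0, 2: 2.0, 3: 3.0}  # hand-set instead of learned
Z = overlap_embeddings(adj, x_struct, num_hops=1)
print(dot(Z[0], Z[1]))  # 13.0 = 2.0**2 + 3.0**2, from shared neighbors only
```

With a single hop, the structural logit for a pair is the sum of squared structural features over their shared neighbors, which is why only nodes 2 and 3 contribute in the toy example. In a real model the features, alpha, and the feature-based GNN would all be trained jointly; here they are fixed by hand to keep the arithmetic visible.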

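Experimental results

Research questions

RQ1: Does learning structural features from the adjacency matrix improve link prediction beyond conventional GNNs that rely on node features?
RQ2: How can neighborhood overlap, including multi-hop overlap, be incorporated effectively into GNNs for link prediction?
RQ3: Does adaptively combining the structural score with the feature-based GNN score yield better performance across datasets?
RQ4: To what extent can Neo-GNNs generalize neighborhood-overlap heuristics such as Common Neighbors, Adamic Adar, and Resource Allocation?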

Key findings

Method	OGB-PPA (Hits@100)	OGB-Collab (Hits@50)	OGB-DDI (Hits@20)	OGB-Citation2 (MRR)
Common Neighbors	27.65±0.00	50.06±0.00	17.73±0.00	76.20±0.00
Adamic Adar	32.45±0.00	53.00±0.00	18.61±0.00	76.12±0.00
Resource Allocation	49.33±0.00	52.89±0.00	6.23±0.00	76.20±0.00
Matrix Factorization	27.83±2.02	38.74±0.30	17.92±3.57	53.08±4.19
Node2Vec	17.24±0.76	41.36±0.69	21.95±1.58	53.47±0.12
MLP	0.47±0.05	19.98±0.96	N/A	28.99±0.16
GCN	16.98±1.33	47.01±0.79	44.60±8.87	84.79±0.24
GraphSAGE	13.93±2.38	48.60±0.46	48.01±9.02	82.64±0.01
JK-Net	11.40±2.04	48.84±0.83	57.98±6.88	OOM
GAT	OOM	44.89±1.23	29.51±6.40	OOM
SEAL	48.15±4.17	54.37±0.02	26.25±6.00	86.32±0.52
Neo-GNN	49.13±0.60	57.52±0.37	63.57±3.52	87.26±0.84

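Neo-GNNs achieve state-of-the-art link prediction performance across the four OGB datasets. In the table above, the one exception is OGB-PPA, where Resource Allocation is marginally higher (49.33 vs. 49.13).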
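Neo-GNNs without the GCN component outperform the baseline GNNs on most datasets, which demonstrates the value of structural information.
The adaptive combination coefficient alpha balances structural and feature signals according to each dataset, often reaching performance that neither component achieves alone.
Multi-hop overlapped neighborhoods (L > 1) improve performance, with the decay factor beta controlling the contribution of more distant hops.
Neo-GNNs can recover and align with neighborhood-overlap heuristics: on OGB-PPA, their scores show high Spearman correlation with Resource Allocation, Adamic Adar, and Common Neighbors.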
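Even on OGB-DDI, which has no input node features (hence the N/A entry for MLP), Neo-GNNs achieve the best result, underscoring the strength of structure-aware learning.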
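
To make the link to the classical heuristics concrete: on an unweighted graph with one hop and g_Phi as the identity, z_i^T z_j is the sum of x_k^2 over the common neighbors k of i and j. The sketch below checks that hand-picked (not learned) structural features then reproduce Common Neighbors, Adamic Adar, and Resource Allocation exactly. The function names and the toy graph are hypothetical, and nothing here implies that the trained model learns these particular features.

```python
import math

def heuristic_scores(neighbors, i, j):
    """Classical neighborhood-overlap heuristics for the pair (i, j)."""
    common = neighbors[i] & neighbors[j]
    return {
        "CN": float(len(common)),
        "AA": sum(1.0 / math.log(len(neighbors[k])) for k in common),
        "RA": sum(1.0 / len(neighbors[k]) for k in common),
    }

def handpicked_features(neighbors, kind):
    """Structural features chosen by hand so that x_k^2 equals the
    per-neighbor weight of the named heuristic (an illustrative choice)."""
    feats = {}
    for k, nbrs in neighbors.items():
        d = len(nbrs)
        if kind == "CN":
            feats[k] = 1.0
        elif kind == "AA":
            # A degree-1 node can never be a common neighbor, so 0.0 is safe.
            feats[k] = 1.0 / math.sqrt(math.log(d)) if d > 1 else 0.0
        elif kind == "RA":
            feats[k] = 1.0 / math.sqrt(d) if d > 0 else 0.0
        else:
            raise ValueError(f"unknown heuristic: {kind}")
    return feats

def structural_score(neighbors, x_struct, i, j):
    """One-hop z_i^T z_j for Z = A X_struct on an unweighted graph."""
    return sum(x_struct[k] ** 2 for k in neighbors[i] & neighbors[j])

# Toy graph: nodes 0 and 1 share neighbors 2 and 3, each of degree 3.
neighbors = {0: {2, 3}, 1: {2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
target = heuristic_scores(neighbors, 0, 1)
for kind in ("CN", "AA", "RA"):
    x = handpicked_features(neighbors, kind)
    print(kind, structural_score(neighbors, x, 0, 1), target[kind])
```

Each printed pair should agree up to floating-point rounding. Learned features are free to move away from these three special cases, which is the sense in which the model generalizes the heuristics.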


This review was generated by AI and checked by a human editor.