QUICK REVIEW

[Paper Review] LRGB: Long Range Graph Benchmark

Vijay Prakash Dwivedi, Ladislav Rampášek|arXiv (Cornell University)|Jun 16, 2022

Advanced Graph Neural Networks | Cited by 31

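ONE-SENTENCE SUMMARY

This paper introduces the Long Range Graph Benchmark (LRGB), a set of five datasets (PascalVOC-SP, COCO-SP, PCQM-Contact, Peptides-func, Peptides-struct) targeting tasks that require long-range interaction reasoning, and shows that fully connected Graph Transformers outperform local MP-GNNs on these tasks.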

ABSTRACT

Graph Neural Networks (GNNs) that are based on the message passing (MP) paradigm exchange information between 1-hop neighbors to build node representations at each layer. In principle, such networks are not able to capture long-range interactions (LRI) that may be desired or necessary for learning a given task on graphs. Recently, there has been an increasing interest in development of Transformer-based methods for graphs that can consider full node connectivity beyond the original sparse structure, thus enabling the modeling of LRI. However, MP-GNNs that simply rely on 1-hop message passing often fare better in several existing graph benchmarks when combined with positional feature representations, among other innovations, hence limiting the perceived utility and ranking of Transformer-like architectures. Here, we present the Long Range Graph Benchmark (LRGB) with 5 graph learning datasets: PascalVOC-SP, COCO-SP, PCQM-Contact, Peptides-func and Peptides-struct that arguably require LRI reasoning to achieve strong performance in a given task. We benchmark both baseline GNNs and Graph Transformer networks to verify that the models which capture long-range dependencies perform significantly better on these tasks. Therefore, these datasets are suitable for benchmarking and exploration of MP-GNNs and Graph Transformer architectures that are intended to capture LRI.
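
The abstract's point that 1-hop message passing limits how far information travels can be illustrated with a toy simulation. The sketch below is not from the paper: it tracks which source nodes can influence each node after a given number of layers, using a hypothetical `receptive_fields` helper and a path graph chosen purely for illustration.

```python
def receptive_fields(adjacency, num_layers):
    """For each node, return the set of nodes whose features can reach it
    after `num_layers` rounds of 1-hop message passing.

    `adjacency` maps each node to a list of its neighbors (undirected graph).
    """
    # Layer 0: every node only knows its own features.
    seen = {node: {node} for node in adjacency}
    for _ in range(num_layers):
        updated = {}
        for node, neighbors in adjacency.items():
            # A node keeps what it has and merges in what each neighbor has.
            merged = set(seen[node])
            for neighbor in neighbors:
                merged |= seen[neighbor]
            updated[node] = merged
        seen = updated
    return seen


def path_graph(n):
    """Undirected path 0 - 1 - ... - (n-1) as an adjacency dict (toy input)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


if __name__ == "__main__":
    graph = path_graph(10)
    for layers in (1, 3, 9):
        # Show which nodes can influence node 0 at each depth.
        print(layers, sorted(receptive_fields(graph, layers)[0]))
```

On a 10-node path, node 0 needs 9 layers before the far end can influence it, whereas a fully connected attention layer relates every pair of nodes in a single step.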

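RESEARCH MOTIVATION AND GOALS

Establish the need for benchmarks that require long-range interaction (LRI) reasoning.
Introduce five real-world datasets, spanning vision and chemistry, that emphasize long-range dependencies.
Benchmark baseline models on these datasets, comparing local MP-GNNs against fully connected Graph Transformers.
Analyze the factors that determine when LRI-capable models gain an advantage over conventional MP-GNNs.
Provide baselines and insights to guide future graph models that target LRI.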

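PROPOSED METHOD

Propose five datasets (PascalVOC-SP, COCO-SP, PCQM-Contact, Peptides-func, Peptides-struct) as an LRI benchmark.
Construct graphs either from SLIC superpixels (PascalVOC-SP, COCO-SP) or from molecular and peptide representations (PCQM-Contact, Peptides-func, Peptides-struct).
Characterize each dataset by graph size, the nature of the task, and the contribution of global graph structure to the task, in order to justify the relevance of LRI.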
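Evaluate baselines under a fixed parameter budget (about 500k): local MP-GNNs (GCN, GCNII, GINE, GatedGCN) and Graph Transformers (Transformer with Laplacian positional encodings (LapPE), and SAN with either LapPE or random-walk structural encodings (RWSE)).
Formulate PCQM-Contact as a contact-prediction task over pairs of atoms that are distant in the molecular graph but close in 3D space, so that solving it demands long-range reasoning.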
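
As a rough illustration of the PCQM-Contact setup, the sketch below labels a node pair as a contact when the two nodes are many hops apart in the graph yet spatially close. The function names, the toy hairpin coordinates, and the thresholds (more than 5 hops, less than 3.5 distance units) are assumptions made for this example, not the benchmark's actual preprocessing code.

```python
from collections import deque
from itertools import combinations
import math


def hop_distances(adjacency, source):
    """Breadth-first search: hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def contact_pairs(adjacency, coords, hop_threshold, distance_threshold):
    """Return sorted pairs (u, v), u < v, that are more than `hop_threshold`
    hops apart in the graph but closer than `distance_threshold` in 3D.

    Both thresholds are illustrative parameters. Pairs with no connecting
    path are skipped.
    """
    hops = {node: hop_distances(adjacency, node) for node in adjacency}
    pairs = []
    for u, v in combinations(sorted(adjacency), 2):
        if v not in hops[u]:
            continue  # disconnected pair: no hop distance to compare
        far_in_graph = hops[u][v] > hop_threshold
        close_in_space = math.dist(coords[u], coords[v]) < distance_threshold
        if far_in_graph and close_in_space:
            pairs.append((u, v))
    return pairs


def hairpin_example():
    """Toy 8-node chain folded back on itself so that its two ends meet."""
    adjacency = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
    coords = {
        0: (0.0, 0.0, 0.0), 1: (1.5, 0.0, 0.0), 2: (3.0, 0.0, 0.0), 3: (4.5, 0.0, 0.0),
        4: (4.5, 3.0, 0.0), 5: (3.0, 3.0, 0.0), 6: (1.5, 3.0, 0.0), 7: (0.0, 3.0, 0.0),
    }
    return adjacency, coords


if __name__ == "__main__":
    adjacency, coords = hairpin_example()
    print(contact_pairs(adjacency, coords, hop_threshold=5, distance_threshold=3.5))
```

In the toy hairpin, the labeled pairs sit at opposite ends of the chain, so a model must relate nodes six or seven hops apart to predict them.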

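EXPERIMENTAL RESULTS

Research Questions

RQ1: Do models capable of long-range information propagation outperform local MP-GNNs on the proposed LRGB tasks?
RQ2: How do graph size, the nature of the task, and global structure affect the need for long-range modeling?
RQ3: Do positional or global encodings improve MP-GNN performance on LRGB tasks?
RQ4: What are the limitations of, and future directions for, LRI-oriented graph benchmarks?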

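Main Findings

Under the same parameter budget, fully connected Graph Transformers generally outperform local MP-GNNs on the LRGB datasets.
Baseline local MP-GNNs, with their limited receptive fields, tend to underfit or perform poorly on large graphs that require long-range signals.
Positional encodings and global structural information can improve MP-GNN performance on LRGB tasks, but LRI-capable Transformer-based models benefit more noticeably.
The Peptides datasets consist of large graphs with large diameters, underscoring the need for models that can capture distant interactions.
Compared with classic benchmarks, PascalVOC-SP and COCO-SP have larger graph diameters and average shortest-path lengths, highlighting the relevance of LRI.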
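
Graph diameter and average shortest-path length, the statistics cited above, can be computed with breadth-first search on small graphs. This is a minimal sketch assuming a connected, undirected graph stored as an adjacency dict; the helper names are hypothetical and this is not the paper's own tooling.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop count from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def diameter_and_avg_path(adjacency):
    """Return (diameter, average shortest-path length) of a connected graph.

    The diameter is the longest shortest path between any two nodes; the
    average is taken over all ordered pairs of distinct nodes.
    """
    longest, total, pairs = 0, 0, 0
    for source in adjacency:
        dist = bfs_distances(adjacency, source)
        if len(dist) != len(adjacency):
            raise ValueError("graph must be connected")
        for target, hops in dist.items():
            if target != source:
                longest = max(longest, hops)
                total += hops
                pairs += 1
    average = total / pairs if pairs else 0.0
    return longest, average


def path_graph(n):
    """Undirected path on n nodes: a sparse, elongated toy graph."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def complete_graph(n):
    """Every node linked to every other node: the fully connected extreme."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


if __name__ == "__main__":
    print(diameter_and_avg_path(path_graph(6)))
    print(diameter_and_avg_path(complete_graph(6)))
```

The contrast between the path and the complete graph mirrors the argument in the findings: the larger these two statistics are, the more layers a 1-hop model needs before distant nodes can exchange information.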

This review was generated by AI and reviewed by a human editor.