QUICK REVIEW

[Paper Review] Structure-Aware Transformer for Graph Representation Learning

Dexiong Chen, Leslie O’Bray|arXiv (Cornell University)|Feb 7, 2022

Advanced Graph Neural Networks | Cited by 46

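One-Sentence Summary

The Structure-Aware Transformer (SAT) combines GNN-based subgraph extraction with Transformer self-attention, making attention aware of local subgraph structure, and achieves state-of-the-art results on several graph benchmark datasets.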

ABSTRACT

The Transformer architecture has gained growing attention in graph representation learning recently, as it naturally overcomes several limitations of graph neural networks (GNNs) by avoiding their strict structural inductive biases and instead only encoding the graph structure via positional encoding. Here, we show that the node representations generated by the Transformer with positional encoding do not necessarily capture structural similarity between them. To address this issue, we propose the Structure-Aware Transformer, a class of simple and flexible graph Transformers built upon a new self-attention mechanism. This new self-attention incorporates structural information into the original self-attention by extracting a subgraph representation rooted at each node before computing the attention. We propose several methods for automatically generating the subgraph representation and show theoretically that the resulting representations are at least as expressive as the subgraph representations. Empirically, our method achieves state-of-the-art performance on five graph prediction benchmarks. Our structure-aware framework can leverage any existing GNN to extract the subgraph representation, and we show that it systematically improves performance relative to the base GNN model, successfully combining the advantages of GNNs and Transformers. Our code is available at https://github.com/BorgwardtLab/SAT.

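Research Motivation and Goals

Address the limitations of standard GNNs in graph representation learning (limited expressiveness, over-smoothing, over-squashing).
Go beyond purely attribute-based attention by embedding explicit structural information into Transformer attention.
Reuse or plug in any existing GNN to extract subgraph representations and improve overall performance.
Provide theoretical guarantees on expressiveness and demonstrate empirical gains across diverse graph tasks.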

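Proposed Method

Reformulates Transformer self-attention as a kernel smoother and extends the exponential kernel so that it accounts for local graph structure through subgraphs.
Introduces SA-attn, a structure-aware attention that uses a subgraph representation S_G(v) for each node v and compares subgraphs with a kernel κ_graph, with both sums running over all nodes of the graph: $\text{SA-attn}(v) = \sum_{u \in V} \frac{\kappa_{\text{graph}}(S_G(v), S_G(u))}{\sum_{w \in V} \kappa_{\text{graph}}(S_G(v), S_G(w))} f(x_u)$.
Defines a structure extractor φ(u, G) that produces the subgraph representations, including a k-subtree GNN extractor and a k-subgraph GNN extractor, optionally concatenated with the original node features.
Allows the structure extractor to be any differentiable model (e.g., GNNs, differentiable graph kernels), and supports edge attributes when the chosen extractor can handle them.
Integrates the structure-aware attention into a Transformer block with skip connections, an FFN, layer normalization, and a degree-based factor on the skip connection that dampens the influence of highly connected nodes.
Combines SAT with absolute positional encodings (e.g., RWPE) to obtain complementary information and further improve performance.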
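
The attention described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation (their code is at the repository linked in the abstract). The function names `k_subtree_extractor`, `sa_attn`, and `sat_skip` are hypothetical. The extractor is a weight-free mean-aggregation stand-in for a trained k-layer GNN, the query/key projections and the value function f are taken to be the identity, and the 1/sqrt(degree) scaling is one assumed form of the degree-based skip factor.

```python
import math


def k_subtree_extractor(adj, x, k):
    """Toy stand-in for a k-layer GNN: k rounds of mean aggregation.

    adj is a list of neighbour lists, x a list of feature vectors.
    A real extractor would be a trained GNN; this one has no weights.
    """
    h = [list(row) for row in x]
    for _ in range(k):
        # Each node averages its own vector with those of its neighbours.
        h = [
            [sum(col) / (len(adj[v]) + 1)
             for col in zip(h[v], *(h[u] for u in adj[v]))]
            for v in range(len(adj))
        ]
    return h


def sa_attn(adj, x, k=2):
    """Kernel-smoother attention whose kernel compares subgraph representations."""
    s = k_subtree_extractor(adj, x, k)
    scale = math.sqrt(len(s[0]))
    out = []
    for v in range(len(x)):
        # Exponential kernel on the scaled dot product of structure vectors
        # (assumption: identity query/key projections).
        scores = [sum(a * b for a, b in zip(s[v], s[u])) / scale
                  for u in range(len(x))]
        top = max(scores)  # subtracting the max leaves the normalised kernel unchanged
        kernel = [math.exp(sc - top) for sc in scores]
        total = sum(kernel)
        # Weighted average of the raw node features (assumption: f is the identity).
        out.append([
            sum(kern / total * x[u][j] for u, kern in enumerate(kernel))
            for j in range(len(x[0]))
        ])
    return out


def sat_skip(adj, x, k=2):
    """Skip connection with an assumed 1/sqrt(degree) factor on the attention term."""
    attn = sa_attn(adj, x, k)
    return [
        [xi + ai / math.sqrt(max(len(adj[v]), 1)) for xi, ai in zip(x[v], attn[v])]
        for v in range(len(x))
    ]


# Usage on a 3-node path graph 0 - 1 - 2 with 2-dimensional features.
adj = [[1], [0, 2], [1]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
updated = sat_skip(adj, x)
```

Because the kernel is evaluated on the extractor outputs rather than on raw features, two nodes with matching local structure and features (nodes 0 and 2 here) receive the same attention output, while the values being averaged remain the original node features.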

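Experimental Results

Research Questions

RQ1: Can explicit structure-aware self-attention capture structural similarity between nodes beyond what absolute positional encoding provides?
RQ2: How do different subgraph extractors (k-subtree vs. k-subgraph, and the choice of base GNN) affect predictive performance?
RQ3: Does SAT come with theoretical guarantees on its expressiveness relative to the subgraph representations used by the structure extractor?
RQ4: How does SAT perform against existing state-of-the-art GNNs and graph Transformers on graph and node prediction benchmarks?
RQ5: Is SAT a practical enhancement that can improve any base GNN by providing structure-aware attention?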

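Key Findings

SAT achieves state-of-the-art performance on five graph prediction benchmarks, outperforming both GNNs and graph Transformers.
Replacing or augmenting standard self-attention with the structure-aware SA-attn yields representations that are at least as expressive as the underlying subgraph representations.
k-subtree SAT and k-subgraph SAT consistently improve on their base GNN across multiple datasets, with k-subgraph often providing higher expressiveness.
Injecting structural information through SAT yields significant gains over a vanilla Transformer that relies only on RWPE absolute encoding.
Theoretical results show that SA-attn preserves the expressiveness of the subgraph extractor, and a Lipschitz-based bound ties the similarity of node representations to the similarity of their subgraphs and features (Theorem 1).
On the OGB datasets (CODE2, PPA), SAT variants show strong gains and outperform several baselines.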
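
To make the k-subtree versus k-subgraph comparison in the findings more concrete, here is a rough sketch of what a k-subgraph extractor might look like. It rests on several assumptions that the summary does not confirm: the extractor is taken to collect the k-hop neighbourhood of each node, run message passing on the induced subgraph only, and mean-pool the result. The helper names (`k_hop_nodes`, `mean_aggregate`, `k_subgraph_extractor`) are hypothetical, and the weight-free mean aggregation again stands in for a trained GNN.

```python
from collections import deque


def k_hop_nodes(adj, root, k):
    """Return the sorted nodes within k hops of root (breadth-first search)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        if dist[v] == k:
            continue  # do not expand beyond k hops
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return sorted(dist)


def mean_aggregate(sub_adj, h, rounds):
    """Toy message passing on a subgraph given as {node: [neighbours]}."""
    for _ in range(rounds):
        h = {
            v: [sum(col) / (len(sub_adj[v]) + 1)
                for col in zip(h[v], *(h[u] for u in sub_adj[v]))]
            for v in sub_adj
        }
    return h


def k_subgraph_extractor(adj, x, k):
    """One pooled vector per node, computed from its induced k-hop subgraph."""
    reps = []
    for root in range(len(adj)):
        nodes = k_hop_nodes(adj, root, k)
        keep = set(nodes)
        # Induced subgraph: keep only edges whose endpoints are both inside.
        sub_adj = {v: [u for u in adj[v] if u in keep] for v in nodes}
        h = mean_aggregate(sub_adj, {v: list(x[v]) for v in nodes}, k)
        # Mean-pool over all nodes of the subgraph (assumed pooling choice).
        reps.append([sum(col) / len(nodes) for col in zip(*h.values())])
    return reps


# Usage on a 4-node path graph 0 - 1 - 2 - 3 with scalar features.
adj = [[1], [0, 2], [1, 3], [2]]
x = [[1.0], [0.0], [0.0], [0.0]]
structure_vectors = k_subgraph_extractor(adj, x, k=1)
```

The contrast with a k-subtree extractor is that the latter keeps only the root's own representation after k rounds of message passing on the whole graph, whereas this sketch pools over every node of the extracted subgraph, at the cost of one message-passing run per node.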

This review was generated by AI and reviewed by human editors.