QUICK REVIEW

[Paper Review] Structure-Aware Transformer for Graph Representation Learning

Dexiong Chen, Leslie O’Bray | arXiv (Cornell University) | 2022-02-07

Advanced Graph Neural Networks | Citations: 46

One-line summary

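The Structure-Aware Transformer (SAT) augments graph Transformers with a structure-aware self-attention that incorporates local subgraph information. By combining GNN-based subgraph extraction with Transformer attention, it achieves state-of-the-art results on several graph benchmarks.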

ABSTRACT

The Transformer architecture has gained growing attention in graph representation learning recently, as it naturally overcomes several limitations of graph neural networks (GNNs) by avoiding their strict structural inductive biases and instead only encoding the graph structure via positional encoding. Here, we show that the node representations generated by the Transformer with positional encoding do not necessarily capture structural similarity between them. To address this issue, we propose the Structure-Aware Transformer, a class of simple and flexible graph Transformers built upon a new self-attention mechanism. This new self-attention incorporates structural information into the original self-attention by extracting a subgraph representation rooted at each node before computing the attention. We propose several methods for automatically generating the subgraph representation and show theoretically that the resulting representations are at least as expressive as the subgraph representations. Empirically, our method achieves state-of-the-art performance on five graph prediction benchmarks. Our structure-aware framework can leverage any existing GNN to extract the subgraph representation, and we show that it systematically improves performance relative to the base GNN model, successfully combining the advantages of GNNs and Transformers. Our code is available at https://github.com/BorgwardtLab/SAT.

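Motivation and goals

Address the limitations of standard GNNs in graph representation learning: limited expressive power, over-smoothing, and over-squashing.
Go beyond attention that depends only on node attributes by incorporating explicit structural information into Transformer attention.
Reuse or plug in any existing GNN to extract subgraph representations and improve overall performance.
Provide theoretical guarantees on expressive power and demonstrate empirical gains across a range of graph tasks.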

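Proposed method

Recasts the Transformer's self-attention as a kernel smoother and extends its exponential kernel to account for local graph structure through subgraphs.
Introduces structure-aware attention (SA-attn), which uses a subgraph representation S_G(v) for each node and a kernel κ_graph that compares subgraphs: $\text{SA-attn}(v) = \sum_{u \in V} \frac{\kappa_{\text{graph}}(S_G(v), S_G(u))}{\sum_{w \in V} \kappa_{\text{graph}}(S_G(v), S_G(w))} f(x_u)$.
Defines a structure extractor φ(u, G) that produces the subgraph representation, with two variants, the k-subtree GNN extractor and the k-subgraph GNN extractor; its output can optionally be concatenated with the original node features.
Allows the structure extractor to be any differentiable model (a GNN, a graph kernel, and so on), and supports edge attributes whenever the chosen extractor can handle them.
Integrates structure-aware attention into a Transformer block with skip connections, an FFN, layer normalization, and a degree-based factor on the skip connection that dampens the influence of highly connected nodes.
Combines SAT with absolute encodings such as RWPE (random walk positional encoding), which supply complementary information and further improve performance.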
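
The attention step described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation (see the linked repository for that). The names `k_subtree_extractor`, `sa_attn_weights`, and `sat_update` are hypothetical. The extractor is an untrained stand-in that averages each node with its neighbours for k rounds instead of running a learned GNN. The kernel is an exponential kernel on a scaled dot product without the learned query and key projections, the value map f is taken to be the identity, and the degree factor is assumed to be 1/sqrt(d_v).

```python
import math


def k_subtree_extractor(x, adj, k):
    """Hypothetical stand-in for a k-layer GNN extractor phi(v, G).

    Runs k rounds of mean aggregation over each node and its neighbours,
    with no learned weights. x is a list of feature vectors and adj is an
    adjacency list.
    """
    h = [row[:] for row in x]
    for _ in range(k):
        new_h = []
        for v, nbrs in enumerate(adj):
            group = [h[v]] + [h[u] for u in nbrs]
            new_h.append([sum(col) / len(group) for col in zip(*group)])
        h = new_h
    return h


def sa_attn_weights(s):
    """Normalized kernel weights kappa(s_v, s_u) / sum_w kappa(s_v, s_w).

    Assumes an exponential kernel on a scaled dot product, which makes
    each row a softmax over all nodes of the graph.
    """
    d = len(s[0])
    weights = []
    for sv in s:
        scores = [sum(a * b for a, b in zip(sv, su)) / math.sqrt(d) for su in s]
        top = max(scores)  # subtract the max for numerical stability
        e = [math.exp(sc - top) for sc in scores]
        z = sum(e)
        weights.append([ei / z for ei in e])
    return weights


def sat_update(x, adj, k=2):
    """One simplified update: x_v + SA-attn(v) / sqrt(d_v).

    The value map f is the identity and the 1/sqrt(d_v) degree factor is
    an assumption made for this sketch. FFN and layer normalization are
    omitted.
    """
    s = k_subtree_extractor(x, adj, k)
    w = sa_attn_weights(s)
    out = []
    for v, xv in enumerate(x):
        attn = [sum(w[v][u] * x[u][j] for u in range(len(x)))
                for j in range(len(xv))]
        scale = 1.0 / math.sqrt(max(len(adj[v]), 1))
        out.append([a + scale * b for a, b in zip(xv, attn)])
    return out


# Path graph 0 - 1 - 2 - 3 with features that mirror under reversal.
adj = [[1], [0, 2], [1, 3], [2]]
x = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
updated = sat_update(x, adj, k=2)
```

On this path graph, nodes 0 and 3 have the same features and structurally identical neighbourhoods, so they receive matching subgraph representations and therefore matching updated representations. That is the behaviour the structure-aware kernel is meant to provide: attention weights depend on how similar two rooted subgraphs are, not only on raw node attributes.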

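Experimental results

Research questions

RQ1: Can explicit structure-aware self-attention capture structural similarity between nodes beyond what absolute positional encodings provide?
RQ2: How do different subgraph extractors (k-subtree versus k-subgraph, and the choice of base GNN) affect predictive performance?
RQ3: Does SAT provide theoretical guarantees on expressive power relative to the subgraph representation used by the structure extractor?
RQ4: How does SAT compare with state-of-the-art GNNs and graph Transformers on graph and node prediction benchmarks?
RQ5: Is SAT a practical enhancement that can improve any underlying GNN by supplying structure-aware attention?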

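Key results

SAT achieves state-of-the-art performance on five graph prediction benchmarks, outperforming both GNNs and graph Transformers.
Replacing or augmenting standard self-attention with the structure-aware SA-attn yields representations at least as expressive as the underlying subgraph representations.
Both k-subtree SAT and k-subgraph SAT consistently improve on the base GNN across many datasets, with k-subgraph typically the more expressive of the two.
Introducing structural information through SAT yields substantial gains over a vanilla Transformer that uses only RWPE absolute encodings.
Theoretically, SA-attn preserves expressive power relative to the subgraph extractor, and a Lipschitz-based upper bound relates the similarity of node representations to the similarity of their subgraphs and features (Theorem 1).
Experiments on OGB datasets (CODE2, PPA) show strong gains, with SAT variants outperforming multiple baselines.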


This review was generated by AI and reviewed by a human editor.