
[Paper Review] Measuring and Relieving the Over-smoothing Problem for Graph Neural Networks from the Topological View

Deli Chen, Yankai Lin | arXiv (Cornell University) | 2019. 09. 07.

Advanced Graph Neural Networks | References: 24 | Citations: 79

One-Line Summary

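This paper introduces two metrics, MAD and MADGap, to quantify smoothing and over-smoothing in GNNs; shows that a low information-to-noise ratio, which is partly determined by graph topology, drives over-smoothing; proposes MADReg and AdaEdge to mitigate it; and validates both methods across a range of datasets and models.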

ABSTRACT

Graph Neural Networks (GNNs) have achieved promising performance on a wide range of graph-based tasks. Despite their success, one severe limitation of GNNs is the over-smoothing issue (indistinguishable representations of nodes in different classes). In this work, we present a systematic and quantitative study on the over-smoothing issue of GNNs. First, we introduce two quantitative metrics, MAD and MADGap, to measure the smoothness and over-smoothness of the graph nodes representations, respectively. Then, we verify that smoothing is the nature of GNNs and the critical factor leading to over-smoothness is the low information-to-noise ratio of the message received by the nodes, which is partially determined by the graph topology. Finally, we propose two methods to alleviate the over-smoothing issue from the topological view: (1) MADReg which adds a MADGap-based regularizer to the training objective; (2) AdaGraph which optimizes the graph topology based on the model predictions. Extensive experiments on 7 widely-used graph datasets with 10 typical GNN models show that the two proposed methods are effective for relieving the over-smoothing issue, thus improving the performance of various GNN models.

Motivation and Objectives

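Quantify the smoothing and over-smoothing behavior of GNNs across a range of datasets and models.
Identify the role of the information-to-noise ratio in driving over-smoothing.
Show that graph topology affects the information-to-noise ratio and, in turn, model performance.
Propose topology-based methods to alleviate over-smoothing and verify their effectiveness.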

Proposed Method

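Define MAD (Mean Average Distance), which measures the smoothness of node representations using the cosine distance between final-layer embeddings.
Extend MAD to MADGap, which quantifies over-smoothing by comparing the MAD of remote node pairs with the MAD of neighboring node pairs.
Analyze the correlation between MADGap and model performance across datasets and models.
Propose MADReg, a MADGap-based regularization term that encourages messages with more information and less noise during training.
Propose AdaEdge, an adaptive topology-optimization method that rewires edges during training to favor intra-class over inter-class connections.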
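
The two metrics and the regularizer above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the hop thresholds `near_hops` and `far_hops`, the choice to average over every selected pair, the regularization `weight`, and all function names are hypothetical choices made for the example.

```python
import math
from collections import deque


def cosine_distance(u, v):
    """Return 1 - cosine similarity (0.0 if either vector is all zeros)."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (norm_u * norm_v)


def hop_counts(adj, source):
    """BFS hop counts from `source`; `adj` maps each node to its neighbors."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def mad(embeddings, selected):
    """Average, over nodes, of each node's mean cosine distance to the
    nodes j for which selected(i, j) is true. Nodes with no selected
    partner are skipped."""
    per_node = []
    for i, h_i in enumerate(embeddings):
        dists = [cosine_distance(h_i, h_j)
                 for j, h_j in enumerate(embeddings)
                 if j != i and selected(i, j)]
        if dists:
            per_node.append(sum(dists) / len(dists))
    return sum(per_node) / len(per_node) if per_node else 0.0


def mad_gap(embeddings, adj, near_hops=1, far_hops=2):
    """MAD over remote pairs minus MAD over neighboring pairs.
    The hop thresholds are illustrative assumptions."""
    hops = [hop_counts(adj, i) for i in range(len(embeddings))]

    def near(i, j):
        return j in hops[i] and hops[i][j] <= near_hops

    def far(i, j):
        return j in hops[i] and hops[i][j] >= far_hops

    return mad(embeddings, far) - mad(embeddings, near)


def madreg_objective(task_loss, gap, weight=0.01):
    """Hypothetical regularized loss: a larger MADGap lowers the objective."""
    return task_loss - weight * gap


# Toy path graph 0-1-2-3 with two well-separated groups of embeddings.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
separated = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
collapsed = [[1.0, 1.0]] * 4  # every node has the same representation

print(mad_gap(separated, adj))  # 0.75: remote pairs are far apart, neighbors are close
print(mad_gap(collapsed, adj))  # approximately 0: representations are indistinguishable
```

In the toy example, the separated embeddings give a clearly positive gap, while the collapsed embeddings give a gap near zero, which is the over-smoothed regime the metric is meant to flag. For real graphs the pairwise loop would normally be replaced by a vectorized computation in an array or tensor library.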

Experimental Results

Research Questions

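RQ1: What causes over-smoothing in GNNs, and how can it be quantified?
RQ2: How does graph topology affect the information-to-noise ratio and the resulting smoothing?
RQ3: Can topology-aware interventions (MADReg, AdaEdge) alleviate over-smoothing and improve performance across architectures?
RQ4: How well do MAD and MADGap correlate with model performance across datasets and layers?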

Key Findings

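MAD values decrease as GNN depth increases, suggesting that smoothing is an inherent property of GNNs.
MADGap correlates significantly with accuracy across models and datasets, supporting its validity as a measure of over-smoothness.
A higher information-to-noise ratio is associated with less over-smoothing and better predictions.
Removing inter-class edges and adding intra-class edges based on the labels increases MADGap and improves performance.
MADReg and AdaEdge effectively alleviate over-smoothing and improve performance across 7 datasets and 10 GNN models, particularly in deeper settings.
AdaEdge's gains are more consistent when over-smoothing is severe, and MADReg supports improvements as smoothing increases.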
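
The label-based edge adjustment in the findings above can be illustrated with a small sketch. It uses the share of intra-class edges as a rough, first-order stand-in for the information-to-noise ratio; that proxy, the function names, and the `candidate_pairs` parameter are assumptions made for this example and are not taken from the paper.

```python
def intra_class_ratio(edges, labels):
    """Fraction of edges whose two endpoints share a label
    (an edge-level proxy for the information-to-noise ratio)."""
    if not edges:
        return 0.0
    intra = sum(1 for u, v in edges if labels[u] == labels[v])
    return intra / len(edges)


def adjust_edges(edges, labels, candidate_pairs=()):
    """Drop inter-class edges, then add candidate pairs that are intra-class.
    Edges are treated as undirected and returned as sorted (u, v) tuples."""
    kept = {tuple(sorted(e)) for e in edges}
    kept = {(u, v) for u, v in kept if labels[u] == labels[v]}
    for u, v in candidate_pairs:
        if u != v and labels[u] == labels[v]:
            kept.add(tuple(sorted((u, v))))
    return sorted(kept)


# Five nodes in two classes; edge (1, 2) crosses the class boundary.
labels = [0, 0, 1, 1, 1]
edges = [(0, 1), (1, 2), (2, 3)]

before = intra_class_ratio(edges, labels)
new_edges = adjust_edges(edges, labels, candidate_pairs=[(4, 2), (0, 4)])
after = intra_class_ratio(new_edges, labels)

print(new_edges)      # [(0, 1), (2, 3), (2, 4)]
print(before, after)  # the intra-class share rises from 2/3 to 1.0
```

Passing model predictions in place of ground-truth `labels` moves this sketch toward the prediction-driven setting that AdaEdge targets, although the method's actual selection procedure is not reproduced here.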


This review was generated by AI and checked by a human editor.