[Paper Review] Tackling Over-Smoothing for General Graph Convolutional Networks
This paper analyzes over-smoothing in general GCNs, proves that deep networks converge to a cuboid, and presents DropEdge as a way to alleviate the problem, supported by both theory and experiments.
Increasing the depth of GCN, which is expected to permit more expressivity, is shown to incur performance detriment especially on node classification. The main cause of this lies in over-smoothing. The over-smoothing issue drives the output of GCN towards a space that contains limited distinguished information among nodes, leading to poor expressivity. Several works on refining the architecture of deep GCN have been proposed, but it is still unknown in theory whether or not these refinements are able to relieve over-smoothing. In this paper, we first theoretically analyze how general GCNs act with the increase in depth, including generic GCN, GCN with bias, ResGCN, and APPNP. We find that all these models are characterized by a universal process: all nodes converging to a cuboid. Upon this theorem, we propose DropEdge to alleviate over-smoothing by randomly removing a certain number of edges at each training epoch. Theoretically, DropEdge either reduces the convergence speed of over-smoothing or relieves the information loss caused by dimension collapse. Experimental evaluations on a simulated dataset have visualized the difference in over-smoothing between different GCNs. Moreover, extensive experiments on several real benchmarks support that DropEdge consistently improves the performance on a variety of both shallow and deep GCNs.
Research Motivation and Goals
- Explain why deeper GCNs suffer from over-smoothing across generic GCNs, GCN-b, ResGCN, and APPNP.
- Characterize the asymptotic behavior of deep GCNs under non-linearity.
- Propose DropEdge to mitigate over-smoothing and analyze its theoretical impact.
- Demonstrate empirical improvements of DropEdge on multiple node classification benchmarks.
Proposed Method
- Define the augmented normalized adjacency (the adjacency with self-loops added, then symmetrically normalized) and the subspace M spanned by the eigenvectors of its largest eigenvalue.
- Prove a general over-smoothing theorem showing convergence to a cuboid O(M, r) for several GCN variants.
- Introduce DropEdge: randomly drop edges with probability p at each training epoch, then re-normalize the resulting adjacency matrix.
- Provide theoretical bounds showing how DropEdge increases the spectral radius bounds and slows convergence to over-smoothing.
- Show that DropEdge acts as both a data augmenter and a message-passing reducer.
- Evaluate DropEdge on shallow and deep GCN backbones across multiple benchmarks.
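The DropEdge step listed above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' implementation: the helper names `drop_edge` and `normalized_adjacency`, the toy graph, and the choice of dropping each edge independently with probability p are assumptions made here for clarity (the abstract describes removing "a certain number of edges" per epoch, so a fixed-count sampler would be an equally valid reading).

```python
import math
import random


def drop_edge(edges, p, rng):
    """Keep each undirected edge independently with probability 1 - p.

    Assumption: Bernoulli sampling per edge, following the summary's wording.
    """
    return [e for e in edges if rng.random() >= p]


def normalized_adjacency(num_nodes, edges):
    """Return (D + I)^(-1/2) (A + I) (D + I)^(-1/2) as a dense list of lists."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        a[i][i] = 1.0  # self-loops (the "+ I" part)
    for u, v in edges:
        a[u][v] = 1.0
        a[v][u] = 1.0
    deg = [sum(row) for row in a]  # degrees including the self-loop
    return [
        [a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


# Toy graph: a 4-cycle with one chord (hypothetical example data).
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
rng = random.Random(0)

for epoch in range(3):
    kept = drop_edge(edges, p=0.4, rng=rng)  # resample the graph every epoch
    a_drop = normalized_adjacency(4, kept)   # re-normalize after dropping
    print(epoch, len(kept), "edges kept")
```

In a real training loop, `a_drop` would replace the full normalized adjacency in the forward pass for that epoch only, while validation and testing would use the original graph.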
Experimental Results
Research Questions
- RQ1: Why do general deep GCNs converge to a low-variance representation (over-smoothing) as depth increases?
- RQ2: How do variants like GCN-b, ResGCN, and APPNP differ in their convergence to subspaces or cuboids?
- RQ3: Can a simple edge-dropping strategy (DropEdge) theoretically and empirically alleviate over-smoothing across these models?
- RQ4: What is the impact of DropEdge on model expressivity and stability during training?
Key Results
- All four models converge to a cuboid O(M, r) under infinite depth; the radius r depends on the model, and the subspace M is the special case r = 0.
- GCN without bias converges to the zero-radius cuboid, i.e., the subspace M itself, confirming over-smoothing.
- GCN with bias and APPNP converge to a cuboid with non-zero radius, slowing information loss.
- DropEdge increases the spectral bound λ that governs the convergence rate, which reduces the speed of over-smoothing.
- DropEdge serves as both a regularizer against overfitting and a mechanism to preserve information by enlarging the effective representational space.
- Empirical results show DropEdge improves performance on both shallow and deep GCN variants across several node classification benchmarks.
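The zero-radius case above can be illustrated numerically. The sketch below is a deliberately simplified stand-in, not the paper's simulated experiment: it assumes a linear "GCN" with no weights, bias, or non-linearity (each layer is just multiplication by the augmented normalized adjacency), a small connected path graph, and a single feature channel. Under those assumptions M is the line spanned by the vector with entries proportional to sqrt(degree + 1), and the distance from the features to M shrinks with every layer.

```python
import math


def normalized_adjacency(num_nodes, edges):
    """Return ((D+I)^(-1/2) (A+I) (D+I)^(-1/2), degrees including self-loops)."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        a[i][i] = 1.0
    for u, v in edges:
        a[u][v] = 1.0
        a[v][u] = 1.0
    deg = [sum(row) for row in a]
    a_hat = [
        [a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]
    return a_hat, deg


def distance_to_subspace(h, e):
    """Euclidean distance from vector h to the line spanned by unit vector e."""
    proj = sum(x * y for x, y in zip(h, e))
    return math.sqrt(sum((x - proj * y) ** 2 for x, y in zip(h, e)))


# Connected path graph on 5 nodes (hypothetical example data).
n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
a_hat, deg = normalized_adjacency(n, edges)

# For a connected graph, the eigenvector of eigenvalue 1 is proportional
# to sqrt(deg), where deg already counts the self-loop.
norm = math.sqrt(sum(deg))
e = [math.sqrt(d) / norm for d in deg]

h = [1.0, -2.0, 0.5, 3.0, -1.0]  # arbitrary single-channel node features
distances = []
for layer in range(30):
    distances.append(distance_to_subspace(h, e))
    # One weight-free propagation step: h <- A_hat h
    h = [sum(a_hat[i][j] * h[j] for j in range(n)) for i in range(n)]

for layer in (0, 5, 10, 20, 29):
    print(layer, distances[layer])
```

The printed distances decay roughly geometrically, which is the over-smoothing behavior the theorem formalizes; adding a bias or an APPNP-style teleport term would, per the results above, leave a non-zero residual radius instead.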
This review was generated by AI and reviewed by a human editor.