QUICK REVIEW

[Paper Review] Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning

Qimai Li, Zhichao Han|arXiv (Cornell University)|Jan 22, 2018

Advanced Graph Neural Networks | 395 citations

TL;DR

The paper explains why GCNs work by showing that graph convolution is a form of Laplacian smoothing, identifies limits such as over-smoothing in deep models and reliance on labeled validation data, and proposes co-training with random walks and self-training to improve semi-supervised learning with very few labels.

ABSTRACT

Many interesting problems in machine learning are being revisited with new deep learning tools. For graph-based semi-supervised learning, a recent important development is graph convolutional networks (GCNs), which nicely integrate local vertex features and graph topology in the convolutional layers. Although the GCN model compares favorably with other state-of-the-art methods, its mechanisms are not clear and it still requires a considerable amount of labeled data for validation and model selection. In this paper, we develop deeper insights into the GCN model and address its fundamental limits. First, we show that the graph convolution of the GCN model is actually a special form of Laplacian smoothing, which is the key reason why GCNs work, but it also brings potential concerns of over-smoothing with many convolutional layers. Second, to overcome the limits of the GCN model with shallow architectures, we propose both co-training and self-training approaches to train GCNs. Our approaches significantly improve GCNs in learning with very few labels, and exempt them from requiring additional labels for validation. Extensive experiments on benchmarks have verified our theory and proposals.

Motivation & Objective

Clarify how GCNs operate in semi-supervised learning and why they succeed or fail.
Analyze the limits of shallow and deep GCN architectures due to Laplacian smoothing.
Propose training-time strategies (co-training with random walks and self-training) to improve performance with few labels.
Demonstrate empirical gains on standard graph-based benchmarks without extra validation data.

Proposed method

Show that GCN convolution is a special form of symmetric Laplacian smoothing.
Analyze the GCN propagation rule H^{(l+1)} = sigma( D~^{-1/2} A~ D~^{-1/2} H^{(l)} Theta^{(l)} ), where A~ = A + I is the adjacency matrix with self-loops and D~ is its degree matrix.
Explain why Laplacian smoothing encourages same-cluster feature similarity and can cause over-smoothing across layers.
Propose co-training with random walks (ParWalks) to inject global graph structure into training.
Propose self-training to expand the labeled set using GCN predictions.
Combine Co-Training and Self-Training (Union and Intersection) to robustly expand labels without extra validation data.
Provide a heuristic lower bound on the number of labels eta needed by a GCN with tau layers on a graph with n vertices and average degree d_hat, obtained by solving (d_hat)^{tau} * eta ~ n.
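
The propagation rule, the smoothing effect, and the label heuristic listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the feature values, and every function name (normalized_adjacency, gcn_layer, smooth, degree_scaled_spread, min_labels) are assumptions made for illustration, ReLU stands in for sigma, and plain Python lists are used to avoid dependencies.

```python
import math

def normalized_adjacency(adj):
    """Return D~^{-1/2} (A + I) D~^{-1/2} as a dense list-of-lists matrix."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]  # degrees including the self-loop
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(a_hat, h, theta):
    """One propagation step H' = ReLU(A_hat H Theta)."""
    return [[max(v, 0.0) for v in row]
            for row in matmul(matmul(a_hat, h), theta)]

def degree_scaled_spread(h, adj):
    """Largest per-feature range of D~^{-1/2} H across all vertices."""
    deg = [sum(row) + 1.0 for row in adj]
    scaled = [[v / math.sqrt(d) for v in row] for row, d in zip(h, deg)]
    return max(max(col) - min(col) for col in zip(*scaled))

def min_labels(avg_degree, num_layers, num_vertices):
    """Solve (d_hat)^tau * eta ~ n for eta."""
    return num_vertices / avg_degree ** num_layers

# Toy graph (assumed): two triangles joined by the single edge 2-3.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
ADJ = [[0.0] * 6 for _ in range(6)]
for i, j in EDGES:
    ADJ[i][j] = ADJ[j][i] = 1.0
A_HAT = normalized_adjacency(ADJ)

# Two made-up features per vertex; the first separates the two triangles.
X = [[1.0, 0.2], [1.0, 0.4], [1.0, 0.1],
     [0.0, 0.9], [0.0, 0.7], [0.0, 0.8]]

def smooth(h, steps):
    """Apply A_hat repeatedly, with no weights and no nonlinearity."""
    for _ in range(steps):
        h = matmul(A_HAT, h)
    return h

if __name__ == "__main__":
    print("spread before smoothing:", degree_scaled_spread(X, ADJ))
    print("spread after 100 steps: ", degree_scaled_spread(smooth(X, 100), ADJ))
    print("heuristic label count:  ", min_labels(4, 2, 1000))
```

On this connected toy graph, repeated smoothing drives the degree-scaled features of all vertices toward the same values, which is the over-smoothing concern in miniature. The arguments passed to min_labels are hypothetical and are not statistics of any benchmark dataset.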

Experimental results

Research questions

RQ1: Why do Graph Convolutional Networks perform well in semi-supervised learning?
RQ2: What are the fundamental limits of GCNs in shallow vs. deep architectures (e.g., over-smoothing, propagation of labels)?
RQ3: Can we design training strategies that mitigate these limits and reduce or remove the need for labeled validation data?
RQ4: Do co-training with random walks and self-training improve GCN performance with very few labels, and how do Union/Intersection strategies compare?
RQ5: How do these methods perform on standard graph benchmarks (Cora, CiteSeer, PubMed) with small labeling rates?

Key findings

GCN convolution acts as a form of Laplacian smoothing, explaining why GCNs mix neighbor information to ease classification.
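Over-smoothing occurs with too many layers: repeated smoothing makes the features of vertices within each connected component converge to nearly the same values, so vertices from different classes become indistinguishable; very deep GCNs are also hard to train.
A two-layer GCN often yields the best practical performance among the depths evaluated.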
Co-training with a random-walk model (ParWalks) expands the labeled set using global graph structure, enhancing GCN training without extra validation data.
Self-training augments labeled data by adding high-confidence GCN predictions, improving robustness when graph structure is limited.
Union and Intersection strategies for label expansion generally improve performance: Union adds the confident predictions of either method and gives the broadest gains, while Intersection keeps only the predictions on which both methods agree, filtering out less reliable labels.
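
The label-expansion step behind these findings can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm verbatim: the confidence scores are made-up numbers standing in for the outputs of a random-walk model and a trained GCN, the names top_confident, union, and intersection are hypothetical, and dropping vertices on which the two methods disagree in the union is a choice made here rather than something the review states.

```python
def top_confident(scores, num_classes, t, labeled):
    """For each class, pick the t unlabeled vertices with the highest score.

    scores[v][k] is a model's confidence that vertex v belongs to class k.
    Returns a dict mapping each picked vertex to its pseudo-label.
    """
    picked = {}
    for k in range(num_classes):
        candidates = [v for v in range(len(scores)) if v not in labeled]
        candidates.sort(key=lambda v: scores[v][k], reverse=True)
        for v in candidates[:t]:
            # If a vertex is picked for two classes, keep the more confident one.
            if v not in picked or scores[v][k] > scores[v][picked[v]]:
                picked[v] = k
    return picked

def union(a, b):
    """Pseudo-labels found by either method (assumption: conflicts are dropped)."""
    merged = dict(a)
    for v, k in b.items():
        if v in merged and merged[v] != k:
            del merged[v]
        else:
            merged[v] = k
    return merged

def intersection(a, b):
    """Pseudo-labels on which both methods agree."""
    return {v: k for v, k in a.items() if b.get(v) == k}

# Hypothetical setup: 6 vertices, 2 classes, vertices 0 and 5 already labeled.
LABELED = {0: 0, 5: 1}
WALK_SCORES = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4],
               [0.3, 0.7], [0.2, 0.8], [0.1, 0.9]]
GCN_SCORES = [[0.95, 0.05], [0.85, 0.15], [0.6, 0.4],
              [0.1, 0.9], [0.3, 0.7], [0.05, 0.95]]

WALK_PICKS = top_confident(WALK_SCORES, 2, 1, LABELED)  # co-training side
GCN_PICKS = top_confident(GCN_SCORES, 2, 1, LABELED)    # self-training side

if __name__ == "__main__":
    print("union:       ", union(WALK_PICKS, GCN_PICKS))
    print("intersection:", intersection(WALK_PICKS, GCN_PICKS))
```

In this toy run the two score tables agree on one vertex and differ on another, so the union yields a larger expanded label set than the intersection. The expanded set would then be merged with the original labels before the GCN is retrained.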

This review was created by AI and reviewed by human editors.