QUICK REVIEW

[Paper Digest] AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models

Ke Sun, Zhanxing Zhu | arXiv (Cornell University) | Aug 14, 2019
Advanced Graph Neural Networks | References: 40 | Cited by: 32
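One-Sentence Summary

AdaGCN introduces an AdaBoost-driven, RNN-like deep graph neural network that aggregates information from multi-hop neighborhoods through adaptive, layer-wise weighting, achieving state-of-the-art prediction performance while reducing sparse tensor computation.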

ABSTRACT

The design of deep graph models still remains to be investigated and the crucial part is how to explore and exploit the knowledge from different hops of neighbors in an efficient way. In this paper, we propose a novel RNN-like deep graph neural network architecture by incorporating AdaBoost into the computation of network; and the proposed graph convolutional network called AdaGCN (Adaboosting Graph Convolutional Network) has the ability to efficiently extract knowledge from high-order neighbors of current nodes and then integrates knowledge from different hops of neighbors into the network in an Adaboost way. Different from other graph neural networks that directly stack many graph convolution layers, AdaGCN shares the same base neural network architecture among all "layers" and is recursively optimized, which is similar to an RNN. Besides, we also theoretically established the connection between AdaGCN and existing graph convolutional methods, presenting the benefits of our proposal. Finally, extensive experiments demonstrate the consistent state-of-the-art prediction performance on graphs across different label rates and the computational advantage of our approach AdaGCN. Code is available at https://github.com/datake/AdaGCN.

Motivation and Objectives

  • Motivate deep graph models to effectively exploit high-order neighbor information beyond shallow GCNs.
  • Propose AdaGCN, a recurrent-like architecture that applies AdaBoost across layers to integrate multi-hop knowledge.
  • Show theoretical connections between AdaGCN and existing propagation methods (PPNP/APPNP) and justify adaptive layering.
  • Demonstrate state-of-the-art predictive performance across datasets and label regimes with computational advantages.

Proposed Method

  • Replace stacked nonlinear layers with a sequence of base classifiers f_theta^(l) that operate on A^l X to capture l-hop information.
  • Use an AdaBoost (SAMME.R) framework to weight and combine the base classifiers adaptively, updating node weights based on misclassifications.
  • Each base classifier uses a non-linear f_theta (e.g., a two-layer MLP) on the precomputed A^l X, enabling efficient computation by separating sparse propagation from dense decoding.
  • AdaGCN computes A^l X progressively as A^l X = A · (A^{l-1} X), and aggregates predictions with C(A,X) = argmax_k sum_l alpha^(l) f_theta^(l)(A^l X).
  • Draw a connection to APPNP/PPNP: AdaGCN generalizes their exponential-moving-average (EMA) style propagation by using adaptive, layer-specific parameters rather than fixed exponential weights and shared parameters.
  • Compare with MixHop: AdaGCN provides adaptive, non-linear, layer-wise mixing with a boosting-based combination, and inherits theoretical guarantees from boosting theory.
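
The steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`normalize_adj`, `fit_base`, `adagcn_sketch`) are hypothetical, a sample-weighted softmax regression stands in for the two-layer MLP base classifier, the adjacency matrix is dense rather than sparse, and symmetric normalization with self-loops is assumed for A. The boosting step follows the standard SAMME.R recipe, in which each hop's contribution is a centered log-probability score, so the per-hop weight alpha^(l) is folded into that score here.

```python
import numpy as np

def normalize_adj(A):
    """Assumed normalization: D^{-1/2} (A + I) D^{-1/2}, as in GCN."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_base(H, y, w, n_classes, steps=200, lr=0.5):
    """Stand-in base classifier: softmax regression trained by gradient
    descent on the sample-weighted cross-entropy (not the paper's MLP)."""
    W = np.zeros((H.shape[1], n_classes))
    Y = np.eye(n_classes)[y]
    for _ in range(steps):
        P = softmax(H @ W)
        W -= lr * H.T @ (w[:, None] * (P - Y))
    return W

def adagcn_sketch(A, X, y, train_idx, n_classes, n_hops=3):
    A_norm = normalize_adj(A)
    K = n_classes
    w = np.full(len(train_idx), 1.0 / len(train_idx))  # node weights
    H = X.astype(float)                                # A^0 X
    scores = np.zeros((X.shape[0], K))
    rows = np.arange(len(train_idx))
    for l in range(n_hops + 1):
        if l > 0:
            H = A_norm @ H                             # A^l X = A (A^{l-1} X)
        W = fit_base(H[train_idx], y[train_idx], w, K)
        logp = np.log(np.clip(softmax(H @ W), 1e-12, None))
        # SAMME.R score of hop l: centered log-probabilities.
        scores += (K - 1) * (logp - logp.mean(axis=1, keepdims=True))
        # SAMME.R node-weight update: confident correct nodes are
        # down-weighted, misclassified nodes are up-weighted.
        lp = logp[train_idx]
        true_lp = lp[rows, y[train_idx]]
        inner = true_lp - (lp.sum(axis=1) - true_lp) / (K - 1)
        w = w * np.exp(-(K - 1) / K * inner)
        w /= w.sum()
    return scores.argmax(axis=1)                       # argmax_k of summed scores

# Toy graph: two triangles joined by the edge 2-3, one labeled node each.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
X = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
pred = adagcn_sketch(A, X, y, train_idx=np.array([0, 5]), n_classes=2)
```

Because the propagated features A^l X do not depend on any trainable parameter, each hop costs one matrix product that is computed once, and every training step afterwards touches only the dense classifier. With a sparse A, this is where the reduction in sparse tensor computation would come from.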

Experimental Results

Research Questions

  • RQ1: Can AdaBoost-style iteration over graph hops improve information fusion from multi-order neighbors beyond traditional deep GCNs?
  • RQ2: What are the theoretical relationships between AdaGCN, APPNP/PPNP, and MixHop in terms of propagation and expressive power?
  • RQ3: Does AdaGCN maintain computational efficiency by reducing sparse tensor computations while achieving superior accuracy across varying label rates?
  • RQ4: How does adapting layer-wise weights via SAMME.R impact generalization and robustness across datasets?

Key Findings

  • AdaGCN achieves state-of-the-art accuracy on multiple datasets (CiteSeer, Cora, PubMed, MS Academic) compared to strong baselines.
  • AdaGCN maintains its advantage in low-label regimes, with its improvement over APPNP growing as the label rate decreases.
  • AdaGCN exhibits competitive or superior performance while significantly reducing sparse tensor computations, yielding faster per-epoch training on larger datasets (e.g., Reddit).
  • The method can be interpreted as an adaptive form of APPNP, where layer-wise classifiers with different parameters are weighted by AdaBoost, rather than a fixed EMA scheme.
  • AdaGCN can represent general layer-wise neighborhood mixing, aligning with MixHop’s spirit but with boosting-based combination and non-linear per-layer transformations.
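
The contrast with APPNP in the findings above can be made concrete by unrolling APPNP's propagation rule. The following is a sketch in notation chosen for this digest, not copied from the paper: \beta denotes APPNP's teleport probability (written this way to avoid a clash with the AdaBoost weights \alpha^{(l)}), \hat{A} is the normalized adjacency matrix, H = f_\theta(X) is the output of APPNP's single shared network, and the final softmax is omitted.

```latex
% APPNP power iteration with a shared predictor H = f_\theta(X)
Z^{(0)} = H, \qquad
Z^{(k+1)} = (1-\beta)\,\hat{A}\,Z^{(k)} + \beta\,H

% Unrolling K steps gives fixed, exponentially decaying hop weights
Z^{(K)} = (1-\beta)^{K}\,\hat{A}^{K} H
        + \beta \sum_{l=0}^{K-1} (1-\beta)^{l}\,\hat{A}^{l} H

% AdaGCN, as summarized above: hop-specific classifiers and learned weights
C(A, X) = \arg\max_{k} \sum_{l} \alpha^{(l)}\, f^{(l)}_{\theta}\!\left(\hat{A}^{l} X\right)
```

Read side by side, APPNP weights hop l by the fixed factor \beta(1-\beta)^{l} and applies one shared f_\theta before propagation, whereas AdaGCN applies a separate classifier f^{(l)}_\theta after propagation and lets boosting set the weight of each hop.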


This digest was generated by AI and reviewed by human editors.