QUICK REVIEW

[Paper Review] Adaptive Sampling Towards Fast Graph Representation Learning

Wenbing Huang, Tong Zhang|arXiv (Cornell University)|Sep 14, 2018

Advanced Graph Neural Networks | Cited by 228

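One-Sentence Summary

The paper proposes layer-wise adaptive sampling, variance-reduced training, and skip connections to accelerate GCN training on large-scale graphs, achieving faster convergence and higher accuracy than the baselines.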

ABSTRACT

Graph Convolutional Networks (GCNs) have become a crucial tool on learning representations of graph vertices. The main challenge of adapting GCNs on large-scale graphs is the scalability issue that it incurs heavy cost both in computation and memory due to the uncontrollable neighborhood expansion across layers. In this paper, we accelerate the training of GCNs through developing an adaptive layer-wise sampling method. By constructing the network layer by layer in a top-down passway, we sample the lower layer conditioned on the top one, where the sampled neighborhoods are shared by different parent nodes and the over expansion is avoided owing to the fixed-size sampling. More importantly, the proposed sampler is adaptive and applicable for explicit variance reduction, which in turn enhances the training of our method. Furthermore, we propose a novel and economical approach to promote the message passing over distant nodes by applying skip connections. Intensive experiments on several benchmarks verify the effectiveness of our method regarding the classification accuracy while enjoying faster convergence speed.

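Motivation and Objectives

Motivate and address the scalability challenge of Graph Convolutional Networks (GCNs) on large-scale graphs, which stems from the computation and memory cost of neighborhood expansion across layers.
Propose a layer-wise sampling framework in which sampled neighborhoods are shared among parent nodes, fixing the size of each layer and keeping expansion under control.
Introduce an adaptive sampler that minimizes variance in a tractable way, and integrate the variance-reduction objective into training.
Strengthen long-range message passing with skip connections, preserving second-order proximity without substantial extra computation.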

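Proposed Method

Reformulate the GCN update as an expectation and replace full neighborhood expansion with a Monte Carlo estimate.
Develop layer-wise sampling, in which sampling is performed once per layer and the nodes of the upper layer share the sampled neighborhood.
Design an adaptive sampler q(u_j), derived from a self-dependent function g(x(u_j)) of the node features, to approximate the variance-minimizing distribution (Eq. 9), and fold variance reduction into a hybrid loss.
Introduce a skip-connection scheme that reuses the nodes of layer (l-1) to reach two-hop neighborhoods without computing A^2 directly (Eq. 12–13).
Relate the sampler to existing methods (GraphSAGE, FastGCN) and discuss an attention-inspired (GAT-like) variant adapted to the layer-wise framework.
Adopt an inductive learning setting and evaluate empirically on standard graph datasets (Cora, Citeseer, Pubmed, Reddit).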
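
The layer-wise sampling idea can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function name `layerwise_estimate`, its arguments, and the use of scalar node features are assumptions made for brevity. It assumes `a_hat` is a dense, non-negative normalized adjacency matrix whose parent rows are non-zero, and that `g`, if given, is non-zero wherever a parent has an edge. All parents share one sample of lower-layer nodes, and importance weights keep the estimate of each parent's aggregation unbiased.

```python
import random

def layerwise_estimate(a_hat, h_lower, parents, n_samples, rng, g=None):
    """Estimate sum_j a_hat[v][j] * h_lower[j] for every parent v from one
    shared sample of lower-layer nodes (scalar features for brevity)."""
    n_nodes = len(a_hat)
    # p(u_j | v_i): each parent's adjacency row normalised to sum to 1.
    row_sums = {v: sum(a_hat[v]) for v in parents}
    # Sampler score: conditional probability summed over all parents,
    # optionally scaled by |g(x(u_j))| as in the adaptive sampler.
    score = [sum(a_hat[v][j] / row_sums[v] for v in parents)
             for j in range(n_nodes)]
    if g is not None:
        score = [s * abs(gj) for s, gj in zip(score, g)]
    total = sum(score)
    q = [s / total for s in score]
    # One draw per layer (with replacement), shared by all parents.
    sampled = rng.choices(range(n_nodes), weights=q, k=n_samples)
    # Importance-weighted Monte Carlo average for each parent.
    estimates = {
        v: sum(a_hat[v][j] / q[j] * h_lower[j] for j in sampled) / n_samples
        for v in parents
    }
    return estimates, sampled

# Toy 3-node graph; every row already sums to 1.
a_hat = [[0.5, 0.3, 0.2],
         [0.3, 0.4, 0.3],
         [0.2, 0.3, 0.5]]
h = [1.0, 2.0, 4.0]
estimates, sampled = layerwise_estimate(a_hat, h, [0, 1], 4, random.Random(0))
```

With a single parent and `g` set to the (positive) features themselves, every sampled term equals the exact aggregation, so the estimate has zero variance. When several parents share one sample, a single distribution generally cannot achieve that for all of them at once, which is why the sampler only approximates the variance-minimizing distribution.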

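Experimental Results

Research Questions

RQ1: Can layer-wise sampling with shared neighborhoods speed up GCN training while maintaining or improving accuracy on standard graph benchmarks?
RQ2: Does the adaptive, variance-reduced sampler outperform node-wise or i.i.d. layer-wise sampling in stability and convergence?
RQ3: Does adding skip connections to preserve second-order proximity improve convergence and predictive performance?
RQ4: How does the proposed method compare with existing sampling-based or attention-based graph models on the benchmark datasets?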

Key Findings

Method	Cora	Citeseer	Pubmed	Reddit
Full	0.8664±0.0011	0.7934±0.0026	0.9022±0.0008	0.9568±0.0069
IID	0.8506±0.0048	0.7387±0.0078	0.8200±0.0114	0.8611±0.0437
Node-Wise	0.8202±0.0133	0.7734±0.0081	0.9002±0.0017	0.9449±0.0026
Adapt (no vr)	0.8588±0.0062	0.7942±0.0022	0.9060±0.0024	0.9501±0.0047
Adapt	0.8744±0.0034	0.7966±0.0018	0.9060±0.0016	0.9627±0.0032
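
As a quick arithmetic check on the table, the sketch below computes how far the mean accuracy of Adapt sits above any other row. It uses only the means listed in the table and ignores the ± spreads; `DATASETS`, `MEAN_ACC`, and `gain_over` are illustrative names, not from the paper.

```python
DATASETS = ("Cora", "Citeseer", "Pubmed", "Reddit")

# Mean accuracies copied from the table above (spreads omitted).
MEAN_ACC = {
    "Full":          (0.8664, 0.7934, 0.9022, 0.9568),
    "IID":           (0.8506, 0.7387, 0.8200, 0.8611),
    "Node-Wise":     (0.8202, 0.7734, 0.9002, 0.9449),
    "Adapt (no vr)": (0.8588, 0.7942, 0.9060, 0.9501),
    "Adapt":         (0.8744, 0.7966, 0.9060, 0.9627),
}

def gain_over(baseline, method="Adapt"):
    """Per-dataset difference in mean accuracy between method and baseline."""
    return {
        name: round(m - b, 4)
        for name, m, b in zip(DATASETS, MEAN_ACC[method], MEAN_ACC[baseline])
    }

variance_term_gain = gain_over("Adapt (no vr)")
```

Here `gain_over("Adapt (no vr)")` isolates the contribution of the variance term: the gap is largest on Cora and Reddit, small on Citeseer, and zero on Pubmed at the precision reported.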

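Adapt achieves higher test accuracy than strong baselines on Cora, Citeseer, Pubmed, and Reddit (Cora 0.8744, Citeseer 0.7966, Pubmed 0.9060, Reddit 0.9627).
In the reported results, Adapt outperforms baselines including Full GCN, IID, GraphSAGE, and FastGCN, with faster convergence (per-epoch training time) and better stability.
On Cora and Reddit, the adaptive sampler with variance reduction (λ>0) beats the variant with the variance term removed (λ=0); on Citeseer the variance term has a smaller effect.
Skip connections noticeably speed up convergence (on Cora, for example, from about 150 epochs to about 100), with a moderate effect on final accuracy.
An explicit two-hop sampling variant (using A^2) can raise accuracy further, although skip connections are the more computationally friendly alternative on large-scale graphs.
Compared with IID and node-wise sampling, layer-wise sampling with conditional dependence captures cross-layer correlations, yielding faster and more stable training.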
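
The trade-off between skip connections and explicit A^2 sampling can be sketched as follows. This is a simplified illustration, not the paper's code: `skip_adjacency` and `skip_message` are hypothetical names, features are scalars, and the skip weight matrix and any rescaling are left out. The assumption is that two-hop weights are approximated by summing products of one-hop weights over the nodes already sampled for the middle layer, so no full A^2 is ever formed.

```python
def skip_adjacency(a_hat, top, mid, bottom):
    """Approximate two-hop weights between `top` and `bottom` nodes by
    summing over the already-sampled `mid` nodes only."""
    return [
        [sum(a_hat[v][u] * a_hat[u][s] for u in mid) for s in bottom]
        for v in top
    ]

def skip_message(a_skip, h_bottom):
    """Aggregate bottom-layer features along the skip edges."""
    return [sum(w * h for w, h in zip(row, h_bottom)) for row in a_skip]

# Toy 3-node normalized adjacency matrix.
a_hat = [[0.5, 0.3, 0.2],
         [0.3, 0.4, 0.3],
         [0.2, 0.3, 0.5]]

# Using every node as the middle layer recovers the exact row of A^2.
full_row = skip_adjacency(a_hat, top=[0], mid=[0, 1, 2], bottom=[0, 1, 2])
# A sampled middle layer keeps only the paths through those nodes.
sampled_row = skip_adjacency(a_hat, top=[0], mid=[1], bottom=[0, 1, 2])
message = skip_message(sampled_row, [1.0, 2.0, 4.0])
```

The cost of the sampled version scales with the sizes of the three sampled layers rather than with the full graph, which is consistent with the finding that skip connections are the cheaper option on large graphs, while the exact A^2 variant can be more accurate.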


This review was generated by AI and reviewed by human editors.