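[Paper Explained] Measuring and Relieving the Over-smoothing Problem for Graph Neural Networks from the Topological View
The paper introduces two metrics, MAD and MADGap, to quantify smoothness and over-smoothness in GNNs. It shows that a low, topology-driven information-to-noise ratio causes over-smoothing, and it proposes MADReg and AdaEdge to alleviate the problem, validating both across multiple datasets and models.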
Graph Neural Networks (GNNs) have achieved promising performance on a wide range of graph-based tasks. Despite their success, one severe limitation of GNNs is the over-smoothing issue (indistinguishable representations of nodes in different classes). In this work, we present a systematic and quantitative study on the over-smoothing issue of GNNs. First, we introduce two quantitative metrics, MAD and MADGap, to measure the smoothness and over-smoothness of the graph nodes representations, respectively. Then, we verify that smoothing is the nature of GNNs and the critical factor leading to over-smoothness is the low information-to-noise ratio of the message received by the nodes, which is partially determined by the graph topology. Finally, we propose two methods to alleviate the over-smoothing issue from the topological view: (1) MADReg which adds a MADGap-based regularizer to the training objective; (2) AdaGraph which optimizes the graph topology based on the model predictions. Extensive experiments on 7 widely-used graph datasets with 10 typical GNN models show that the two proposed methods are effective for relieving the over-smoothing issue, thus improving the performance of various GNN models.
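Motivation and Objectives
- Quantify the smoothing and over-smoothing behavior of GNNs across datasets and models.
- Determine the role of the information-to-noise ratio in driving over-smoothing.
- Show that graph topology affects the information-to-noise ratio and, through it, model performance.
- Propose topology-based methods for alleviating over-smoothing and verify their effectiveness.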
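Proposed Methods
- Define MAD (Mean Average Distance), which measures the smoothness of node representations using the cosine distance between final-layer embeddings.
- Extend MAD to MADGap, which quantifies over-smoothing as the difference between the MAD of remote node pairs and the MAD of neighboring node pairs.
- Analyze the correlation between MADGap and model performance across datasets and models.
- Propose MADReg, a MADGap-based regularization term added to the training objective, which encourages message passing that is rich in information and low in noise.
- Propose AdaEdge (called AdaGraph in the abstract), an adaptive topology optimization method that rewires the graph during training to favor intra-class edges over inter-class edges.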
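The MAD and MADGap definitions above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation. It assumes that MAD averages the non-zero masked cosine distances per node and then averages over nodes with at least one such entry, and that the MADReg objective is the task loss minus a weighted MADGap. The function names, the `tol` threshold, the weight `lam`, and the toy masks are all hypothetical.

```python
import numpy as np

def cosine_distance_matrix(H, eps=1e-12):
    """Pairwise cosine distance 1 - cos(h_i, h_j) for the rows of H."""
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    Hn = H / np.maximum(norms, eps)  # guard against zero-norm rows
    return 1.0 - Hn @ Hn.T

def mad(H, mask, tol=1e-9):
    """Mean average distance over the node pairs selected by a 0/1 mask.

    Assumption: only non-zero masked distances are averaged, first per row,
    then over the rows that have at least one non-zero entry.
    """
    D = cosine_distance_matrix(H) * mask
    nonzero = np.abs(D) > tol
    counts = nonzero.sum(axis=1)
    row_sums = np.where(nonzero, D, 0.0).sum(axis=1)
    valid = counts > 0
    if not valid.any():
        return 0.0
    return float((row_sums[valid] / counts[valid]).mean())

def madgap(H, neighbor_mask, remote_mask):
    """MADGap = MAD over remote pairs minus MAD over neighboring pairs."""
    return mad(H, remote_mask) - mad(H, neighbor_mask)

def madreg_objective(task_loss, gap, lam=0.01):
    """Hypothetical scalar form of a MADGap-regularized loss: a larger gap lowers it."""
    return task_loss - lam * gap

# Toy embeddings: nodes 0,1 point one way, nodes 2,3 another.
H = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
neighbor_mask = np.zeros((4, 4))
remote_mask = np.zeros((4, 4))
for i, j in [(0, 1), (2, 3)]:                   # pairs treated as neighbors
    neighbor_mask[i, j] = neighbor_mask[j, i] = 1.0
for i, j in [(0, 2), (0, 3), (1, 2), (1, 3)]:   # pairs treated as remote
    remote_mask[i, j] = remote_mask[j, i] = 1.0

gap = madgap(H, neighbor_mask, remote_mask)
print(f"MADGap on the toy embeddings: {gap:.3f}")
```

In practice the masks would be derived from hop distances in the graph, and the regularizer would have to be written in a differentiable framework so that gradients flow through the embeddings; NumPy is used here only to make the arithmetic explicit.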
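Experimental Results
Research Questions
- RQ1: What causes over-smoothing in GNNs, and how can it be quantified?
- RQ2: How does graph topology affect the information-to-noise ratio and the resulting smoothing?
- RQ3: Can topology-aware interventions (MADReg, AdaEdge) alleviate over-smoothing and improve performance across architectures?
- RQ4: How strongly do MAD and MADGap correlate with model performance across datasets and numbers of layers?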
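Key Findings
- MAD values decrease as GNN depth increases, indicating that smoothing is an inherent property of GNNs.
- MADGap correlates significantly with accuracy across models and datasets, which supports its validity as a measure of over-smoothing.
- A higher information-to-noise ratio corresponds to less over-smoothing and better predictions.
- Removing inter-class edges and adding intra-class edges according to the labels increases MADGap and improves performance.
- MADReg and AdaEdge effectively alleviate over-smoothing and improve performance on 7 datasets with 10 GNN models, especially in deep settings.
- AdaEdge improves performance more consistently, particularly when over-smoothing is severe, while MADReg also yields gains as smoothing increases.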
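The edge-editing finding above can be illustrated with a small sketch. It is a simplification under stated assumptions: the information-to-noise ratio is approximated at first order by the fraction of edges joining same-class nodes, and edges are edited from a given label list. The summary says AdaEdge works from model predictions during training, so how confident predictions are selected and how many edges are changed are not modeled here. The function names and the `candidates` argument are hypothetical.

```python
def intra_class_ratio(edges, labels):
    """Fraction of edges whose endpoints share a class (first-order proxy)."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)

def adjust_edges(edges, labels, candidates=()):
    """Drop inter-class edges and add intra-class candidate pairs.

    Edges are treated as undirected, so each pair is stored in sorted order.
    """
    kept = {tuple(sorted((u, v))) for u, v in edges if labels[u] == labels[v]}
    added = {
        tuple(sorted((u, v)))
        for u, v in candidates
        if u != v and labels[u] == labels[v]
    }
    return sorted(kept | added)

# Toy graph: a 4-cycle in which two of the four edges cross classes.
labels = [0, 0, 1, 1]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
before = intra_class_ratio(edges, labels)

new_edges = adjust_edges(edges, labels, candidates=[(0, 2), (1, 0)])
after = intra_class_ratio(new_edges, labels)
print(before, after, new_edges)
```

Swapping `labels` for predicted classes gives the prediction-driven variant; in that case wrong predictions would add noisy edges, which is one reason to restrict edits to confident predictions.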
This summary was generated by AI and reviewed by a human editor.