QUICK REVIEW

[Paper Review] SCL-GNN: Towards Generalizable Graph Neural Networks via Spurious Correlation Learning

Yuxiang Zhang, Enyan Dai|arXiv (Cornell University)|Mar 9, 2026

Advanced Graph Neural Networks|Cited by 0

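ONE-SENTENCE SUMMARY

SCL-GNN introduces a spurious correlation learning module that uses HSIC and Grad-CAM to detect and mitigate spurious feature-label correlations, and optimizes this module jointly with the GNN through bi-level optimization to improve both IID and OOD generalization.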

ABSTRACT

Graph Neural Networks (GNNs) have demonstrated remarkable success across diverse tasks. However, their generalization capability is often hindered by spurious correlations between node features and labels in the graph. Our analysis reveals that GNNs tend to exploit imperceptible statistical correlations in training data, even when such correlations are unreliable for prediction. To address this challenge, we propose the Spurious Correlation Learning Graph Neural Network (SCL-GNN), a novel framework designed to enhance generalization on both Independent and Identically Distributed (IID) and Out-of-Distribution (OOD) graphs. SCL-GNN incorporates a principled spurious correlation learning mechanism, leveraging the Hilbert-Schmidt Independence Criterion (HSIC) to quantify correlations between node representations and class scores. This enables the model to identify and mitigate irrelevant but influential spurious correlations effectively. Additionally, we introduce an efficient bi-level optimization strategy to jointly optimize modules and GNN parameters, preventing overfitting. Extensive experiments on real-world and synthetic datasets demonstrate that SCL-GNN consistently outperforms state-of-the-art baselines under various distribution shifts, highlighting its robustness and generalization capabilities.

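MOTIVATION AND OBJECTIVES

Show that spurious correlations harm GNN generalization in both IID and OOD settings and therefore need to be mitigated.
Propose a principled spurious correlation learning mechanism that quantifies and reduces correlations involving node features that are irrelevant to the target.
Develop a bi-level optimization strategy that jointly trains the spurious correlation learner and the surrogate GNN while avoiding overfitting.
Provide a self-supervised auxiliary task that exploits unlabeled data and stabilizes learning.
Empirically validate the framework on diverse real-world and synthetic datasets, showing improved robustness.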

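PROPOSED METHOD

Define spurious correlations between node features and labels in terms of a spurious feature set and stable correlations.
Use the Hilbert-Schmidt Independence Criterion (HSIC) to quantify the statistical dependence between the model's class scores and the node representations.
Use Grad-CAM to assess feature importance and guide correlation mitigation.
Introduce a spurious correlation learning loss that combines the HSIC and Grad-CAM signals to guide weight updates.
Implement a self-supervised spurious correlation learner that fine-tunes the GNN weights to reduce reliance on spurious features.
Apply a bi-level optimization scheme that trains the spurious correlation learner alongside the GNN while controlling overfitting.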
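
The two signals named in the list above, HSIC and Grad-CAM, can be made concrete with a small sketch. This is not the paper's implementation: it uses the standard biased empirical HSIC estimator, trace(K H L H) / (n - 1)^2, with Gaussian kernels, and a simplified Grad-CAM-style importance, ReLU(activation × gradient), computed per feature. The kernel choice, the bandwidth `sigma`, the finite-difference gradients, the toy data, and every function name below are assumptions made for illustration only.

```python
import math
import random


def rbf_gram(rows, sigma=1.0):
    """Gaussian (RBF) Gram matrix for a list of equal-length vectors."""
    n = len(rows)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
            gram[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return gram


def double_center(gram):
    """Return H K H, where H = I - (1/n) 1 1^T removes row and column means."""
    n = len(gram)
    row_mean = [sum(row) / n for row in gram]
    col_mean = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [[gram[i][j] - row_mean[i] - col_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def hsic(x_rows, y_rows, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(x_rows)
    k_centered = double_center(rbf_gram(x_rows, sigma))
    l_gram = rbf_gram(y_rows, sigma)
    trace = sum(k_centered[i][j] * l_gram[j][i]
                for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


def grad_times_activation(score_fn, activations, eps=1e-5):
    """Grad-CAM-style importance per feature: ReLU(a_k * d score / d a_k).

    Gradients come from central finite differences so that any Python
    scoring function can be plugged in; a real model would use autograd.
    """
    importance = []
    for k, a_k in enumerate(activations):
        up = list(activations)
        down = list(activations)
        up[k] += eps
        down[k] -= eps
        grad_k = (score_fn(up) - score_fn(down)) / (2.0 * eps)
        importance.append(max(0.0, a_k * grad_k))
    return importance


def linear_score(h):
    """Hypothetical class score: the third feature has no influence."""
    return 2.0 * h[0] - 1.0 * h[1] + 0.0 * h[2]


# Toy data (hypothetical): 60 one-dimensional "node representations".
rng = random.Random(0)
reps = [[rng.gauss(0.0, 1.0)] for _ in range(60)]
dependent_scores = [[r[0]] for r in reps]                   # score copies the representation
independent_scores = [[rng.gauss(0.0, 1.0)] for _ in reps]  # score ignores the representation

hsic_dependent = hsic(reps, dependent_scores)
hsic_independent = hsic(reps, independent_scores)
importance = grad_times_activation(linear_score, [1.0, 1.0, 5.0])
```

In this toy setup the HSIC value is larger when the scores are a function of the representations than when they are drawn independently, and the importance vector is nonzero only for the feature whose activation and gradient agree in sign. How SCL-GNN weights and combines the two signals in its loss is not specified in this summary, so no combined loss is sketched here.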

Figure 1: An illustration of node classification task on an academic network for GNNs. (a) The task aims to classify the label $y$ of target node $v$ based on the compact graph $\mathcal{G}_{v}=\{\mathbf{A}_{v},\mathbf{X}_{v}\}$ consisting of its neighbors. (b) Four existing relations and the notion

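EXPERIMENTAL RESULTS

Research questions

RQ1: How does SCL-GNN perform against state-of-the-art methods in addressing spurious correlations?
RQ2: How sensitive is SCL-GNN to the spurious correlation loss weight (beta) and to the learned correlations?
RQ3: Does the proposed bi-level optimization effectively mitigate spurious correlations and improve generalization?
RQ4: Do these mechanisms provide meaningful insight into spurious correlation mitigation across datasets?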

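Key findings

SCL-GNN consistently outperforms baseline methods on OOD data across multiple datasets and backbone networks.
Increasing the spurious correlation loss weight beta generally improves performance up to a point, beyond which overfitting may occur.
Ablation studies show that each learned correlation contributes to robustness; removing some of them lowers performance.
Bi-level optimization keeps test accuracy closely aligned with training accuracy and is more effective than the variant without bi-level optimization.
Visualization analysis shows that SCL-GNN reduces reliance on spurious features by lowering the median weight of spurious features and increasing the variance of weights between spurious and clean features.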
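
The bi-level finding above refers to a nested training structure: an inner problem fits model weights, and an outer problem tunes a second set of parameters based on how the fitted model behaves. The toy below is only a sketch of that structure and not SCL-GNN's algorithm. It swaps the spurious correlation learner for two hypothetical feature gates and the GNN for a ridge-regularized linear model, and it assumes that the outer objective is evaluated on a held-out split in which the spurious feature no longer tracks the label. Hypergradients are taken by finite differences with a full inner re-fit, which is affordable only at this scale; all names and data are invented for illustration.

```python
def predict(weights, gates, x):
    """Linear prediction on gated features: sum_j gates[j] * weights[j] * x[j]."""
    return sum(g * w * xj for g, w, xj in zip(gates, weights, x))


def mse(weights, gates, data):
    """Mean squared error over (features, target) pairs."""
    return sum((predict(weights, gates, x) - y) ** 2 for x, y in data) / len(data)


def fit_inner(gates, train, ridge=0.1, lr=0.1, steps=200):
    """Inner problem: gradient descent on ridge-regularized training error, gates frozen."""
    weights = [0.0] * len(gates)
    for _ in range(steps):
        grads = [2.0 * ridge * w for w in weights]
        for x, y in train:
            err = predict(weights, gates, x) - y
            for j in range(len(weights)):
                grads[j] += 2.0 * err * gates[j] * x[j] / len(train)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


def outer_loss(gates, train, heldout):
    """Outer objective: held-out error of the weights returned by the inner problem."""
    return mse(fit_inner(gates, train), gates, heldout)


def fit_outer(train, heldout, lr=0.1, steps=60, eps=1e-4):
    """Outer problem: projected gradient descent on the gates, kept in [0, 1]."""
    gates = [1.0, 1.0]
    for _ in range(steps):
        hypergrad = []
        for j in range(len(gates)):
            up = list(gates)
            down = list(gates)
            up[j] += eps
            down[j] -= eps
            diff = outer_loss(up, train, heldout) - outer_loss(down, train, heldout)
            hypergrad.append(diff / (2.0 * eps))
        gates = [min(1.0, max(0.0, g - lr * h)) for g, h in zip(gates, hypergrad)]
    return gates


# Each sample is ((clean_feature, spurious_feature), label); the label equals the clean feature.
# Training split: the spurious feature copies the clean one.
train = [((1.0, 1.0), 1.0), ((-1.0, -1.0), -1.0)]
# Held-out split: the spurious feature is uncorrelated with the label.
heldout = [((1.0, 1.0), 1.0), ((1.0, -1.0), 1.0),
           ((-1.0, 1.0), -1.0), ((-1.0, -1.0), -1.0)]

initial_loss = outer_loss([1.0, 1.0], train, heldout)
learned_gates = fit_outer(train, heldout)
final_loss = outer_loss(learned_gates, train, heldout)
```

On this data the outer loop drives the gate on the spurious feature toward zero while the gate on the clean feature stays at its upper bound, and the held-out error falls relative to the ungated model. Re-solving the inner problem for every hypergradient does not scale to neural models, so practical bi-level methods typically rely on cheaper approximations; the abstract describes SCL-GNN's strategy as efficient but this summary does not detail it.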

Figure 2: Illustration of SCL-GNN framework for generalizing GNN via spurious correlation learning. (a) and (b) illustrate the details of the spurious correlation (abbreviated as SC) learner $f_{a}$ and the backbone GNN model $f_{s}$, respectively, (c) represents an overview of the framework, (d) and (e) re

This review was generated by AI and reviewed by human editors.