QUICK REVIEW

[Paper Review] Uncertainty Quantification over Graph with Conformalized Graph Neural Networks

Kexin Huang, Ying Jin|arXiv (Cornell University)|May 23, 2023

Advanced Graph Neural Networks | Cited by 13

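One-Sentence Summary

CF-GNN extends conformal prediction to graph-structured data, producing prediction sets and intervals with guaranteed target coverage while reducing their size by up to 74% relative to baselines.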

ABSTRACT

Graph Neural Networks (GNNs) are powerful machine learning prediction models on graph-structured data. However, GNNs lack rigorous uncertainty estimates, limiting their reliable deployment in settings where the cost of errors is significant. We propose conformalized GNN (CF-GNN), extending conformal prediction (CP) to graph-based models for guaranteed uncertainty estimates. Given an entity in the graph, CF-GNN produces a prediction set/interval that provably contains the true label with pre-defined coverage probability (e.g. 90%). We establish a permutation invariance condition that enables the validity of CP on graph data and provide an exact characterization of the test-time coverage. Moreover, besides valid coverage, it is crucial to reduce the prediction set size/interval length for practical use. We observe a key connection between non-conformity scores and network structures, which motivates us to develop a topology-aware output correction model that learns to update the prediction and produces more efficient prediction sets/intervals. Extensive experiments show that CF-GNN achieves any pre-defined target marginal coverage while significantly reducing the prediction set/interval size by up to 74% over the baselines. It also empirically achieves satisfactory conditional coverage over various raw and network features.

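Research Motivation and Goals

Address the lack of rigorous uncertainty estimates in graph neural networks (GNNs).
Extend conformal prediction (CP) to the transductive graph setting to obtain guaranteed coverage.
Characterize test-time coverage under graph exchangeability and permutation invariance.
Introduce a topology-aware correction that reduces the inefficiency of CP.
Demonstrate empirical validity and efficiency gains on diverse graph datasets.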

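Proposed Method

Use split conformal prediction with a non-conformity score V that is invariant to the ordering of calibration and test samples (Assumption 1).
Prove validity: under graph exchangeability, CP attains coverage of at least 1-α, and the test-time coverage is characterized exactly (Theorem 3).
Introduce CF-GNN: a correction GNN that post-processes the base GNN's predictions, using neighbor information to reduce the inefficiency of CP.
Train the correction GNN with a differentiable inefficiency loss that simulates the CP set size or interval length on a held-out calibration split.
Provide two commonly used non-conformity scores as instances of V: Adaptive Prediction Sets (APS) for classification and conformalized quantile regression (CQR) for regression.
Remain compatible with any pre-trained GNN without changing the base training procedure.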
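
The split conformal step listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: it uses a simplified, non-randomized variant of the APS score, and the function names (`aps_score`, `conformal_quantile`, `prediction_set`) and the toy probabilities are assumptions introduced here.

```python
import math

def aps_score(probs, label):
    """Non-randomized APS score: total probability of classes ranked at or above `label`."""
    p_label = probs[label]
    # Classes with probability >= p_label are ranked at or above the label (ties included).
    return sum(p for p in probs if p >= p_label)

def conformal_quantile(cal_scores, alpha):
    """Return the ceil((n+1)(1-alpha))-th smallest calibration score."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the set must include every label.
        return math.inf
    return sorted(cal_scores)[k - 1]

def prediction_set(probs, q_hat):
    """All labels whose score does not exceed the calibrated threshold."""
    return [y for y in range(len(probs)) if aps_score(probs, y) <= q_hat]

# Toy usage with made-up probabilities (not data from the paper).
cal_probs = [[0.5, 0.25, 0.25], [0.25, 0.5, 0.25], [0.125, 0.125, 0.75]]
cal_labels = [0, 2, 2]
cal_scores = [aps_score(p, y) for p, y in zip(cal_probs, cal_labels)]
q_hat = conformal_quantile(cal_scores, alpha=0.5)
test_set = prediction_set([0.5, 0.25, 0.25], q_hat)
```

In this sketch the probabilities would come from the base GNN (or, in CF-GNN, from the correction GNN's output); the calibration step itself does not depend on how they were produced.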

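Experimental Results

Research Questions

RQ1: Can conformal prediction provide valid uncertainty sets with target coverage for GNNs on transductive graphs?
RQ2: Under graph-based data splits and permutation invariance, what is the exact form of the test-time coverage distribution?
RQ3: How can the inefficiency of CP on graphs (set size or interval length) be reduced without sacrificing coverage?
RQ4: Can a topology-aware correction that aggregates neighborhood information improve CP efficiency across diverse datasets and GNN architectures?
RQ5: Do CF-GNN's predictions maintain satisfactory (approximate) conditional coverage over network features?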
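
As background for RQ1, the textbook marginal guarantee of split conformal prediction for exchangeable scores can be written as below. This is the standard statement, not the paper's Theorem 3; the symbols n (calibration size), V_i (calibration scores), \hat{q} and C are notation assumed here for illustration, and the upper bound holds when the scores are almost surely distinct.

```latex
\hat{q} = \text{the } \lceil (n+1)(1-\alpha) \rceil\text{-th smallest of } V_1, \dots, V_n,
\qquad
C(X_{n+1}) = \{\, y : V(X_{n+1}, y) \le \hat{q} \,\}

1 - \alpha \;\le\; \mathbb{P}\bigl( Y_{n+1} \in C(X_{n+1}) \bigr) \;\le\; 1 - \alpha + \frac{1}{n+1}
```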

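Main Findings

CF-GNN achieves the target empirical marginal coverage on datasets where baseline UQ methods fail to reach it, demonstrating statistical validity.
Prediction set sizes and interval lengths are significantly reduced (by up to 74%) compared with applying CP directly to the base GNN.
The method preserves permutation invariance and graph exchangeability, which supports valid coverage guarantees.
CF-GNN shows satisfactory conditional coverage across a range of graph features, indicating practical reliability.
The efficiency gains generalize across multiple GNN architectures, beyond the base model used in the main experiments.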
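
The two quantities reported above, empirical marginal coverage and set-size efficiency, can be computed as in the following sketch. The helper names and the toy prediction sets are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def empirical_coverage(pred_sets, labels):
    """Fraction of test points whose true label falls inside its prediction set."""
    hits = sum(1 for s, y in zip(pred_sets, labels) if y in s)
    return hits / len(labels)

def average_set_size(pred_sets):
    """Mean number of labels per prediction set (smaller means more efficient)."""
    return sum(len(s) for s in pred_sets) / len(pred_sets)

def relative_reduction(baseline_size, improved_size):
    """Relative shrinkage of the average set size with respect to a baseline."""
    return (baseline_size - improved_size) / baseline_size

# Toy usage with made-up sets and labels.
toy_sets = [[0], [0, 1], [2], [1, 2]]
toy_labels = [0, 1, 1, 2]
coverage = empirical_coverage(toy_sets, toy_labels)
avg_size = average_set_size(toy_sets)
```

A method is considered valid when `coverage` is close to or above the target 1-α, and more efficient when `avg_size` is smaller at the same coverage level.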

This review was generated by AI and reviewed by a human editor.