QUICK REVIEW

[Paper Review] Uncertainty in Graph Neural Networks: A Survey

Fangxin Wang, Yuqing Liu|arXiv (Cornell University)|Mar 11, 2024
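Neural Networks and Applications|Cited by 7
One-Sentence Summary

A comprehensive survey of uncertainty in graph neural networks (GNNs) that discusses in detail the sources of uncertainty, quantification methods, evaluation metrics, and how to use uncertainty in downstream tasks.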

ABSTRACT

Graph Neural Networks (GNNs) have been extensively used in various real-world applications. However, the predictive uncertainty of GNNs stemming from diverse sources such as inherent randomness in data and model training errors can lead to unstable and erroneous predictions. Therefore, identifying, quantifying, and utilizing uncertainty are essential to enhance the performance of the model for the downstream tasks as well as the reliability of the GNN predictions. This survey aims to provide a comprehensive overview of the GNNs from the perspective of uncertainty with an emphasis on its integration in graph learning. We compare and summarize existing graph uncertainty theory and methods, alongside the corresponding downstream tasks. Thereby, we bridge the gap between theory and practice, meanwhile connecting different GNN communities. Moreover, our work provides valuable insights into promising directions in this field.

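Motivation and Objectives

  • Identify and categorize the sources of predictive uncertainty in GNNs (aleatoric, epistemic, distributional, and others).
  • Review methods for quantifying uncertainty in GNNs, including single deterministic models, single models with random parameters, and ensemble models.
  • Map how quantified uncertainty is used to improve downstream graph tasks and reliability (e.g., active learning, OOD detection, anomaly detection).
  • Bridge theoretical uncertainty frameworks and practical graph learning applications, and outline future research directions.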

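Proposed Approach

  • Groups uncertainty quantification methods into three categories: single deterministic models (direct, Bayesian-based, frequentist-based), single models with random parameters (Bayesian methods, MC sampling), and ensemble/other methods.
  • Discusses uncertainty evaluation measures (predictive uncertainty, model/data uncertainty via mutual information, distributional uncertainty for OOD detection) and notes the lack of a universal standard.
  • Describes how uncertainty is used for node selection in graph active learning and self-training, including entropy, information density, and diversity measures.
  • Surveys uncertainty-aware GNN modeling at the node, edge, and graph levels, including mechanisms such as DropEdge, graph Gaussian processes, and Bayesian GNNs.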
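The split of predictive uncertainty into data and model parts via mutual information can be illustrated with a small sketch. It assumes we already have several class-probability vectors for one node, for example from repeated stochastic forward passes or from ensemble members; the function names and the toy inputs are hypothetical and are not taken from the survey.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose_uncertainty(samples):
    """Split predictive uncertainty for a single node.

    samples: list of class-probability vectors, one per stochastic
    forward pass or ensemble member (an assumption of this sketch).
    Returns (total, data, model), where
      total = entropy of the averaged prediction,
      data  = average entropy of the individual predictions,
      model = total - data, i.e. the mutual information between the
              prediction and the sampled model.
    """
    n_samples = len(samples)
    n_classes = len(samples[0])
    mean = [sum(s[c] for s in samples) / n_samples for c in range(n_classes)]
    total = entropy(mean)
    data = sum(entropy(s) for s in samples) / n_samples
    return total, data, total - data


# Hypothetical predictions for two nodes (three sampled models each).
agree = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]      # every sample is equally unsure
disagree = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]   # the samples contradict each other

for name, node in (("agree", agree), ("disagree", disagree)):
    total, data, model = decompose_uncertainty(node)
    print(f"{name}: total={total:.3f} data={data:.3f} model={model:.3f}")
```

Both toy nodes have the same averaged prediction and therefore the same total uncertainty, but only the second one has a clearly positive model term, because its sampled predictions disagree with each other.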
Figure 1: Overall Framework: (1) identifying sources of uncertainty (Section 2), (2) quantifying uncertainty (Section 3) and (3) utilizing uncertainty for downstream tasks (Section 4).

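Experimental Results

Research Questions

  • RQ1: What are the different sources of uncertainty in graph neural networks, and how are they categorized (aleatoric, epistemic, distributional, vacuity, dissonance, etc.)?
  • RQ2: What methods exist for quantifying uncertainty in GNNs, and how do they differ in modeling assumptions and computational cost?
  • RQ3: How can quantified uncertainty be used effectively in downstream graph tasks (e.g., active learning, anomaly detection, OOD detection, robustness)?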

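Key Findings

  • Uncertainty in GNNs arises from data randomness, model training errors, and distribution shift, and can be divided into aleatoric, epistemic, and distributional sources.
  • Bayesian and ensemble methods provide richer uncertainty estimates but at higher computational cost; direct and frequentist methods are more efficient but differ in calibration quality.
  • Uncertainty evaluation varies by task and source; measures include maximum softmax probability, entropy, mutual information, and calibration-based metrics, but there is no universal standard.
  • Uncertainty-aware techniques at the node, edge, and graph levels (e.g., ConfGCN, DropEdge, UaGGP) improve robustness, OOD detection, and interpretability.
  • Active learning and self-training on graphs benefit from uncertainty signals, which are usually combined with representativeness and diversity to mitigate distribution shift.
  • A systematic framework is needed for selecting and applying quantification methods in anomaly detection tasks (OOD, outlier, and misclassification detection).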
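As one concrete instance of the calibration-based measures mentioned in the findings, the sketch below computes the expected calibration error (ECE), using the maximum softmax probability as the confidence score. The equal-width binning, the default of 10 bins, and the toy inputs are assumptions of this example, not choices prescribed by the survey.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """ECE with equal-width confidence bins.

    probs:  list of class-probability vectors, one per node.
    labels: list of true class indices.
    The confidence of a prediction is its maximum softmax probability.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        conf = max(p)
        pred = p.index(conf)
        # Clamp so that a confidence of exactly 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, 1.0 if pred == y else 0.0))

    n = len(labels)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(a for _, a in b) / len(b)
        # Weight each bin's |accuracy - confidence| gap by its share of nodes.
        ece += len(b) / n * abs(accuracy - avg_conf)
    return ece


# Hypothetical case: four nodes predicted with 90% confidence, three correct.
probs = [[0.9, 0.1]] * 4
labels = [0, 0, 0, 1]
print(expected_calibration_error(probs, labels))  # roughly 0.15 (|0.75 - 0.90|)
```

A lower value means the stated confidence tracks the observed accuracy more closely; the result depends on the number of bins, which is one reason such scores are hard to compare across papers.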
Figure 2: Bridge uncertainty quantification models and evaluation methods by uncertainty sources. The diamond shape represents the separation of uncertainty sources. Quantification models linked to any diamond indicate their ability to separate the corresponding uncertainty source. We merge "Model U


This review was generated by AI and checked by a human editor.