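[Paper Summary] InfoGCL: Information-Aware Graph Contrastive Learning
InfoGCL provides an information-theoretic framework for graph contrastive learning that decouples view augmentation, view encoding, and representation contrasting. It unifies prior methods and achieves strong results on graph and node classification tasks.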
Various graph contrastive learning models have been proposed to improve the performance of learning tasks on graph datasets in recent years. While effective and prevalent, these models are usually carefully customized. In particular, although all recent studies create two contrastive views, they differ greatly in view augmentations, architectures, and objectives. It remains an open question how to build your graph contrastive learning model from scratch for particular graph learning tasks and datasets. In this work, we aim to fill this gap by studying how graph information is transformed and transferred during the contrastive learning process and proposing an information-aware graph contrastive learning framework called InfoGCL. The key point of this framework is to follow the Information Bottleneck principle to reduce the mutual information between contrastive parts while keeping task-relevant information intact at both the levels of the individual module and the entire framework so that the information loss during graph representation learning can be minimized. We show for the first time that all recent graph contrastive learning methods can be unified by our framework. We empirically validate our theoretical analysis on both node and graph classification benchmark datasets, and demonstrate that our algorithm significantly outperforms the state of the art.
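Motivation and Objectives
- Study how graph information is transformed and transferred during contrastive learning.
- Propose an information-theoretic framework (InfoGCL) that minimizes information loss at both the module level and the framework level while retaining task-relevant information.
- Unify existing graph contrastive learning methods under a common principle and provide practical guidelines for module design.
- Analyze the role of negative samples in graph contrastive learning and assess when they help.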
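Proposed Method
- Decompose graph contrastive learning into three stages: view augmentation, view encoding, and representation contrasting.
- Derive optimality corollaries for the views, the encoders, and the contrastive mode from the Information Bottleneck principle.
- Define the optimal augmented views: minimize I(v_i; v_j) while keeping I(v_i; y) = I(v_j; y) = I(G; y), where G is the input graph and y is the downstream task label.
- Define the optimal view encoder: minimize I(f_i(v_i); v_i) subject to I(f_i(v_i); v_j) = I(v_i; v_j).
- Define the optimal contrastive mode: choose c_i, c_j through a mutual-information criterion so that as much task-relevant information as possible is retained.
- Evaluate a range of graph view augmentations (node dropping, edge perturbation, attribute masking, subgraph sampling) and contrastive modes (global-global, local-global, local-local, multi-scale, hybrid).
- Examine whether negative samples are needed by adopting a negative-free, SimSiam-style loss and comparing it with variants that use negatives.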
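The three stages can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`perturb_edges`, `mask_attributes`, `encode`, `contrast_scores`), the toy ring graph, the drop and mask probabilities, and the single untrained GCN-style layer are all assumptions chosen for illustration. It shows two of the augmentations named above, one shared encoder, and two of the contrastive modes (global-global and local-global) scored with cosine similarity.

```python
import numpy as np

def perturb_edges(adj, drop_prob, rng):
    """Edge perturbation: randomly drop undirected edges, keeping symmetry."""
    upper = np.triu(adj, k=1)                      # each undirected edge once
    keep = rng.random(upper.shape) >= drop_prob
    upper = upper * keep
    return upper + upper.T

def mask_attributes(x, mask_prob, rng):
    """Attribute masking: zero out randomly chosen feature columns."""
    keep = rng.random(x.shape[1]) >= mask_prob
    return x * keep                                # broadcasts over nodes

def encode(adj, x, weight):
    """One GCN-style layer (an assumption): ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])             # self-loops keep degrees >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)

def cosine(a, b, eps=1e-8):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def contrast_scores(h_i, h_j):
    """Agreement between two encoded views under two contrastive modes."""
    g_i, g_j = h_i.mean(axis=0), h_j.mean(axis=0)  # mean-pooled graph readouts
    return {
        "global-global": cosine(g_i, g_j),
        "local-global": float(np.mean([cosine(h, g_j) for h in h_i])),
    }

rng = np.random.default_rng(0)

# Toy input graph G: a 5-node ring with random 4-dimensional node features.
n = 5
adj = np.zeros((n, n))
for k in range(n):
    adj[k, (k + 1) % n] = adj[(k + 1) % n, k] = 1.0
x = rng.normal(size=(n, 4))
weight = rng.normal(size=(4, 3))                   # untrained encoder weights

# Stage 1: view augmentation (one structural view, one attribute view).
v_i_adj, v_i_x = perturb_edges(adj, 0.2, rng), x
v_j_adj, v_j_x = adj, mask_attributes(x, 0.25, rng)

# Stage 2: view encoding with a shared encoder.
h_i = encode(v_i_adj, v_i_x, weight)
h_j = encode(v_j_adj, v_j_x, weight)

# Stage 3: representation contrasting under different modes.
scores = contrast_scores(h_i, h_j)
print(scores)
```

In a real training loop the encoder weights would be learned by optimizing a contrastive objective over these scores; the sketch only shows how the stages compose.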
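Experimental Results
Research Questions
- RQ1: How should augmented views be chosen so that task-relevant information is maximized while shared task-irrelevant information is minimized?
- RQ2: Which encoders best preserve the shared, task-relevant information after encoding?
- RQ3: Given optimal views and encoders, which contrastive mode best preserves information for the downstream task?
- RQ4: Do negative samples materially affect performance in graph contrastive learning across tasks and datasets?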
Key Findings
Graph classification accuracy (%) on benchmark datasets; a dash marks a result that is not reported.
| Method | MUTAG | PTC-MR | IMDB-B | IMDB-M | NCI1 | COLLAB |
|---|---|---|---|---|---|---|
| Kernel – SP | 85.2 ± 2.4 | 58.2 ± 2.4 | 55.6 ± 0.2 | 38.0 ± 0.3 | 73.5 ± 0.1 | - |
| Kernel – GK | 81.7 ± 2.1 | 57.3 ± 1.4 | 65.9 ± 1.0 | 43.9 ± 0.4 | 66.0 ± 0.1 | 72.8 ± 0.3 |
| Kernel – WL | 80.7 ± 3.0 | 58.0 ± 0.5 | 72.3 ± 3.4 | 47.0 ± 0.5 | 80.0 ± 0.5 | 78.9 ± 1.9 |
| Kernel – DGK | 87.4 ± 2.7 | 60.1 ± 2.6 | 67.0 ± 0.6 | 44.6 ± 0.5 | 80.3 ± 0.5 | 73.1 ± 0.3 |
| Kernel – MLG | 87.9 ± 1.6 | 63.3 ± 1.5 | 66.6 ± 0.3 | 41.2 ± 0.0 | 80.8 ± 1.3 | - |
| Supervised – GraphSAGE | 85.1 ± 7.6 | 63.9 ± 7.7 | 72.3 ± 5.3 | 50.9 ± 2.2 | 77.7 ± 1.5 | 68.3 ± 4.2 |
| Supervised – GCN | 85.6 ± 5.8 | 64.2 ± 4.3 | 74.0 ± 3.4 | 51.9 ± 3.8 | 80.2 ± 2.0 | 79.0 ± 1.8 |
| Supervised – GIN-0 | 89.4 ± 5.6 | 64.6 ± 7.0 | 75.1 ± 5.1 | 52.3 ± 2.8 | 82.7 ± 1.7 | 80.2 ± 1.9 |
| Supervised – GIN-ε | 89.0 ± 6.0 | 63.7 ± 8.2 | 74.3 ± 5.1 | 52.1 ± 3.6 | 82.7 ± 1.6 | 80.1 ± 1.9 |
| Supervised – GAT | 89.4 ± 6.1 | 66.7 ± 5.1 | 70.5 ± 2.3 | 47.8 ± 3.1 | 66.6 ± 2.2 | 67.4 ± 2.9 |
| Unsupervised – RandomWalk | 83.7 ± 1.5 | 57.9 ± 1.3 | 50.7 ± 0.3 | 34.7 ± 0.2 | 64.3 ± 0.3 | - |
| Unsupervised – node2vec | 72.6 ± 10.2 | 58.6 ± 8.0 | 50.2 ± 0.9 | 36.0 ± 0.7 | 54.9 ± 1.6 | 56.1 ± 0.2 |
| Unsupervised – sub2vec | 61.1 ± 15.8 | 60.0 ± 6.4 | 55.3 ± 1.5 | 36.7 ± 0.8 | 52.8 ± 1.5 | - |
| Unsupervised – graph2vec | 83.2 ± 9.6 | 60.2 ± 6.9 | 71.1 ± 0.5 | 50.4 ± 0.9 | 75.4 ± 1.2 | - |
| Unsupervised – InfoGraph | 89.0 ± 1.1 | 61.7 ± 1.4 | 73.0 ± 0.9 | 49.7 ± 0.5 | 76.2 ± 1.4 | 70.7 ± 1.1 |
| Unsupervised – GraphCL | 86.8 ± 1.3 | 61.3 ± 2.1 | 71.1 ± 0.4 | 49.2 ± 0.6 | 77.9 ± 0.4 | 71.4 ± 1.2 |
| Unsupervised – MVGRL | 89.7 ± 1.1 | 62.5 ± 1.7 | 74.2 ± 0.7 | 51.2 ± 0.5 | 77.0 ± 0.8 | 76.0 ± 1.2 |
| InfoGCL (reported) | 91.2 ± 1.3 | 63.5 ± 1.5 | 75.1 ± 0.9 | 51.4 ± 0.8 | 80.2 ± 0.6 | 80.0 ± 1.3 |
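To make the comparison in the table concrete, the following sketch computes InfoGCL's margin over the best of the three contrastive baselines (InfoGraph, GraphCL, MVGRL) on each dataset. The mean accuracies are copied from the table; the variable names and the choice of these three baselines as the reference are assumptions made for illustration, and the standard deviations are ignored, so small gaps should not be read as significant.

```python
# Mean accuracy (%) copied from the table, in dataset order.
datasets = ["MUTAG", "PTC-MR", "IMDB-B", "IMDB-M", "NCI1", "COLLAB"]
baselines = {
    "InfoGraph": [89.0, 61.7, 73.0, 49.7, 76.2, 70.7],
    "GraphCL":   [86.8, 61.3, 71.1, 49.2, 77.9, 71.4],
    "MVGRL":     [89.7, 62.5, 74.2, 51.2, 77.0, 76.0],
}
infogcl = [91.2, 63.5, 75.1, 51.4, 80.2, 80.0]

# For each dataset: (absolute gain in points, relative gain in percent)
# over the strongest of the three contrastive baselines.
margins = {}
for k, name in enumerate(datasets):
    best = max(scores[k] for scores in baselines.values())
    gain = infogcl[k] - best
    margins[name] = (round(gain, 1), round(100.0 * gain / best, 2))

largest = max(margins, key=lambda name: margins[name][0])

for name, (gain, rel) in margins.items():
    print(f"{name}: +{gain} points ({rel}% relative)")
print("largest gap:", largest)
```

On these numbers the absolute gap ranges from 0.2 points on IMDB-M to 4.0 points on COLLAB.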
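- InfoGCL achieves results competitive with state-of-the-art methods on node and graph classification benchmarks.
- On graph classification, InfoGCL matches or beats the leading unsupervised methods and on some datasets surpasses supervised methods, reaching for example 91.2% on MUTAG, 63.5% on PTC-MR, and 75.1% on IMDB-B (examples from Table 2).
- InfoGCL outperforms most baselines across datasets, with notable gains on graph classification (for example, a relative improvement of about 5.2% in some settings).
- Negative samples are not universally necessary: removing them preserves performance on several graph datasets, but performance may drop slightly on node-level tasks, especially on sparse graphs such as Cora, Citeseer, and Pubmed (Table 4).
- A unified view emerges: most recent graph contrastive learning methods can be interpreted as instances of InfoGCL's three-stage framework guided by the Information Bottleneck principle.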
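The difference between an objective with negatives and a negative-free one can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not the paper's training code: `info_nce` is a standard InfoNCE-style loss in which the other rows of the batch act as negatives, while `negative_free` keeps only the positive-pair cosine term and omits the predictor head and stop-gradient that SimSiam relies on to avoid collapse. The function names, the temperature, and the random embeddings are hypothetical.

```python
import numpy as np

def l2_normalize(z, eps=1e-8):
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + eps)

def info_nce(z_i, z_j, temperature=0.5):
    """Loss with negatives: row k of z_i should match row k of z_j,
    and every other row of z_j serves as a negative."""
    z_i, z_j = l2_normalize(z_i), l2_normalize(z_j)
    logits = z_i @ z_j.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

def negative_free(z_i, z_j):
    """Loss without negatives: negative mean cosine similarity of positive pairs."""
    z_i, z_j = l2_normalize(z_i), l2_normalize(z_j)
    return float(-np.mean(np.sum(z_i * z_j, axis=1)))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))                      # embeddings of view i
aligned = z + 0.05 * rng.normal(size=z.shape)     # view j: small perturbation
mismatched = np.roll(z, 1, axis=0)                # view j: wrong pairing

for name, other in [("aligned", aligned), ("mismatched", mismatched)]:
    print(name, "InfoNCE:", round(info_nce(z, other), 3),
          "negative-free:", round(negative_free(z, other), 3))
```

Both losses are lower for correctly paired views than for mismatched ones; only the first explicitly pushes non-matching pairs apart, which is the ingredient the negative-sample ablation removes.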
This summary was generated by AI and reviewed by a human editor.