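[Paper Summary] Layer-diverse Negative Sampling for Graph Neural Networks
Introduces layer-diverse negative sampling based on determinantal point processes, together with space squeezing, to reduce the redundancy of negative samples across GNN layers, improve expressiveness, and alleviate over-squashing.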
Graph neural networks (GNNs) are a powerful solution for various structure learning applications due to their strong representation capabilities for graph data. However, traditional GNNs, relying on message-passing mechanisms that gather information exclusively from first-order neighbours (known as positive samples), can lead to issues such as over-smoothing and over-squashing. To mitigate these issues, we propose a layer-diverse negative sampling method for message-passing propagation. This method employs a sampling matrix within a determinantal point process, which transforms the candidate set into a space and selectively samples from this space to generate negative samples. To further enhance the diversity of the negative samples during each forward pass, we develop a space-squeezing method to achieve layer-wise diversity in multi-layer GNNs. Experiments on various real-world graph datasets demonstrate the effectiveness of our approach in improving the diversity of negative samples and overall learning performance. Moreover, adding negative samples dynamically changes the graph's topology, which gives the method strong potential to improve the expressiveness of GNNs and reduce the risk of over-squashing.
Research Motivation and Objectives
- Motivate and address limitations of positive-sample-only GNNs (over-smoothing, limited expressivity, over-squashing).
- Propose layer-diverse negative sampling to enhance sample diversity and graph topology dynamics.
- Develop a computationally practical pipeline combining shortest-path candidate sets, DPP-based sampling, and space squeezing to generate layered negatives.
- Empirically validate LDGCN across diverse real-world graph datasets, showing improved performance and reduced sample redundancy.
Proposed Method
- Use a shortest-path-based candidate set S_i to limit negative sampling overhead (Algorithm 1).
- Build the sampling matrix L^i of a determinantal point process (DPP) over the candidate set S_i, so that sampling favours diverse negatives (Eq. 5).
- Decompose L^i into quality and diversity terms via the quality-diversity decomposition (Eq. 5–7).
- Apply space squeezing to V (the eigenspace) so that negatives selected at the previous layer are less likely to be re-selected (Eq. 10, Remarks 3.1–3.2).
- Perform k-DPP sampling on the layer-diverse matrix V' to select final negative samples (Algorithm 2).
- Integrate negative samples into GCN-style message passing (Eq. 4): $h_i^{(l)} = \sum_{j \in N_i \cup \{i\}} w_{ij} h_j^{(l-1)} - \mu \sum_{\bar{j} \in \bar{N}_i} w_{i\bar{j}} h_{\bar{j}}^{(l-1)}$.
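The steps above can be made more concrete with a small NumPy sketch. It is not the paper's implementation, and every function name in it is hypothetical. It assumes cosine similarity for the diversity term, uses a greedy log-determinant selection as a deterministic stand-in for k-DPP sampling, uses uniform averaging weights for w_ij in Eq. 4, and omits space squeezing because Eq. 10 is not reproduced in this summary.

```python
import numpy as np

def build_l_ensemble(features, quality):
    """Quality-diversity kernel: L = diag(q) @ S @ diag(q).

    S is the cosine similarity between candidate features (an assumption).
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    similarity = unit @ unit.T
    return quality[:, None] * similarity * quality[None, :]

def greedy_diverse_subset(L, k):
    """Greedily grow a subset that maximises log det(L_subset).

    This is a deterministic stand-in for k-DPP sampling, not a sampler.
    """
    selected = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for c in range(L.shape[0]):
            if c in selected:
                continue
            idx = selected + [c]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            score = logdet if sign > 0 else -np.inf
            if score > best_score:
                best, best_score = c, score
        if best is None:  # no remaining candidate keeps the determinant positive
            break
        selected.append(best)
    return selected

def propagate(h, neighbors, negatives, mu):
    """Eq. 4 with uniform weights (assumed): mean over N_i and i, minus mu * mean over negatives."""
    out = np.zeros_like(h)
    for i in range(h.shape[0]):
        positive = list(neighbors[i]) + [i]
        out[i] = h[positive].mean(axis=0)
        if negatives[i]:
            out[i] -= mu * h[list(negatives[i])].mean(axis=0)
    return out

# Toy candidates: 0 and 1 are near-duplicates, 2 is orthogonal to both.
cand_features = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]])
cand_quality = np.array([1.0, 0.9, 0.8])
L = build_l_ensemble(cand_features, cand_quality)
print(greedy_diverse_subset(L, 2))  # picks 0 and 2, skipping the near-duplicate 1
```

In this toy case the selection skips candidate 1 even though its quality is higher than that of candidate 2, because including two nearly parallel vectors collapses the determinant. That is the behaviour the quality-diversity decomposition is meant to encourage.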
Experimental Results
Research Questions
- RQ1: Do layer-diverse negative samples improve GNN performance compared to baseline negative sampling methods?
- RQ2: Does the proposed sampling reduce redundancy across layers while maintaining or improving representational quality?
- RQ3: Can layer-diverse negative sampling alleviate over-smoothing and over-squashing in multi-layer GNNs?
- RQ4: How does LDGCN perform across various architectures and datasets (homophilous vs heterophilous)?
Key Findings
| Method | Citeseer | Cora | PubMed | CS | Computers | Photo | ogbn-arxiv |
|---|---|---|---|---|---|---|---|
| GCN | 55.78 ± 5.69 | 63.39 ± 7.92 | 72.24 ± 4.34 | 54.00 ± 3.69 | 47.21 ± 6.22 | 68.04 ± 6.37 | 70.57 ± 1.02 |
| GATv2 | 63.67 ± 7.07 | 74.43 ± 3.80 | 74.95 ± 1.71 | 85.00 ± 1.55 | 61.90 ± 5.38 | 79.08 ± 3.43 | 70.60 ± 8.6 |
| SAGE | 59.70 ± 8.87 | 73.13 ± 3.54 | 75.48 ± 1.94 | 82.22 ± 2.60 | 59.27 ± 7.85 | 79.01 ± 6.54 | 71.15 ± 1.00 |
| GIN-ε | 60.89 ± 1.97 | 68.07 ± 8.87 | 72.93 ± 5.09 | 59.00 ± 9.52 | 37.09 ± 2.21 | 31.56 ± 6.91 | 35.04 ± 5.33 |
| AERO | 62.35 ± 4.88 | 73.37 ± 6.83 | 72.80 ± 3.50 | 64.50 ± 15.70 | 50.20 ± 10.0 | 56.61 ± 14.54 | 70.04 ± 9.1 |
| RGCN | 62.82 ± 3.84 | 71.75 ± 3.64 | 74.96 ± 1.40 | 79.91 ± 3.50 | 56.44 ± 9.78 | 75.19 ± 8.60 | 71.19 ± 0.42 |
| MCGCN | 50.90 ± 9.70 | 69.28 ± 4.33 | 71.44 ± 4.09 | 80.66 ± 3.81 | 64.09 ± 7.27 | 73.01 ± 9.54 | 65.49 ± 0.26 |
| PGCN | 63.03 ± 4.87 | 70.37 ± 4.51 | 75.47 ± 1.78 | 52.73 ± 11.14 | 71.13 ± 6.27 | 79.26 ± 6.67 | 66.16 ± 0.45 |
| D2GCN | 63.30 ± 2.01 | 73.02 ± 3.01 | 75.36 ± 1.82 | 83.47 ± 2.94 | 74.19 ± 2.06 | 82.78 ± 4.23 | 71.46 ± 2.1 |
| LDGCN | 68.27 ± 1.29 | 76.80 ± 1.26 | 77.07 ± 1.23 | 86.23 ± 0.55 | 77.92 ± 2.34 | 86.50 ± 1.48 | 71.66 ± 0.30 |
- LDGCN consistently improves accuracy over baseline GCN variants across seven benchmark datasets in multi-layer settings (2–6 layers).
- The layer-diverse sampling reduces cross-layer negative sample overlap, increasing diversity and information coverage.
- In the table above, LDGCN scores higher than the state-of-the-art negative-sampling baselines (RGCN, MCGCN, PGCN, D2GCN) on all seven datasets.
- Layer-diverse negatives effectively modify graph topology during learning, with potential to mitigate over-squashing and enhance expressivity.
- LDGCN achieves strong results on both homophilous and heterophilous graphs, demonstrating architectural compatibility (LD-GCN, LD-GATv2, LD-SAGE, LD-GIN).
- Computational cost is kept manageable by restricting the DPP to a shortest-path candidate set, which shrinks the matrix that must go through the expensive eigendecomposition step inherent to DPP-based sampling.
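Two of the findings above, the bounded candidate set and the reduced cross-layer overlap, can be illustrated with a short sketch. It is not the paper's Algorithm 1 or its overlap metric: the function names are hypothetical, the rule "nodes at hop distance 2 up to max_hops, nearest first, capped at max_size" is an assumed construction, and Jaccard similarity is an assumed way to quantify overlap between the negatives of two layers.

```python
from collections import deque

def shortest_path_candidates(adj, source, max_hops, max_size):
    """Collect non-neighbour nodes by BFS distance from `source`.

    Assumed rule: keep nodes at hop distance 2..max_hops, nearest first,
    and stop once max_size candidates have been collected.
    """
    if max_size <= 0:
        return []
    dist = {source: 0}
    queue = deque([source])
    candidates = []
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the hop limit
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
                if dist[v] >= 2:  # first-order neighbours are positives, so skip them
                    candidates.append(v)
                    if len(candidates) == max_size:
                        return candidates
    return candidates

def jaccard_overlap(negatives_a, negatives_b):
    """Share of negatives common to two layers (0.0 when both sets are empty)."""
    a, b = set(negatives_a), set(negatives_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Path graph 0-1-2-3-4: node 1 is a neighbour of 0, nodes 2 and 3 are within 3 hops.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(shortest_path_candidates(adj, 0, max_hops=3, max_size=10))
print(jaccard_overlap([2, 3], [3, 4]))
```

Capping the candidate set matters because the DPP kernel is built over the candidates only, so the eigendecomposition runs on a matrix of size at most max_size rather than on the whole graph.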
This summary was generated by AI and reviewed by a human editor.