QUICK REVIEW

[Paper Review] Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs

Jiong Zhu, Yujun Yan | arXiv (Cornell University) | Jun 19, 2020

Advanced Graph Neural Networks | Cited by 265

ONE-SENTENCE SUMMARY

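The paper shows that many popular GNNs perform poorly under heterophily and proposes three designs (D1–D3), combined in H2GCN, that raise accuracy by up to 40% on synthetic and 27% on real heterophilous networks while remaining competitive under homophily.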

ABSTRACT

We investigate the representation power of graph neural networks in the semi-supervised node classification task under heterophily or low homophily, i.e., in networks where connected nodes may have different class labels and dissimilar features. Many popular GNNs fail to generalize to this setting, and are even outperformed by models that ignore the graph structure (e.g., multilayer perceptrons). Motivated by this limitation, we identify a set of key designs -- ego- and neighbor-embedding separation, higher-order neighborhoods, and combination of intermediate representations -- that boost learning from the graph structure under heterophily. We combine them into a graph neural network, H2GCN, which we use as the base method to empirically evaluate the effectiveness of the identified designs. Going beyond the traditional benchmarks with strong homophily, our empirical analysis shows that the identified designs increase the accuracy of GNNs by up to 40% and 27% over models without them on synthetic and real networks with heterophily, respectively, and yield competitive performance under homophily.

MOTIVATION AND OBJECTIVES

Investigate the representation power of GNNs in semi-supervised node classification under heterophily/low homophily.
Identify design principles to boost learning from graph structure without sacrificing performance under homophily.
Propose a unified model (H2GCN) that adapts to both heterophily and homophily and evaluate its effectiveness on synthetic and real networks.
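
To make "low homophily" concrete, one common measure is the edge homophily ratio: the fraction of edges whose two endpoints share a class label. The sketch below is a minimal illustration of that measure; the function name `edge_homophily_ratio` and the four-node toy graph are assumptions made for this example, not taken from the paper's code or data.

```python
def edge_homophily_ratio(edges, labels):
    """Fraction of edges whose two endpoints share a class label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy graph (illustrative values): a 4-cycle with two classes.
labels = [0, 0, 1, 1]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

h = edge_homophily_ratio(edges, labels)
print(h)  # 0.5: two of the four edges join same-class nodes
```

A ratio near 1 indicates strong homophily and a ratio near 0 indicates strong heterophily.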

PROPOSED METHOD

Identify three key designs for heterophily: ego- and neighbor-embedding separation (D1), higher-order neighborhoods (D2), and combination of intermediate representations (D3).
Theoretically justify each design and integrate them into the H2GCN framework.
Implement H2GCN in three stages: feature embedding (S1), neighborhood aggregation over the 1-hop and 2-hop neighborhoods N1 and N2 (S2), and a final representation built by concatenating the intermediate representations (S3).
Evaluate on synthetic and real networks across the spectrum of homophily, including ablation studies to quantify design contributions.
Compare against baseline GNNs and MLP to assess gains in heterophily and parity in homophily.
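
The three stages listed above can be sketched as a forward pass in NumPy. This is a simplified reading of the summary, not the authors' implementation: the function names, the ReLU in S1, the symmetric degree normalization, the definition of N2 as nodes exactly two hops away, and the random untrained weights are all assumptions made for illustration.

```python
import numpy as np


def sym_normalize(adj):
    """Return D^{-1/2} A D^{-1/2}; rows of isolated nodes stay zero."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg, dtype=float)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]


def h2gcn_forward(adj, x, w_embed, w_clf, num_rounds=2):
    """Hypothetical H2GCN-style forward pass (illustrative sketch)."""
    n = adj.shape[0]
    eye = np.eye(n, dtype=bool)
    # D1: drop self-loops so the ego embedding is never mixed into
    # the aggregated neighbor embeddings.
    a1 = (adj > 0) & ~eye
    # D2: nodes reachable in two steps, minus the ego and 1-hop neighbors.
    reach2 = (a1.astype(int) @ a1.astype(int)) > 0
    a2 = reach2 & ~a1 & ~eye
    a1n = sym_normalize(a1.astype(float))
    a2n = sym_normalize(a2.astype(float))

    r = np.maximum(x @ w_embed, 0.0)  # S1: feature embedding (ReLU assumed)
    reps = [r]
    for _ in range(num_rounds):
        # S2: aggregate N1 and N2 separately, then concatenate.
        r = np.concatenate([a1n @ r, a2n @ r], axis=1)
        reps.append(r)
    final = np.concatenate(reps, axis=1)  # S3 / D3: keep every round

    logits = final @ w_clf
    logits -= logits.max(axis=1, keepdims=True)  # numerically stable softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


# Toy usage on a 4-node path graph with random, untrained weights.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = rng.normal(size=(4, 3))
hidden, classes, rounds = 5, 2, 2
w_embed = rng.normal(size=(3, hidden))
# Each round doubles the width, so the final width is (2^(K+1) - 1) * hidden.
w_clf = rng.normal(size=((2 ** (rounds + 1) - 1) * hidden, classes))
probs = h2gcn_forward(adj, x, w_embed, w_clf, rounds)
print(probs.shape)  # (4, 2)
```

Because each round concatenates two aggregations, the representation width doubles per round; the classifier weight shape in the usage example accounts for that growth.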

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: How do GNNs perform on networks with varying levels of homophily/heterophily in semi-supervised node classification?
RQ2: Do ego- and neighbor-embedding separation, higher-order neighborhoods, and combination of intermediate representations improve learning under heterophily?
RQ3: Can a unified model (H2GCN) adapt to both heterophily and homophily and outperform existing GNNs across datasets?
RQ4: What is the empirical impact of each design component (D1–D3) on synthetic and real datasets?

KEY FINDINGS

Many popular GNNs degrade under heterophily and are sometimes outperformed by graph-agnostic MLPs.
Designs D1–D3 significantly boost learning from graph structure in heterophily, with ablations showing up to 40% gains on synthetic data.
H2GCN, combining D1–D3, achieves strong performance across the spectrum of homophily and outperforms several baselines in heterophily settings.
On real benchmarks with heterophily, models that use these designs outperform models without them by up to 27%.
Under heterophily, both higher-order neighborhoods (D2) and ego- and neighbor-embedding separation (D1) are reported as important, and combining intermediate representations (D3) further improves accuracy.
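
One intuition for why higher-order neighborhoods (D2) can help is that a graph whose direct neighbors disagree may still have 2-hop neighbors that agree. The toy example below illustrates this on a 6-node cycle with alternating labels; the graph, the helper names `two_hop_pairs` and `same_label_fraction`, and the extreme values it produces are constructed for illustration and are not results from the paper.

```python
def two_hop_pairs(n, edges):
    """Unordered node pairs that are exactly two hops apart."""
    nbrs = {v: set() for v in range(n)}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    pairs = set()
    for v in range(n):
        for u in nbrs[v]:
            for w in nbrs[u]:
                # Skip the ego itself and anything already a direct neighbor.
                if w != v and w not in nbrs[v]:
                    pairs.add((min(v, w), max(v, w)))
    return pairs


def same_label_fraction(pairs, labels):
    """Fraction of node pairs whose labels match."""
    return sum(labels[u] == labels[v] for u, v in pairs) / len(pairs)


# 6-node cycle with alternating classes: every edge joins different classes.
n = 6
labels = [i % 2 for i in range(n)]
edges = [(i, (i + 1) % n) for i in range(n)]

print(same_label_fraction(edges, labels))                    # 0.0
print(same_label_fraction(two_hop_pairs(n, edges), labels))  # 1.0
```

In this constructed case the 1-hop neighborhood carries no same-class neighbors at all, while the 2-hop neighborhood is entirely same-class; real networks fall between these extremes.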


This review was generated by AI and checked by a human editor.