QUICK REVIEW

[Paper Review] Variational Graph Auto-Encoders

Thomas Kipf, Max Welling|arXiv (Cornell University)|Nov 21, 2016

Advanced Graph Neural Networks | References: 8 | Cited by: 897

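One-Sentence Summary

This paper introduces the Variational Graph Auto-Encoder (VGAE), a probabilistic framework for unsupervised learning on graph-structured data that pairs a graph convolutional encoder with an inner-product decoder for link prediction; incorporating node features improves performance.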

ABSTRACT

We introduce the variational graph auto-encoder (VGAE), a framework for unsupervised learning on graph-structured data based on the variational auto-encoder (VAE). This model makes use of latent variables and is capable of learning interpretable latent representations for undirected graphs. We demonstrate this model using a graph convolutional network (GCN) encoder and a simple inner product decoder. Our model achieves competitive results on a link prediction task in citation networks. In contrast to most existing models for unsupervised learning on graph-structured data and link prediction, our model can naturally incorporate node features, which significantly improves predictive performance on a number of benchmark datasets.

Motivation and Objectives

Develop a probabilistic latent variable model for unsupervised learning on undirected graphs.
Leverage a two-layer GCN to parameterize the variational posterior over latent node embeddings.
Train via a variational lower bound to learn meaningful latent representations for graphs.
Demonstrate improved link prediction performance, especially when node features are available.
Compare with baseline graph embedding methods and discuss effects of feature use and priors.

Proposed Method

Define a VGAE with one latent vector z_i per node and a Gaussian posterior q(z_i|X,A) whose mean and variance are produced by a two-layer GCN.
Use a generative model p(A|Z) where A_ij|z_i,z_j ~ Bernoulli( sigmoid(z_i^T z_j) ).
Optimize the variational lower bound L = E_{q(Z|X,A)}[log p(A|Z)] - KL[q(Z|X,A)||p(Z)], with p(Z)=N(0,I).
Train with the reparameterization trick and full-batch gradient descent.
Provide a non-probabilistic GAE variant that reconstructs A as sigmoid(Z Z^T), with Z computed directly by the GCN.
Evaluate both with node features and without them (X replaced by the identity matrix).
Compare VGAE/GAE against spectral clustering and DeepWalk baselines on link prediction.
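
The method steps above can be sketched as a single forward pass. The NumPy code below is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy 4-node graph, and the layer sizes (8 hidden units, 2 latent dimensions) are made up for the example. The ReLU activation, the first GCN layer shared between mean and log-std, and the symmetric normalization follow the usual GCN formulation. The lower bound is an unweighted, single-sample estimate over all node pairs, and the gradient step is omitted.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    A_tilde = A + np.eye(A.shape[0])           # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode(A, X, W0, W_mu, W_log_sigma):
    """Two-layer GCN giving the mean and log-std of q(z_i | X, A)."""
    A_hat = normalize_adjacency(A)
    H = np.maximum(A_hat @ X @ W0, 0.0)        # shared first layer with ReLU
    return A_hat @ H @ W_mu, A_hat @ H @ W_log_sigma

def sample_latent(mu, log_sigma, rng):
    """Reparameterization trick: Z = mu + sigma * eps, with eps ~ N(0, I)."""
    return mu + np.exp(log_sigma) * rng.standard_normal(mu.shape)

def decode(Z):
    """Inner-product decoder: edge probability sigmoid(z_i^T z_j)."""
    return sigmoid(Z @ Z.T)

def kl_to_standard_normal(mu, log_sigma):
    """KL[q(Z|X,A) || N(0, I)] for a diagonal Gaussian q, summed over nodes."""
    return -0.5 * np.sum(1.0 + 2.0 * log_sigma - mu**2 - np.exp(2.0 * log_sigma))

def lower_bound(A, A_prob, mu, log_sigma, tiny=1e-10):
    """Single-sample estimate of E_q[log p(A|Z)] - KL (unweighted, all pairs)."""
    log_lik = np.sum(A * np.log(A_prob + tiny)
                     + (1.0 - A) * np.log(1.0 - A_prob + tiny))
    return log_lik - kl_to_standard_normal(mu, log_sigma)

# Toy 4-node path graph in the featureless setting (X = identity).
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.eye(4)
W0 = 0.1 * rng.standard_normal((4, 8))          # illustrative sizes only
W_mu = 0.1 * rng.standard_normal((8, 2))
W_log_sigma = 0.1 * rng.standard_normal((8, 2))

mu, log_sigma = encode(A, X, W0, W_mu, W_log_sigma)
Z = sample_latent(mu, log_sigma, rng)
A_prob = decode(Z)
bound = lower_bound(A, A_prob, mu, log_sigma)
print(A_prob.shape, float(bound))
```

Dropping `sample_latent` and the KL term, and decoding `mu` directly, gives the non-probabilistic GAE variant. In practice the weights would be learned by maximizing `lower_bound` with an automatic-differentiation library.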

Experimental Results

Research Questions

RQ1: Can a variational approach learn meaningful latent embeddings for nodes in a graph in an unsupervised manner?
RQ2: Does incorporating node features X improve link prediction performance over featureless variants?
RQ3: How do VGAE and GAE compare to established baselines (spectral clustering, DeepWalk) on citation networks?
RQ4: How does a Gaussian prior on Z, combined with an inner-product decoder, affect performance?

Key Findings

Method	Cora AUC	Cora AP	Citeseer AUC	Citeseer AP	Pubmed AUC	Pubmed AP
SC (spectral clustering)	84.6±0.01	88.5±0.00	80.5±0.01	85.0±0.01	84.2±0.02	87.8±0.01
DW (DeepWalk)	83.1±0.01	85.0±0.00	80.5±0.02	83.6±0.01	84.4±0.00	84.1±0.00
GAE*	84.3±0.02	88.1±0.01	78.7±0.02	84.1±0.02	82.2±0.01	87.4±0.00
VGAE*	84.0±0.02	87.7±0.01	78.9±0.03	84.1±0.02	82.7±0.01	87.5±0.01
GAE	91.0±0.02	92.0±0.03	89.5±0.04	89.9±0.05	96.4±0.00	96.5±0.00
VGAE	91.4±0.01	92.6±0.01	90.8±0.02	92.0±0.02	94.4±0.02	94.7±0.02

VGAE and GAE achieve competitive results on link prediction in citation networks.
Incorporating node features significantly improves predictive performance across datasets.
Featureless variants (GAE*, VGAE*) score close to the SC and DW baselines but trail the feature-equipped models on every dataset.
With node features, GAE and VGAE outperform both baselines on every reported AUC and AP score for Cora, Citeseer, and Pubmed.
A Gaussian prior may be a poor match for an inner-product decoder, which leaves room for better-suited priors or more flexible generative models.
The models are trained with full-batch gradient descent and the reparameterization trick; future work includes scalability improvements.
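
The AUC and AP columns in the table above are ranking metrics over candidate node pairs. This summary does not spell out the evaluation protocol, so the sketch below assumes the common setup in which scores for held-out edges (positives) are compared with scores for sampled non-edges (negatives). The function names and example scores are hypothetical and chosen only to make the arithmetic easy to check.

```python
import numpy as np

def roc_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

def average_precision(pos_scores, neg_scores):
    """Mean of precision@k taken at the rank k of each positive pair."""
    scores = np.concatenate([pos_scores, neg_scores]).astype(float)
    labels = np.concatenate([np.ones(len(pos_scores)), np.zeros(len(neg_scores))])
    # Stable sort by descending score; tied scores keep input order,
    # so positives are ranked ahead of equally scored negatives.
    order = np.argsort(-scores, kind="stable")
    labels = labels[order]
    precision_at_k = np.cumsum(labels) / np.arange(1, len(labels) + 1)
    return float(np.sum(precision_at_k * labels) / labels.sum())

# Hypothetical decoder outputs sigmoid(z_i^T z_j) for three held-out edges
# and three sampled non-edges.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.1]
auc = roc_auc(pos, neg)            # 8 of 9 pairs ordered correctly
ap = average_precision(pos, neg)   # (1/1 + 2/2 + 3/4) / 3
print(round(auc, 4), round(ap, 4))
```

Both metrics depend only on the ordering of the scores, so they can be computed on `z_i^T z_j` directly without applying the sigmoid.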

This review was generated by AI and reviewed by human editors.