QUICK REVIEW

[Paper Review] Pointer Graph Networks

Petar Veličković, Lars Buesing|arXiv (Cornell University)|Jun 11, 2020

Advanced Graph Neural Networks|16 citations

TL;DR

Pointer Graph Networks (PGNs) introduce a directly supervised method for learning dynamic, sparse graph structure: each node points to another node, and these learned pointer edges augment the input set or graph. This lets PGNs model pointer-based data structures such as disjoint set unions and link/cut trees. On dynamic graph connectivity tasks, PGNs generalize out-of-distribution to test inputs 5× larger than those seen in training, outperforming unrestricted GNNs and Deep Sets.

ABSTRACT

Graph neural networks (GNNs) are typically applied to static graphs that are assumed to be known upfront. This static input structure is often informed purely by insight of the machine learning practitioner, and might not be optimal for the actual task the GNN is solving. In absence of reliable domain expertise, one might resort to inferring the latent graph structure, which is often difficult due to the vast search space of possible graphs. Here we introduce Pointer Graph Networks (PGNs) which augment sets or graphs with additional inferred edges for improved model generalisation ability. PGNs allow each node to dynamically point to another node, followed by message passing over these pointers. The sparsity of this adaptable graph structure makes learning tractable while still being sufficiently expressive to simulate complex algorithms. Critically, the pointing mechanism is directly supervised to model long-term sequences of operations on classical data structures, incorporating useful structural inductive biases from theoretical computer science. Qualitatively, we demonstrate that PGNs can learn parallelisable variants of pointer-based data structures, namely disjoint set unions and link/cut trees. PGNs generalise out-of-distribution to 5x larger test inputs on dynamic graph connectivity tasks, outperforming unrestricted GNNs and Deep Sets.
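To make the "pointer-based data structure" in the abstract concrete, the following is a minimal sketch of a textbook disjoint set union (DSU) that records the parent-pointer array after every operation. This trace is the kind of intermediate state the pointing mechanism could be supervised on. The union-by-rank rule, the `trace` helper, and the "query, then union" operation format are assumptions made for illustration and are not taken from the paper.

```python
def find(parent, x):
    """Follow parent pointers to the root, then compress the path."""
    root = x
    while parent[root] != root:
        root = parent[root]
    while parent[x] != root:
        # Re-point x at the root, then continue with x's old parent.
        parent[x], x = root, parent[x]
    return root


def union(parent, rank, u, v):
    """Merge the sets of u and v (union by rank). Returns False if already merged."""
    ru, rv = find(parent, u), find(parent, v)
    if ru == rv:
        return False
    if rank[ru] < rank[rv]:
        ru, rv = rv, ru
    parent[rv] = ru
    if rank[ru] == rank[rv]:
        rank[ru] += 1
    return True


def trace(n, ops):
    """Hypothetical helper: for each (u, v), answer 'connected?' and then union.

    Returns a list of (answer, parent_snapshot) pairs, one per operation.
    """
    parent, rank, steps = list(range(n)), [0] * n, []
    for u, v in ops:
        connected = find(parent, u) == find(parent, v)
        union(parent, rank, u, v)
        steps.append((connected, list(parent)))
    return steps


steps = trace(4, [(0, 1), (2, 3), (1, 3), (0, 2)])
print([answer for answer, _ in steps])  # [False, False, False, True]
print(steps[-1][1])                     # [0, 0, 0, 2]
```

Each snapshot holds exactly one pointer per node, which is the sparsity the abstract refers to: the structure to be predicted has n entries rather than n² possible edges.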

Motivation & Objective

To address the limitation of static graph structures in GNNs by enabling dynamic, data-driven graph topology learning.
To improve generalization in algorithmic reasoning tasks by incorporating inductive biases from classical data structures.
To demonstrate that neural networks can learn and generalize complex, pointer-based algorithms such as disjoint set unions and link/cut trees.
To provide a supervised, sparse, and efficient method for latent graph structure inference that improves model expressiveness without sacrificing computational efficiency.

Proposed method

PGNs use a hybrid architecture combining encoder, processor, and decoder networks, with step-wise message passing over dynamically learned pointer edges.
At each time step, each node predicts a pointer to another node via a differentiable routing mechanism, forming a symmetric pointer adjacency matrix Π(t).
The processor network P uses the pointer matrix Π(t−1) as relational inductive bias to update node representations through message passing.
A masking mechanism identifies which nodes are modified at each step, allowing the model to focus computation on relevant entities.
The model is trained with direct supervision on intermediate data structure states (e.g., DSU or LCT configurations), enabling precise alignment with classical algorithmic behavior.
The decoder aggregates node representations via permutation-invariant readouts to produce predictions for set-level queries.
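The pointer update and message-passing steps described above can be sketched in plain Python. This is an illustrative reading of the summary, not the authors' implementation. It assumes that pointers are chosen by taking the highest-scoring target per node, that a mask value of 1 means "keep the previous pointer", and that messages are combined with an element-wise max. The function names and the scalar node features are hypothetical.

```python
def update_pointers(scores, mask, prev_ptr):
    """Pick one pointer per node.

    scores[i][j]: compatibility of node i with node j (e.g. attention logits).
    mask[i]:      1 keeps node i's previous pointer, 0 lets it re-point (assumed convention).
    prev_ptr[i]:  index that node i pointed to at the previous step.
    """
    n = len(scores)
    ptr = []
    for i in range(n):
        if mask[i]:
            ptr.append(prev_ptr[i])
        else:
            ptr.append(max(range(n), key=lambda j: scores[i][j]))
    return ptr


def symmetrize(ptr):
    """Build the symmetric pointer adjacency matrix: i~j if i points to j or j to i."""
    n = len(ptr)
    adj = [[0] * n for _ in range(n)]
    for i, j in enumerate(ptr):
        adj[i][j] = 1
        adj[j][i] = 1
    return adj


def message_pass(h, adj):
    """One max-aggregation step over pointer neighbours (scalar features for brevity)."""
    n = len(h)
    return [max([h[i]] + [h[j] for j in range(n) if adj[i][j]]) for i in range(n)]


scores = [[0.1, 0.9, 0.0],
          [0.2, 0.1, 0.7],
          [0.5, 0.3, 0.2]]
ptr = update_pointers(scores, mask=[0, 1, 0], prev_ptr=[0, 1, 2])
adj = symmetrize(ptr)
print(ptr)                                  # [1, 1, 0]
print(message_pass([1.0, 5.0, 3.0], adj))   # [5.0, 5.0, 3.0]
```

Node 1 is masked, so it keeps its self-pointer even though its scores favour node 2. Symmetrizing still connects it to node 0, which is why node 0 receives node 1's value in the message-passing step.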

Experimental results

Research questions

RQ1: Can neural networks learn and generalize complex, pointer-based data structures such as disjoint set unions and link/cut trees?
RQ2: Does direct supervision on intermediate data structure states improve generalization beyond static GNNs and Deep Sets?
RQ3: Can a differentiable, sparse, and dynamic graph structure be learned effectively for algorithmic reasoning tasks?
RQ4: To what extent does the use of inductive biases from theoretical computer science enhance model performance and generalization?
RQ5: Can PGNs generalize to input sizes significantly larger than seen during training?

Key findings

PGNs achieved an F1 score of 0.616 ± 0.009 on the largest link/cut tree test set (n = 100, ops = 150), outperforming all baselines including GNN (0.401 ± 0.123) and SupGNN (0.541 ± 0.059).
On dynamic graph connectivity tasks, PGNs generalized to 5× larger test inputs than seen during training, demonstrating strong out-of-distribution generalization.
The ablation study showed that PGN-MO, which uses only mask supervision, still outperformed all non-PGN models, suggesting that the pointer-based inductive bias itself, and not only direct pointer supervision, contributes to performance.
PGN-Asym, which uses asymmetric pointers, performed significantly worse than the symmetric PGN, confirming the empirical benefit of symmetrizing pointers to prevent structural disconnection.
When scaled to n = 200 and ops = 300, the PGN achieved an F1 score of 0.636 ± 0.009, comparable to the reported Oracle-Ptr baseline of 0.619 ± 0.043, demonstrating robustness to larger inputs.
The model successfully learned to produce valid, parallelizable data structures that deviate from ground-truth implementations while maintaining correctness.
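The last finding, that a pointer structure can differ from the ground truth and still be correct, can be illustrated with a small sketch. The two pointer arrays below are hypothetical examples, not structures reported in the paper: one is a deep chain, the other a flatter forest over the same two sets. Both answer every connectivity query identically, but the flatter one needs fewer pointer hops, which is what makes it friendlier to a fixed number of parallel message-passing steps.

```python
def root(ptr, x):
    """Follow pointers until reaching a node that points to itself."""
    while ptr[x] != x:
        x = ptr[x]
    return x


def same_set(ptr, u, v):
    """Connectivity query: do u and v share a root?"""
    return root(ptr, u) == root(ptr, v)


def depth(ptr, x):
    """Number of pointer hops from x to its root."""
    hops = 0
    while ptr[x] != x:
        x = ptr[x]
        hops += 1
    return hops


# Two hypothetical pointer forests over the sets {0, 1, 2, 3} and {4, 5}.
chain = [0, 0, 1, 2, 4, 4]  # 3 -> 2 -> 1 -> 0 and 5 -> 4
flat = [0, 0, 0, 0, 4, 4]   # every node points straight at its root

n = len(chain)
agree = all(same_set(chain, u, v) == same_set(flat, u, v)
            for u in range(n) for v in range(n))
print(agree)                                   # True
print(max(depth(chain, x) for x in range(n)))  # 3
print(max(depth(flat, x) for x in range(n)))   # 1
```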

This review was created by AI and reviewed by human editors.