QUICK REVIEW

[Paper Review] NodePiece: Compositional and Parameter-Efficient Representations of Large Knowledge Graphs

Mikhail Galkin, Étienne Denis|arXiv (Cornell University)|Jun 22, 2021

Advanced Graph Neural Networks|23 citations

TL;DR

NodePiece proposes a parameter-efficient, anchor-based method for learning compositional node embeddings in large knowledge graphs by representing entities as sequences of subword-like units (anchors and relations), enabling inductive representation learning with significantly fewer parameters. It achieves competitive performance on link prediction, node classification, and relation prediction while using less than 10% of nodes as anchors and reducing parameters by up to 70x compared to standard models.

ABSTRACT

Conventional representation learning algorithms for knowledge graphs (KG) map each entity to a unique embedding vector. Such a shallow lookup results in a linear growth of memory consumption for storing the embedding matrix and incurs high computational costs when working with real-world KGs. Drawing parallels with subword tokenization commonly used in NLP, we explore the landscape of more parameter-efficient node embedding strategies with possibly sublinear memory requirements. To this end, we propose NodePiece, an anchor-based approach to learn a fixed-size entity vocabulary. In NodePiece, a vocabulary of subword/sub-entity units is constructed from anchor nodes in a graph with known relation types. Given such a fixed-size vocabulary, it is possible to bootstrap an encoding and embedding for any entity, including those unseen during training. Experiments show that NodePiece performs competitively in node classification, link prediction, and relation prediction tasks while retaining less than 10% of explicit nodes in a graph as anchors and often having 10x fewer parameters. To this end, we show that a NodePiece-enabled model outperforms existing shallow models on a large OGB WikiKG 2 graph having 70x fewer parameters.

Motivation & Objective

To address the high memory and computational cost of conventional knowledge graph embedding models that scale linearly with the number of entities.
To enable inductive representation learning for unseen entities at inference time, overcoming limitations of standard lookup-based embedding methods.
To draw inspiration from subword tokenization in NLP to create a fixed-size, parameter-efficient vocabulary for large-scale knowledge graphs.
To reduce the parameter budget by replacing entity-specific embeddings with a composition of fixed-size atomic units (anchors and relations).
To enable scalable and generalizable representation learning on large, real-world knowledge graphs such as Wikidata and OGB WikiKG2.

Proposed method

NodePiece constructs a fixed-size vocabulary of anchor nodes and relation types, where each entity is encoded as a sequence of its k nearest anchors and m surrounding relations.
It uses a hashing mechanism to map each node to a sequence of anchor and relation tokens that is unique in the ideal case, and a learnable encoder function composes that sequence into the node's representation.
The encoder function, such as an MLP or Transformer, maps the token sequence to a d-dimensional embedding, with the overall parameter budget determined by the vocabulary size and encoder complexity.
The method supports inductive learning by allowing new, unseen entities to be embedded using the same fixed vocabulary and encoder, without retraining.
It leverages relational context and anchor distances to improve hash uniqueness and representation diversity, reducing collision risks.
The approach is compatible with any downstream model, such as RotatE or CompGCN, and can be trained end-to-end with standard link prediction and node classification objectives.
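
The tokenization step described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (bfs_distances, tokenize), the toy triples, the undirected view of the graph for distance computation, the use of outgoing edges for relational context, and the deterministic alphabetical truncation to m relations are all hypothetical choices made for the example.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def tokenize(node, triples, anchors, k=2, m=2):
    """Return (anchor tokens, anchor distances, relation tokens) for a node.

    Assumptions for this sketch: distances ignore edge direction, and the
    relational context is the set of relation types on outgoing edges.
    """
    adj = {}
    for h, _, t in triples:
        adj.setdefault(h, set()).add(t)
        adj.setdefault(t, set()).add(h)
    dist = bfs_distances(adj, node)
    # k nearest reachable anchors; ties are broken by name for determinism
    nearest = sorted((dist[a], a) for a in anchors if a in dist)[:k]
    # at most m relation types, truncated alphabetically (an assumption)
    rels = sorted({r for h, r, _ in triples if h == node})[:m]
    return [a for _, a in nearest], [d for d, _ in nearest], rels

triples = [
    ("berlin", "capital_of", "germany"),
    ("germany", "member_of", "eu"),
    ("paris", "capital_of", "france"),
    ("france", "member_of", "eu"),
    ("berlin", "twinned_with", "paris"),
]
tokens = tokenize("berlin", triples, anchors=["eu", "paris"], k=2, m=2)
print(tokens)  # (['paris', 'eu'], [1, 2], ['capital_of', 'twinned_with'])
```

An encoder such as an MLP or Transformer would then look up embeddings for these tokens and distances and map them to a single d-dimensional vector. Because the function only needs the node's edges and the fixed anchor list, the same call applies to an entity that was absent during training.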

Experimental results

Research questions

RQ1: Can a fixed-size, anchor-based vocabulary of subunit entities enable parameter-efficient and generalizable representation learning in large knowledge graphs?
RQ2: To what extent can compositional node representations based on anchors and relations outperform standard lookup-based embeddings in terms of parameter efficiency and performance?
RQ3: How well does NodePiece generalize to unseen entities during inductive inference, especially in large-scale knowledge graphs?
RQ4: What is the impact of incorporating relational context and anchor distances on the uniqueness and quality of node representations?
RQ5: Can NodePiece achieve competitive performance on link prediction, node classification, and relation prediction tasks while reducing model parameters by orders of magnitude?

Key findings

NodePiece achieves competitive performance on node classification, link prediction, and relation prediction tasks, even when using only 1,000 anchors and 500 relation types.
On the OGB WikiKG2 dataset, NodePiece with 10k anchors and 74 relations achieved a Hits@10 of 0.997 on link prediction, outperforming standard models with 78M parameters.
The method reduces the parameter count by up to 70x compared with shallow embedding models on OGB WikiKG2. For a sense of scale, a shallow model such as PyTorch-BigGraph stores a 78M × 200 embedding matrix for a graph with 78M entities.
NodePiece with a 1k-anchored vocabulary and simple MLP encoder achieved a Hits@10 of 0.971 on FB15k-237, comparable to models with 15k entity embeddings.
The model generalizes well to inductive settings: the vocabulary size is independent of graph size, and even when entities are encoded from relation tokens alone, without anchor nodes, performance remains strong on dense, relation-rich graphs.
Ablation studies show that removing relational context or anchor distances leads to performance drops, confirming their importance in improving representation uniqueness and quality.
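
The parameter-efficiency argument behind these findings comes down to simple arithmetic, sketched below. The helper names and the encoder shape (a two-layer MLP over k = 10 anchor tokens and m = 5 relation tokens with a hidden size of 400) are assumptions chosen for illustration and are not configurations reported in the paper. Only the 78M × 200 matrix and the 1,000-anchor, 500-relation vocabulary are taken from the text above.

```python
def entity_table_params(num_entities, dim):
    """Parameters in a shallow lookup table: one row per entity."""
    return num_entities * dim

def mlp_params(in_dim, hidden_dim, out_dim):
    """Weights and biases of a two-layer MLP (hypothetical encoder shape)."""
    return in_dim * hidden_dim + hidden_dim + hidden_dim * out_dim + out_dim

def nodepiece_params(num_anchors, num_relations, dim, encoder_params):
    """Fixed vocabulary of anchor and relation tokens plus the encoder."""
    return (num_anchors + num_relations) * dim + encoder_params

dim = 200
shallow = entity_table_params(78_000_000, dim)

# Illustrative encoder: concatenate k + m token embeddings, project to dim.
k, m, hidden = 10, 5, 400
encoder = mlp_params((k + m) * dim, hidden, dim)
compositional = nodepiece_params(1_000, 500, dim, encoder)

print(shallow)        # 15600000000
print(compositional)  # 1580600
```

The second number does not depend on the number of entities in the graph, which is the property the review refers to as a fixed-size vocabulary. The exact reduction factor in practice depends on the anchor count, the encoder, and the downstream model.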

This review was created by AI and reviewed by human editors.