QUICK REVIEW

[Paper Review] DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning

Wenhan Xiong, Thien Hoang | arXiv (Cornell University) | Jul 20, 2017

Advanced Graph Neural Networks | 23 references | 108 citations

TL;DR

Introduces a policy-based reinforcement learning framework (DeepPath) to learn multi-hop relational paths in large knowledge graphs, guided by a reward function balancing accuracy, diversity, and efficiency. It outperforms PRA and KG embedding methods on Freebase (FB15K-237) and NELL datasets.

ABSTRACT

We study the problem of learning to reason in large scale knowledge graphs (KGs). More specifically, we describe a novel reinforcement learning framework for learning multi-hop relational paths: we use a policy-based agent with continuous states based on knowledge graph embeddings, which reasons in a KG vector space by sampling the most promising relation to extend its path. In contrast to prior work, our approach includes a reward function that takes the accuracy, diversity, and efficiency into consideration. Experimentally, we show that our proposed method outperforms a path-ranking based algorithm and knowledge graph embedding methods on Freebase and Never-Ending Language Learning datasets.
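The step described in the abstract (a continuous state built from KG embeddings, followed by sampling a relation to extend the path) can be sketched roughly as below. This is a minimal illustration, not the authors' code. It assumes the state concatenates the current entity's embedding with its offset to the target entity. The linear scorer in `relation_probs` is a hypothetical stand-in for the policy network, and all function names are made up for this example.

```python
import math
import random
from typing import List, Sequence


def make_state(current: Sequence[float], target: Sequence[float]) -> List[float]:
    """Assumed state layout: current embedding followed by (target - current)."""
    return list(current) + [t - c for c, t in zip(current, target)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def relation_probs(state: Sequence[float],
                   weights: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical linear scorer: one weight row per relation (action)."""
    scores = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(scores)


def sample_relation(probs: Sequence[float], rng: random.Random) -> int:
    """Sample the index of the relation used to extend the path."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Sampling from the distribution, rather than always taking the highest-scoring relation, is what lets a policy-gradient agent explore alternative paths during training.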

Motivation & Objective

Motivate multi-hop reasoning in large knowledge graphs and address limitations of discrete-path methods like PRA.
Propose a policy-based RL agent operating in a continuous embedding space to discover informative relational paths.
Design a reward function that jointly optimizes accuracy, diversity, and efficiency of discovered paths.
Demonstrate scalability and empirical superiority over PRA and embedding methods on benchmark KG datasets.

Proposed method

Model the KG reasoning task as an MDP with continuous state representations derived from TransE-style embeddings.
Use a policy network to output a probability over all relations as actions at each step.
Train the policy with REINFORCE and a supervised pre-training phase inspired by imitation learning (randomized BFS paths).
Incorporate a reward function combining global accuracy (+1 if the target is reached, -1 otherwise), path-length-based efficiency (1/length), and diversity (the negative average cosine similarity with previously found paths).
Employ a bi-directional path-constrained search to verify learned reasoning formulas efficiently during evaluation.
Apply Adam optimization with L2 regularization for policy updates.
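The three reward terms listed above can be written out as a short sketch. The term definitions follow the descriptions in this review. The combination weights, the vector representation of a path (`path_vec`), and every function name are assumptions made for illustration, not values or APIs taken from the paper.

```python
import math
from typing import Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def global_reward(reached_target: bool) -> float:
    """+1 if the agent reached the target entity, -1 otherwise."""
    return 1.0 if reached_target else -1.0


def efficiency_reward(path_length: int) -> float:
    """1 / length, so shorter paths earn more."""
    if path_length < 1:
        raise ValueError("path_length must be at least 1")
    return 1.0 / path_length


def diversity_reward(path_vec: Sequence[float],
                     found_vecs: Sequence[Sequence[float]]) -> float:
    """Negative average cosine similarity to previously found paths."""
    if not found_vecs:
        return 0.0  # nothing to compare against yet
    sims = [cosine(path_vec, f) for f in found_vecs]
    return -sum(sims) / len(sims)


def total_reward(reached_target: bool,
                 path_length: int,
                 path_vec: Sequence[float],
                 found_vecs: Sequence[Sequence[float]],
                 weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum of the three terms; the weights are placeholders."""
    w_g, w_e, w_d = weights
    return (w_g * global_reward(reached_target)
            + w_e * efficiency_reward(path_length)
            + w_d * diversity_reward(path_vec, found_vecs))
```

For example, `total_reward(True, 2, [1.0, 0.0], [])` adds the success reward of 1 to an efficiency term of 0.5. The diversity term is 0 because no earlier paths exist to compare against.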

Experimental results

Research questions

RQ1: Can reinforcement learning over a KG embedding space learn reliable multi-hop reasoning paths?
RQ2: Does a reward function balancing accuracy, diversity, and efficiency improve path quality and learning efficiency compared to prior path-based methods?
RQ3: How does the RL-based DeepPath compare to PRA and KG embedding methods on standard KG datasets in link and fact prediction tasks?
RQ4: Do supervised pre-training and path verification via bi-directional search aid scalability and performance on large KGs?
RQ5: Are the discovered RL paths shorter and more diverse than those produced by traditional path-ranking or embedding approaches?

Key findings

The RL-based DeepPath outperforms PRA and embedding methods on FB15K-237 and NELL-995 for link prediction, as measured by MAP.
DeepPath uses far fewer reasoning paths per task than PRA while achieving better prediction performance.
A combination of global accuracy, efficiency, and diversity in the reward yields better qualitative and quantitative path quality.
Bi-directional path verification reduces search complexity and improves robustness when evaluating learned paths.
Supervised pre-training substantially aids RL convergence in large action spaces and improves early success rates (succ_10) during training.
On fact prediction tasks, DeepPath generally outperforms embedding baselines across most relations/datasets.
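Since the link prediction results above are reported as MAP, a short sketch of how mean average precision is commonly computed may help. It assumes each query is given as a list of booleans in ranked order (best candidate first), with `True` marking a correct answer. The function names are illustrative and not taken from the paper's evaluation code.

```python
from typing import Sequence


def average_precision(ranked_labels: Sequence[bool]) -> float:
    """Mean of precision@k taken at each rank k that holds a correct answer."""
    hits = 0
    precisions = []
    for rank, is_correct in enumerate(ranked_labels, start=1):
        if is_correct:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(queries: Sequence[Sequence[bool]]) -> float:
    """Average of per-query average precision; 0.0 for an empty query set."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)
```

A ranking that places correct answers near the top scores close to 1, so a higher MAP means the method ranks true tail entities above the corrupted candidates more consistently.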

This review was created by AI and reviewed by human editors.