QUICK REVIEW

[Paper Review] Learning Dynamic Knowledge Graphs to Generalize on Text-Based Games.

Ashutosh Adhikari, Xingdi Yuan | arXiv (Cornell University) | Feb 21, 2020

Topic Modeling | 19 citations

TL;DR

This paper proposes GATA, a graph-aided transformer agent that learns dynamic knowledge graphs end-to-end from raw text to improve planning and generalization in text-based games. By combining reinforcement and self-supervised learning, GATA outperforms text-only baselines by an average of 24.2% across 500+ TextWorld games, demonstrating superior policy convergence and generalization.

ABSTRACT

Playing text-based games requires skills in processing natural language and sequential decision making. Achieving human-level performance on text-based games remains an open challenge, and prior research has largely relied on hand-crafted structured representations and heuristics. In this work, we investigate how an agent can plan and generalize in text-based games using graph-structured representations learned end-to-end from raw text. We propose a novel graph-aided transformer agent (GATA) that infers and updates latent belief graphs during planning to enable effective action selection by capturing the underlying game dynamics. GATA is trained using a combination of reinforcement and self-supervised learning. Our work demonstrates that the learned graph-based representations help agents converge to better policies than their text-only counterparts and facilitate effective generalization across game configurations. Experiments on 500+ unique games from the TextWorld suite show that our best agent outperforms text-based baselines by an average of 24.2%.

Motivation & Objective

To overcome the limitations of hand-crafted representations and heuristics in text-based game agents.
To enable effective sequential decision-making and generalization across diverse game configurations.
To learn structured, dynamic knowledge graphs end-to-end from raw textual game descriptions.
To improve policy learning and planning performance through graph-structured belief representations.

Proposed method

GATA employs a graph-aided transformer architecture that infers and updates latent belief graphs during planning.
The agent uses self-supervised learning to pre-train on raw text sequences to build initial graph structures.
Reinforcement learning fine-tunes the agent on game-specific rewards, updating the graph based on observed transitions.
The belief graph captures entity relationships and game state dynamics, enabling better action selection.
Graph updates are differentiable, so the graph representation can be trained end-to-end with gradient-based learning instead of being built from hand-written rules.
The model integrates attention mechanisms over both text tokens and graph nodes to enhance contextual reasoning.
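
The update-then-attend loop described above can be illustrated with a toy sketch. This is not the paper's architecture: it assumes a flat list of edge strengths as the belief graph, a simple gated blend as the update rule, and random (unlearned) weights. All names here (update_belief_graph, node_features, attend, w_prop, w_gate, w_key) are hypothetical and chosen only for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def update_belief_graph(graph, obs_vec, w_prop, w_gate):
    """Gated blend of the previous graph with a proposal from the new observation.

    graph is a flat list of edge strengths, one per (relation, head, tail) triple.
    """
    new_graph = []
    for old, wp, wg in zip(graph, w_prop, w_gate):
        proposal = math.tanh(dot(wp, obs_vec))   # candidate edge strength in (-1, 1)
        gate = sigmoid(dot(wg, obs_vec))         # how much of the old value to overwrite
        new_graph.append((1.0 - gate) * old + gate * proposal)
    return new_graph

def node_features(graph, num_relations, num_entities):
    """Describe each entity by its outgoing edge strengths across all relations."""
    feats = []
    for head in range(num_entities):
        row = []
        for rel in range(num_relations):
            base = (rel * num_entities + head) * num_entities
            row.extend(graph[base:base + num_entities])
        feats.append(row)
    return feats

def attend(query, keys, values):
    """Scaled dot-product attention of one query vector over a set of nodes."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    summary = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return summary, weights

def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 0.3) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
R, N, D = 2, 4, 6                      # relations, entities, observation size (toy values)
w_prop = rand_matrix(R * N * N, D)     # random stand-ins for learned weights
w_gate = rand_matrix(R * N * N, D)
w_key = rand_matrix(D, R * N)
graph = [0.0] * (R * N * N)            # start with an empty belief graph

for step in range(3):
    # Stand-in for an encoded text observation; a real agent would encode game text.
    obs_vec = [random.gauss(0.0, 1.0) for _ in range(D)]
    graph = update_belief_graph(graph, obs_vec, w_prop, w_gate)
    feats = node_features(graph, R, N)
    keys = [[dot(row, f) for row in w_key] for f in feats]
    summary, weights = attend(obs_vec, keys, feats)
    state = obs_vec + summary          # concatenate text and graph views
    actions = rand_matrix(3, len(state))   # three hypothetical candidate actions
    action_scores = [dot(a, state) for a in actions]
    best_action = action_scores.index(max(action_scores))
```

The gate lets each edge keep its previous value or take the new proposal, which is one simple way to make a graph "dynamic" while keeping every step differentiable. The attention step then gives the action scorer a graph summary alongside the text observation.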

Experimental results

Research questions

RQ1: Can end-to-end learned dynamic knowledge graphs improve policy learning in text-based games?
RQ2: How does graph-structured representation enhance generalization across unseen game configurations?
RQ3: To what extent does combining self-supervised and reinforcement learning improve agent performance compared to text-only baselines?
RQ4: Can the agent maintain effective planning under dynamic and complex game environments using latent graphs?

Key findings

GATA outperforms text-only baselines by an average of 24.2% across 500+ games in the TextWorld suite.
The learned graph representations help agents converge to better policies than their text-only counterparts.
The structured, dynamic knowledge encoding facilitates effective generalization across game configurations.
Self-supervised pre-training on raw text enhances downstream reinforcement learning performance.
The dynamic graph updates allow the agent to adaptively model evolving game states and relationships.
The graph-aided approach leads to more robust and interpretable decision-making in complex text-based environments.
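
The review does not say how the 24.2% average is aggregated. One common reading is a mean of per-setting relative gains over the baseline, which the sketch below computes. The function names are hypothetical, and the scores are made-up placeholders, not results from the paper.

```python
def relative_improvement(agent_score, baseline_score):
    """Percentage gain of the agent over the baseline for one setting."""
    return 100.0 * (agent_score - baseline_score) / baseline_score

def average_relative_improvement(pairs):
    """Mean of per-setting relative gains, given (agent, baseline) score pairs."""
    gains = [relative_improvement(agent, base) for agent, base in pairs]
    return sum(gains) / len(gains)

# Placeholder (agent, baseline) scores for three hypothetical settings.
# These numbers are illustrative only and do NOT come from the paper.
pairs = [(60.0, 50.0), (45.0, 40.0), (33.0, 30.0)]
avg_gain = average_relative_improvement(pairs)
```

Under this reading, a setting with a low baseline score can dominate the average, so the headline number is best interpreted together with the per-setting results in the paper.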

This review was created by AI and reviewed by human editors.