QUICK REVIEW

[Paper Review] Improving Distantly Supervised Relation Extraction using Word and Entity Based Attention

Sharmistha Jat, Siddhesh Khandelwal|arXiv (Cornell University)|Apr 19, 2018
Topic Modeling|15 references|94 citations
TL;DR

The paper proposes word- and entity-attention models (BGWA and EA) for distantly supervised relation extraction, introduces a new GDS dataset with noise-reduced test data, and shows that an ensemble of BGWA, EA, and PCNN improves precision over baselines.

ABSTRACT

Relation extraction is the problem of classifying the relationship between two entities in a given sentence. Distant Supervision (DS) is a popular technique for developing relation extractors starting with limited supervision. We note that most of the sentences in the distant supervision relation extraction setting are very long and may benefit from word attention for better sentence representation. Our contributions in this paper are threefold. Firstly, we propose two novel word attention models for distantly-supervised relation extraction: (1) a Bi-directional Gated Recurrent Unit (Bi-GRU) based word attention model (BGWA), (2) an entity-centric attention model (EA), and (3) a combination model which combines multiple complementary models using weighted voting method for improved relation extraction. Secondly, we introduce GDS, a new distant supervision dataset for relation extraction. GDS removes test data noise present in all previous distant-supervision benchmark datasets, making credible automatic evaluation possible. Thirdly, through extensive experiments on multiple real-world datasets, we demonstrate the effectiveness of the proposed methods.

Motivation & Objective

  • Improve relation extraction under distant supervision by using attention to focus on the relevant sentence context.
  • Develop two novel attention-based models (BGWA and EA) to better capture informative words and entity-related cues.
  • Create a clean, credible evaluation dataset (GDS) by removing test-set noise to enable reliable automatic evaluation.
  • Demonstrate that model ensembling yields superior performance over individual models across datasets.

Proposed method

  • Introduce BGWA: Bi-GRU based word attention that computes word-level relevance to the target relation and applies piecewise max pooling.
  • Introduce EA: Entity-centric attention that weighs words by their relevance to each entity and uses PCNN with entity-attention pooling.
  • Combine BGWA, EA, and PCNN through a weighted voting ensemble with weights learned via linear regression on a development set.
  • Construct Google Distant Supervision (GDS), a dataset designed to reduce test-set noise by ensuring at least one sentence in each instance set expresses the assigned relation.
  • Evaluate models on two datasets (Riedel2010-b and GDS) using precision-recall curves and AUC for model selection.
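
The BGWA steps listed above (score each word, reweight it, then apply piecewise max pooling) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the review does not give the scoring function, so a dot product with a hypothetical `query` vector stands in for it, the random `hidden` array stands in for Bi-GRU outputs, and the segment boundaries around the two entity positions are an assumed convention.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def word_attention(hidden, query):
    """Reweight per-word states by attention.

    hidden: (T, d) array of per-word states (assumed to come from a Bi-GRU).
    query:  (d,) hypothetical relation query vector (an assumption of this sketch).
    Returns the reweighted states (T, d) and the attention weights (T,).
    """
    scores = hidden @ query          # one relevance score per word
    weights = softmax(scores)        # normalise scores to sum to 1
    return hidden * weights[:, None], weights

def piecewise_max_pool(states, e1_pos, e2_pos):
    """Max-pool three segments split at the two entity positions.

    Assumed convention: segment 1 ends at the first entity, segment 2 ends
    at the second entity, segment 3 is the remainder. Empty segments pool
    to zeros.
    """
    first, second = sorted((e1_pos, e2_pos))
    segments = [states[: first + 1],
                states[first + 1 : second + 1],
                states[second + 1 :]]
    d = states.shape[1]
    pooled = [seg.max(axis=0) if len(seg) else np.zeros(d) for seg in segments]
    return np.concatenate(pooled)    # shape (3 * d,)

# Usage with random stand-in data: 6 words, 4-dimensional states.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(6, 4))
query = rng.normal(size=4)
weighted, weights = word_attention(hidden, query)
sentence_vector = piecewise_max_pool(weighted, e1_pos=1, e2_pos=3)
```

The resulting `sentence_vector` would then feed a relation classifier; that stage is omitted here because the review does not describe it.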

Experimental results

Research questions

  • RQ1: Can word-level attention (BGWA) improve distantly supervised relation extraction by highlighting informative phrases?
  • RQ2: Can entity-centric attention (EA) improve relation extraction by focusing on entity-related context?
  • RQ3: Does combining multiple complementary models via a weighted ensemble outperform individual models in distantly supervised RE?
  • RQ4: Does a cleaned GDS dataset provide more reliable automatic evaluation for distantly supervised RE?
  • RQ5: What is the relative performance of BGWA and EA across datasets with differing noise and relation-set sizes?

Key findings

  • BGWA and EA achieve higher or competitive precision across recall ranges compared to state-of-the-art baselines on both datasets.
  • The ensemble of BGWA, EA, and PCNN yields further precision gains over the individual models (by roughly 2-3% on the Riedel2010-b dataset across recall ranges).
  • BGWA tends to perform better on Riedel2010-b, while EA performs better on GDS, indicating complementary strengths.
  • The attention models help identify key words and entity-related cues that align with the target relations, as illustrated by attention visualizations.
  • GDS provides credible automatic evaluation by ensuring each instance set contains at least one sentence expressing the assigned relation, mitigating test-set noise.
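
The weighted-voting ensemble of BGWA, EA, and PCNN is described in this review as using weights learned by linear regression on a development set. A minimal sketch of that idea is given below, under stated assumptions: each model emits one confidence score per candidate (instance set, relation) pair, the regression target is a 0/1 correctness label, and no intercept term is fitted. The function names and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_voting_weights(dev_scores, dev_labels):
    """Fit one weight per model by ordinary least squares.

    dev_scores: (n_models, n_examples) confidence scores on the dev set.
    dev_labels: (n_examples,) targets, e.g. 1.0 if the relation is correct.
    No intercept is fitted (an assumption of this sketch).
    """
    X = np.asarray(dev_scores, dtype=float).T      # (n_examples, n_models)
    y = np.asarray(dev_labels, dtype=float)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

def ensemble_score(scores, weights):
    """Combine per-model scores (n_models, n_examples) into one score per example."""
    return np.asarray(weights, dtype=float) @ np.asarray(scores, dtype=float)

# Usage with synthetic stand-in scores for three models (e.g. BGWA, EA, PCNN).
rng = np.random.default_rng(0)
dev_scores = rng.uniform(size=(3, 50))
true_weights = np.array([0.5, 0.3, 0.2])           # hypothetical ground truth
dev_labels = true_weights @ dev_scores             # noiseless synthetic target
learned = fit_voting_weights(dev_scores, dev_labels)
combined = ensemble_score(dev_scores, learned)
```

With real model outputs the targets would be noisy 0/1 labels, so the learned weights would only approximate each model's usefulness; ranking predictions by `combined` is what would produce the ensemble's precision-recall curve.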

This review was created by AI and reviewed by human editors.