QUICK REVIEW

[Paper Review] Reinforcement Learning for Relation Classification from Noisy Data

Jun Feng, Minlie Huang | arXiv (Cornell University) | Aug 24, 2018
Text and Document Classification Technologies | 183 citations
TL;DR

The paper proposes a two-module model (instance selector via reinforcement learning and sentence-level relation classifier) to perform relation classification from noisy distant-supervision data, yielding better sentence-level performance than strong baselines.

ABSTRACT

Existing relation classification methods that rely on distant supervision assume that all sentences in a bag mentioning an entity pair describe a relation for that entity pair. Such methods, performing classification at the bag level, cannot identify the mapping between a relation and a sentence, and largely suffer from the noisy labeling problem. In this paper, we propose a novel model for relation classification at the sentence level from noisy data. The model has two modules: an instance selector and a relation classifier. The instance selector chooses high-quality sentences with reinforcement learning and feeds the selected sentences into the relation classifier, and the relation classifier makes sentence-level predictions and provides rewards to the instance selector. The two modules are trained jointly to optimize the instance selection and relation classification processes. Experimental results show that our model can deal with the noise of data effectively and obtains better performance for relation classification at the sentence level.

Motivation & Objective

  • Address noisy labeling in distant supervision for relation extraction by moving from bag-level to sentence-level prediction.
  • Introduce an instance selector trained with reinforcement learning to filter out noisy sentences before classification.
  • Jointly train the instance selector and a CNN-based relation classifier to maximize sentence-level accuracy and robustness to noise.
  • Demonstrate the effectiveness of sentence-level predictions and the ability to filter bags with all-noisy sentences.

Proposed method

  • Formulate instance selection as a reinforcement learning problem with state representation combining the current sentence, selected sentence set, and entity pair.
  • Use a policy network to decide whether to select each sentence, guided by a terminal reward based on the relation classifier’s likelihoods.
  • Adopt a CNN-based relation classifier that predicts p(r|x;Φ) for individual sentences using word and position embeddings.
  • Define a delayed reward at the end of each bag to optimize the quality of selected sentences, and pre-train before joint training.
  • Train the instance selector with a policy gradient method (REINFORCE), using target networks to stabilize learning.
  • Split the data into bags, one per entity pair, and compute rewards at the bag level; then merge the selected sentences across bags to train the CNN.
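
The selection loop described in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the policy is a plain logistic model over per-sentence features (the review says the state also includes the selected sentence set and the entity pair, which are omitted here), the terminal reward is assumed to be the mean log-likelihood that the classifier assigns to the selected sentences, and `fallback` is a hypothetical value used when nothing in the bag is selected. The names `select_prob`, `run_episode`, `terminal_reward`, and `reinforce_step` are made up for this sketch, and target networks and pre-training are left out.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def select_prob(weights, bias, features):
    """Probability of selecting a sentence under a logistic policy (assumed form)."""
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)


def run_episode(weights, bias, bag, rng):
    """Sample a select (1) / skip (0) action for every sentence in one bag.

    bag is a list of (features, log_likelihood) pairs, where log_likelihood
    stands in for log p(r | x; Phi) from the relation classifier.
    """
    actions, probs = [], []
    for features, _ in bag:
        p = select_prob(weights, bias, features)
        actions.append(1 if rng.random() < p else 0)
        probs.append(p)
    return actions, probs


def terminal_reward(bag, actions, fallback):
    """Delayed reward, computed once at the end of the bag.

    Assumption: mean classifier log-likelihood over the selected sentences;
    `fallback` (hypothetical) is returned when no sentence was selected.
    """
    chosen = [ll for (_, ll), a in zip(bag, actions) if a == 1]
    if not chosen:
        return fallback
    return sum(chosen) / len(chosen)


def reinforce_step(weights, bias, bag, actions, probs, reward, lr):
    """One REINFORCE update: theta += lr * reward * grad log pi(a | s)."""
    new_weights = list(weights)
    new_bias = bias
    for (features, _), a, p in zip(bag, actions, probs):
        grad_logit = a - p  # d/dz log pi(a | s) for a Bernoulli-logistic policy
        for i, f in enumerate(features):
            new_weights[i] += lr * reward * grad_logit * f
        new_bias += lr * reward * grad_logit
    return new_weights, new_bias


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy bag: the first sentence fits the label well, the second does not.
    bag = [([1.0, 0.0], -0.1), ([0.0, 1.0], -3.0)]
    weights, bias = [0.0, 0.0], 0.0
    actions, probs = run_episode(weights, bias, bag, rng)
    reward = terminal_reward(bag, actions, fallback=-1.5)
    weights, bias = reinforce_step(weights, bias, bag, actions, probs, reward, lr=0.1)
    print(actions, reward, weights, bias)
```

Because the reward arrives only after the whole bag has been processed, every action in the episode is scaled by the same scalar. In practice a baseline is often subtracted from the reward to reduce variance; the sketch omits it for brevity.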

Experimental results

Research questions

  • RQ1: Can sentence-level relation classification be effectively learned from noisy distant supervision data?
  • RQ2: Does an RL-based instance selector improve the quality of training data for a sentence-level CNN relation classifier?
  • RQ3: Is joint training of instance selector and relation classifier more effective than bag-level baselines for this task?
  • RQ4: Can the model handle bags where all sentences are noisy and filter them out?
  • RQ5: How does the RL-based approach compare to greedy or attention-based instance selection methods?

Key findings

  • CNN+RL outperforms CNN, CNN+Max, and CNN+ATT on sentence-level relation classification.
  • Training on data selected by the RL-based instance selector yields better performance than training on the original noisy data.
  • Sentence-level models outperform bag-level models for sentence-level prediction.
  • The instance selector can filter out bags in which every sentence is noisy.
  • Manual inspection shows the selector achieves 74% accuracy on sampled sentences (correct selections and rejections).
  • RL-based selection significantly surpasses greedy selection in this setup.


This review was created by AI and reviewed by human editors.