QUICK REVIEW

[Paper Review] Dual Co-Matching Network for Multi-choice Reading Comprehension

Shuailiang Zhang, Hai Zhao|arXiv (Cornell University)|Jan 27, 2019
Topic Modeling | 40 references | 37 citations
TL;DR

The paper introduces a Dual Co-Matching Network (DCMN) that models the relations among passage, question, and answer bidirectionally and fuses them with a gating mechanism, achieving state-of-the-art results on RACE and ROCStories and even surpassing human performance on the full RACE dataset.

ABSTRACT

Multi-choice reading comprehension is a challenging task that requires complex reasoning procedure. Given passage and question, a correct answer need to be selected from a set of candidate answers. In this paper, we propose Dual Co-Matching Network (DCMN) which model the relationship among passage, question and answer bidirectionally. Different from existing approaches which only calculate question-aware or option-aware passage representation, we calculate passage-aware question representation and passage-aware answer representation at the same time. To demonstrate the effectiveness of our model, we evaluate our model on a large-scale multiple choice machine reading comprehension dataset (i.e. RACE). Experimental result show that our proposed model achieves new state-of-the-art results.

Motivation & Objective

  • Motivate multi-choice machine reading comprehension (MRC) and the need to model all pairwise relations among passage, question, and answer.
  • Propose a bidirectional co-matching framework to capture P-Q, P-A, and Q-A interactions.
  • Incorporate a gating mechanism to fuse bidirectional representations effectively.
  • Leverage BERT as the contextual encoder to enhance representation quality.
  • Demonstrate state-of-the-art performance on RACE and ROCStories datasets.

Proposed method

  • Encode passage, question, and candidate answers with BERT to obtain H^p, H^q, H^a.
  • Compute bidirectional matching for each pair (P,Q), (P,A), (Q,A) using the attention matrix G^{xy} in Eq. (1), obtaining matched representations such as S^p and S^a.
  • Fuse the two directional representations of each pair (e.g., M^{p} and M^{a}) with a gating mechanism to produce a single pair representation (e.g., M^{p_a}) via Eq. (2).
  • Concatenate M^{p_q}, M^{p_a}, M^{q_a} to form C and compute the final loss with a softmax over candidate answers using Eq. (3).
  • Train with the BertAdam optimizer, using dropout, 10 epochs of fine-tuning, and a maximum sequence length of 512.
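
The matching, gating, and scoring steps listed above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the review does not reproduce Eqs. (1) to (3), so the bilinear attention score, the ReLU projections, the max pooling over tokens, and the sigmoid gate are assumed forms. The names `co_match`, `gated_fuse`, and `score_options` and all weight names are hypothetical, the same weights are shared across the three pairs only for brevity, and random arrays stand in for the BERT outputs H^p, H^q, H^a.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def co_match(H_x, H_y, W_att, W_x, W_y):
    """Match two token sequences in both directions (assumed form of Eq. 1).

    H_x: (len_x, d), H_y: (len_y, d). Returns pooled vectors M_x, M_y of shape (d,).
    """
    G_xy = softmax(H_x @ W_att @ H_y.T, axis=-1)    # each x token attends over y
    G_yx = softmax(H_y @ W_att.T @ H_x.T, axis=-1)  # each y token attends over x
    S_x = np.maximum(G_xy @ H_y @ W_x, 0.0)         # y-aware representation of x
    S_y = np.maximum(G_yx @ H_x @ W_y, 0.0)         # x-aware representation of y
    return S_x.max(axis=0), S_y.max(axis=0)         # max pooling over tokens


def gated_fuse(M_x, M_y, W_gx, W_gy, b_g):
    """Blend the two directional vectors with a sigmoid gate (assumed form of Eq. 2)."""
    g = 1.0 / (1.0 + np.exp(-(M_x @ W_gx + M_y @ W_gy + b_g)))
    return g * M_x + (1.0 - g) * M_y


def score_options(H_p, H_q, H_options, params, v):
    """Return a probability for each candidate answer (assumed form of Eq. 3)."""
    logits = []
    for H_a in H_options:
        fused = []
        for H_x, H_y in [(H_p, H_q), (H_p, H_a), (H_q, H_a)]:
            M_x, M_y = co_match(H_x, H_y, params["W_att"], params["W_x"], params["W_y"])
            fused.append(gated_fuse(M_x, M_y, params["W_gx"], params["W_gy"], params["b_g"]))
        C = np.concatenate(fused)  # [M^{p_q}; M^{p_a}; M^{q_a}], shape (3d,)
        logits.append(C @ v)
    return softmax(np.array(logits))


# Toy usage: random stand-ins for encoder outputs, hidden size d = 8.
rng = np.random.default_rng(0)
d = 8
params = {name: rng.normal(scale=0.1, size=(d, d))
          for name in ["W_att", "W_x", "W_y", "W_gx", "W_gy"]}
params["b_g"] = np.zeros(d)
v = rng.normal(size=3 * d)

H_p = rng.normal(size=(20, d))                            # passage tokens
H_q = rng.normal(size=(6, d))                             # question tokens
H_options = [rng.normal(size=(4, d)) for _ in range(4)]   # four candidate answers

probs = score_options(H_p, H_q, H_options, params, v)
gold = 2                       # hypothetical index of the correct option
loss = -np.log(probs[gold])    # cross-entropy over the candidates
```

Because the gate output lies in (0, 1), each fused vector is an element-wise convex combination of the two directional vectors, whereas plain concatenation would keep both and double the dimensionality.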

Experimental results

Research questions

  • RQ1: Can a bidirectional, all-pairs matching approach improve accuracy for multi-choice MRC over unidirectional or partial-pair models?
  • RQ2: Does a gating-based fusion of bidirectional representations outperform simple concatenation in this setting?
  • RQ3: How does integrating a strong encoder like BERT affect performance on large-scale MRC benchmarks (RACE, ROCStories)?

Key findings

  • DCMN achieves state-of-the-art results on the RACE and ROCStories benchmarks.
  • A single DCMN model outperforms previous baselines and even surpasses human Turkers on the full RACE dataset.
  • Bidirectional matching across all P-Q, P-A, and Q-A pairs yields measurable gains over unidirectional matching (e.g., 1.5% on RACE).
  • Gated fusion of bidirectional representations yields better results than concatenation.
  • Using BERT as the encoder integrates cleanly with the matching layers and contributes notable gains across tasks.

This review was created by AI and reviewed by human editors.