QUICK REVIEW

[Paper Review] S-Net: From Answer Extraction to Answer Generation for Machine Reading Comprehension

Chuanqi Tan, Furu Wei|arXiv (Cornell University)|Jun 15, 2017
Topic Modeling | Cited by 67
One-Sentence Summary

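S-Net proposes an extraction-then-synthesis framework for MS-MARCO: it first predicts evidence spans from the passages, then generates the final answer with a sequence-to-sequence synthesis model that takes the evidence as features.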

ABSTRACT

In this paper, we present a novel approach to machine reading comprehension for the MS-MARCO dataset. Unlike the SQuAD dataset that aims to answer a question with exact text spans in a passage, the MS-MARCO dataset defines the task as answering a question from multiple passages and the words in the answer are not necessary in the passages. We therefore develop an extraction-then-synthesis framework to synthesize answers from extraction results. Specifically, the answer extraction model is first employed to predict the most important sub-spans from the passage as evidence, and the answer synthesis model takes the evidence as additional features along with the question and passage to further elaborate the final answers. We build the answer extraction model with state-of-the-art neural networks for single passage reading comprehension, and propose an additional task of passage ranking to help answer extraction in multiple passages. The answer synthesis model is based on the sequence-to-sequence neural networks with extracted evidences as features. Experiments show that our extraction-then-synthesis method outperforms state-of-the-art methods.

Research Motivation and Objectives

  • Motivate the MS-MARCO setting where answers may come from multiple passages and need synthesis beyond exact text spans.
  • Develop an extraction-then-synthesis framework that first extracts evidence spans and then synthesizes final answers.
  • Introduce a multi-task learning approach that includes passage ranking to enhance evidence extraction.
  • Leverage a sequence-to-sequence synthesis model that uses extracted evidences as features for final answer generation.
  • Demonstrate state-of-the-art performance on MS-MARCO compared to pure extraction and several baselines.

Proposed Method

  • Use bidirectional GRU-based encoders for the question and passages, with character-level embeddings.
  • Predict evidence snippets with a pointer network that outputs start and end positions over concatenated passages.
  • Implement passage ranking as a multi-task objective to improve evidence extraction.
  • Train an evidence extraction model with joint loss of evidence prediction and passage ranking.
  • Synthesize the final answer with a sequence-to-sequence model that conditions on question, passage, and extracted evidence positions as features.
  • Decode with beam search and apply post-processing to refine generated answers.
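
The extraction steps above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. It assumes the pointer network has already produced per-token start and end probabilities, that passage ranking yields a probability distribution over passages with a single gold passage, and that the two losses are combined with a weight `r`. The function names, the `max_len` cap, the default value of `r`, and the binary start/end encoding of the evidence features are all hypothetical choices made for illustration.

```python
import math
from typing import List, Tuple


def best_span(p_start: List[float], p_end: List[float],
              max_len: int = 30) -> Tuple[int, int]:
    """Pick (start, end) maximizing p_start[i] * p_end[j] with i <= j.

    A common span-decoding heuristic; max_len is an assumed cap on span length.
    """
    best, best_score = (0, 0), -1.0
    for i, ps in enumerate(p_start):
        for j in range(i, min(i + max_len, len(p_end))):
            score = ps * p_end[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best


def joint_loss(p_start: List[float], p_end: List[float],
               gold_span: Tuple[int, int],
               p_rank: List[float], gold_passage: int,
               r: float = 0.5) -> float:
    """Weighted sum of evidence-span NLL and passage-ranking NLL.

    r is a hypothetical mixing weight, not a value taken from this summary.
    """
    s, e = gold_span
    span_nll = -(math.log(p_start[s]) + math.log(p_end[e]))
    rank_nll = -math.log(p_rank[gold_passage])
    return r * span_nll + (1.0 - r) * rank_nll


def evidence_features(num_tokens: int,
                      span: Tuple[int, int]) -> List[Tuple[int, int]]:
    """Per-token (is_start, is_end) flags marking the extracted evidence.

    One assumed way to feed evidence positions to a synthesis model.
    """
    s, e = span
    return [(int(i == s), int(i == e)) for i in range(num_tokens)]


if __name__ == "__main__":
    p_start = [0.1, 0.7, 0.2]
    p_end = [0.2, 0.1, 0.7]
    span = best_span(p_start, p_end)
    print(span)
    print(joint_loss(p_start, p_end, span, [0.25, 0.75], gold_passage=1))
    print(evidence_features(len(p_start), span))
```

In this toy setup the synthesis model would receive the flags from `evidence_features` alongside the token embeddings, so the decoder can see where the extracted evidence starts and ends.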

Experimental Results

Research Questions

  • RQ1: How can MS-MARCO-style answers, which may draw on multiple spans and even on words that do not appear in the passages, be produced effectively?
  • RQ2: Does combining evidence extraction with synthesis improve answer quality over pure extraction or end-to-end generation?
  • RQ3: Does joint passage ranking improve extraction quality and downstream synthesis?
  • RQ4: Can a seq-to-seq generator effectively utilize extracted evidences to produce coherent final answers?

Key Findings

  • The extraction-then-synthesis framework outperforms pure extraction baselines and several competing methods on MS-MARCO in ROUGE-L and BLEU-1.
  • With extraction alone, an ensemble of extraction models reaches ROUGE-L of 42.92 and BLEU-1 of 44.97 on the test set.
  • The synthesis model with extracted evidences as features raises ROUGE-L further to 46.65, with BLEU-1 of 44.78 (S-Net*), which is near human performance on ROUGE-L.
  • Multi-task learning with passage ranking improves evidence extraction and overall ROUGE-L performance.
  • The approach significantly benefits questions where the answer requires synthesis from multiple evidences or even words from the question.
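
For readers unfamiliar with the ROUGE-L metric quoted above, the following is a minimal sketch of the standard sentence-level definition: an F-measure over the longest common subsequence (LCS) of candidate and reference tokens. It is not the official MS-MARCO evaluation script. Whitespace tokenization and the default `beta` value are assumptions made for illustration, and scores from this sketch lie in [0, 1] rather than on the 0 to 100 scale used in the findings.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: str, reference: str, beta: float = 1.2) -> float:
    """LCS-based F-measure; beta weights recall relative to precision (assumed value)."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


if __name__ == "__main__":
    print(rouge_l("the cat sat", "the cat sat on the mat"))
```

Because the score rewards in-order token overlap rather than exact span match, an answer synthesized from several evidence pieces can score well even when no single passage span matches the reference.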

This review was generated by AI and reviewed by human editors.