QUICK REVIEW

[Paper Review] FlowQA: Grasping Flow in History for Conversational Machine Comprehension

Hsin-Yuan Huang, Eunsol Choi|arXiv (Cornell University)|Oct 6, 2018

Topic Modeling | References: 22 | Cited by: 82

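One-Sentence Summary

FlowQA introduces the Flow mechanism, which carries intermediate representations from previous questions forward in conversational machine comprehension, yielding large F1 gains on CoQA and QuAC.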

ABSTRACT

Conversational machine comprehension requires the understanding of the conversation history, such as previous question/answer pairs, the document context, and the current question. To enable traditional, single-turn models to encode the history comprehensively, we introduce Flow, a mechanism that can incorporate intermediate representations generated during the process of answering previous questions, through an alternating parallel processing structure. Compared to approaches that concatenate previous questions/answers as input, Flow integrates the latent semantics of the conversation history more deeply. Our model, FlowQA, shows superior performance on two recently proposed conversational challenges (+7.2% F1 on CoQA and +4.0% on QuAC). The effectiveness of Flow also shows in other tasks. By reducing sequential instruction understanding to conversational machine comprehension, FlowQA outperforms the best models on all three domains in SCONE, with +1.8% to +4.4% improvement in accuracy.

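Motivation and Objectives

Advance research on conversational machine comprehension, which requires understanding the conversation history.
Propose Flow, which encodes history through the intermediate representations produced while answering previous questions.
Combine Flow with single-turn machine comprehension (MC) models, using an alternating parallel structure for efficiency.
Demonstrate performance gains on CoQA, QuAC, and the sequential instruction tasks in SCONE.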

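Proposed Method

Use Flow as a mechanism for passing intermediate context representations from one question turn to the next.
Develop the Integration-Flow (IF) layer, which alternates between context-driven processing and Flow-driven processing so that each step can be parallelized.
Use fully-aware attention and hierarchical question encoding (QHierRNN) to integrate history information.
Build FlowQA's reasoning and answer-prediction components on top of a single-turn MC model.
Train and evaluate on CoQA and QuAC with the standard F1 and HEQ metrics, and show improvements over the baselines.
Demonstrate Flow's applicability to sequential instruction understanding on SCONE by reducing that task to conversational MC.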
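
The alternation inside an IF layer can be illustrated with a minimal sketch. Everything here is a simplifying assumption rather than the paper's implementation: each context word at each turn is a single float, and the hypothetical `toy_rnn` (a fixed-weight tanh recurrence) stands in for the learned recurrent networks FlowQA uses. The sketch only shows which axis each step runs along: context integration runs over the words within one turn, Flow runs over the turns for one word.

```python
import math

def toy_rnn(seq, w_x=0.5, w_h=0.5):
    """Toy unidirectional recurrence over floats (hypothetical stand-in for a learned RNN)."""
    h, out = 0.0, []
    for x in seq:
        h = math.tanh(w_x * x + w_h * h)
        out.append(h)
    return out

def context_integration(c):
    """Recur along the context-word axis; each question turn is processed independently."""
    return [toy_rnn(turn) for turn in c]

def flow(c):
    """Recur along the turn axis; each context word is processed independently."""
    by_word = [list(col) for col in zip(*c)]    # transpose to [num_words][num_turns]
    flowed = [toy_rnn(col) for col in by_word]
    return [list(row) for row in zip(*flowed)]  # transpose back to [num_turns][num_words]

def integration_flow_layer(c):
    """One IF step: integrate over the context, then flow over turns.

    Returns, for every (turn, word) position, the pair
    (integrated feature, flow feature).
    """
    integrated = context_integration(c)
    flowed = flow(integrated)
    return [list(zip(i_row, f_row)) for i_row, f_row in zip(integrated, flowed)]

# Two question turns over a three-word context.
features = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
layer_out = integration_flow_layer(features)
```

Because the rows in `context_integration` and the columns in `flow` do not depend on one another, each step can be batched across turns or across words, which is the property the alternating structure relies on for its speedup.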

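Experimental Results

Research Questions

RQ1: How can historical reasoning signals be integrated into conversational machine comprehension more effectively than by simply concatenating previous QA pairs?
RQ2: How does Flow affect performance on conversational MC benchmarks (CoQA, QuAC) and on the related sequential instruction tasks?
RQ3: Does the alternating parallel IF architecture deliver a practical training speedup while preserving accuracy?
RQ4: How critical is Flow relative to FlowQA's other components, such as QHierRNN?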

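Key Findings

FlowQA improves F1 over previous models by 7.2 points on CoQA and by 4.0 points on QuAC.
On CoQA, FlowQA improves markedly across domains, and the FlowQA variants (2-Ans and All-Ans) show strong gains over the baselines.
Flow is a key component: removing it causes a marked drop on both QuAC and CoQA (more than 4 points in some cases).
The alternating IF architecture trains substantially faster than a naive Flow implementation (8.1x on CoQA, 4.2x on QuAC).
Flow also brings improvements on the SCONE sequential instruction domains, outperforming the previous best models.
In Table 1, FlowQA (1-Ans) reaches an overall F1 of 75.0 on CoQA, compared with 67.8 for BiDAF++ (3-ctx) and lower scores for the other baselines; human performance is 88.8.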
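
The F1 figures above measure word overlap between a predicted answer and a reference answer. The following is a minimal sketch of that computation for a single prediction and a single reference, assuming whitespace tokenization and lowercasing only. The official CoQA and QuAC evaluation scripts apply further normalization and compare against multiple references, so this hypothetical `token_f1` helper will not reproduce leaderboard numbers.

```python
from collections import Counter

def token_f1(prediction, gold):
    """Word-overlap F1 between one predicted answer and one reference answer."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# Prediction has one extra word: precision 2/3, recall 1.
score = token_f1("the white cat", "white cat")
```

A dataset-level score is then an average of per-question scores, which is why a gain such as 67.8 to 75.0 is reported in points of F1.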

This review was generated by AI and checked by a human editor.