QUICK REVIEW

[Paper Review] Wizard of Wikipedia: Knowledge-Powered Conversational agents

Emily Dinan, Stephen Roller|arXiv (Cornell University)|Nov 3, 2018

Topic Modeling|References: 22|Cited by: 479

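ONE-SENTENCE SUMMARY

This paper introduces the Transformer Memory Network architecture, which retrieves and reads Wikipedia knowledge and grounds open-domain dialogue in it, and releases a large knowledge-grounded dialogue dataset for training and evaluation.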

ABSTRACT

In open-domain dialogue intelligent agents should exhibit the use of knowledge, however there are few convincing demonstrations of this to date. The most popular sequence to sequence models typically "generate and hope" generic utterances that can be memorized in the weights of the model when mapping from input utterance(s) to output, rather than employing recalled knowledge as context. Use of knowledge has so far proved difficult, in part because of the lack of a supervised learning benchmark task which exhibits knowledgeable open dialogue with clear grounding. To that end we collect and release a large dataset with conversations directly grounded with knowledge retrieved from Wikipedia. We then design architectures capable of retrieving knowledge, reading and conditioning on it, and finally generating natural responses. Our best performing dialogue models are able to conduct knowledgeable discussions on open-domain topics as evaluated by automatic metrics and human evaluations, while our new benchmark allows for measuring further improvements in this important research direction.

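MOTIVATION AND GOALS

Motivate and study open-domain dialogue that retrieves knowledge from a large text resource and grounds its responses in that knowledge.
Create a large, publicly available knowledge-grounded dialogue dataset linked to Wikipedia.
Develop architectures that retrieve knowledge, read it, and condition on it to generate engaging responses.
Evaluate knowledge grounding and engagingness with both automatic metrics and human evaluation.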

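PROPOSED METHOD

Use an information retrieval step to fetch a small set of candidate knowledge passages from Wikipedia, based on the topic and the dialogue history.
Encode the knowledge sentences and the dialogue context with a Transformer encoder, then attend over the memory to form a context-aware representation.
Provide both a retrieval-based and a generative dialogue model (Retrieval Transformer Memory Network and Generative Transformer Memory Network) that select knowledge and produce a response.
In the two-stage variant, the knowledge selection and response generation components are separate; in the end-to-end variant, knowledge and dialogue are encoded jointly for generation.
Apply knowledge dropout to improve robustness when knowledge selection is imperfect.
Pretrain components on a large corpus (such as Reddit) and, where needed, fine-tune on SQuAD-like tasks to improve retrieval and grounding performance.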
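The memory-attention and knowledge-dropout steps above can be sketched in a few lines. This is only an illustrative toy, not the paper's implementation: the hand-written vectors stand in for Transformer encoder outputs, the function names `attend_over_memory` and `knowledge_dropout` are hypothetical, and zeroing out all knowledge vectors is an assumed form of knowledge dropout.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_memory(context_vec, knowledge_vecs):
    """Attend from the dialogue context over encoded knowledge sentences.

    Returns the attention weights, the index of the highest-weighted
    sentence, and a context-aware vector (context + weighted memory sum).
    """
    weights = softmax([dot(context_vec, k) for k in knowledge_vecs])
    selected = max(range(len(weights)), key=lambda i: weights[i])
    memory = [sum(w * k[d] for w, k in zip(weights, knowledge_vecs))
              for d in range(len(context_vec))]
    combined = [c + m for c, m in zip(context_vec, memory)]
    return weights, selected, combined


def knowledge_dropout(knowledge_vecs, drop_prob, rng):
    """Assumed form of knowledge dropout: with probability drop_prob,
    zero out all knowledge vectors for this training example."""
    if rng.random() < drop_prob:
        return [[0.0] * len(k) for k in knowledge_vecs]
    return knowledge_vecs


# Toy, hand-written vectors standing in for Transformer encoder outputs.
context = [1.0, 0.0]
knowledge = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

weights, selected, combined = attend_over_memory(context, knowledge)
print(selected)  # index of the knowledge sentence with the highest weight

# Force dropout (drop_prob=1.0) to see the knowledge-free case.
dropped = knowledge_dropout(knowledge, drop_prob=1.0, rng=random.Random(0))
```

When the knowledge is dropped, the weighted memory sum is zero and the combined vector reduces to the context alone, which is the situation the model is meant to tolerate when knowledge selection fails.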

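EXPERIMENTAL RESULTS

Research Questions

RQ1: Can a knowledge-grounded dialogue model effectively retrieve Wikipedia passages and ground on them to produce engaging responses?
RQ2: How do the retrieval-based and generative Transformer Memory Network architectures compare in knowledge grounding and dialogue quality?
RQ3: How do explicit knowledge supervision and knowledge dropout affect grounding and generation?
RQ4: How much does the publicly released, large-scale Wizard of Wikipedia dataset advance knowledge-grounded open-domain dialogue?
RQ5: How does model performance differ between seen topics and unseen topics or knowledge?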

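Main Findings

Retrieval-based models consistently outperform baseline models in knowledge grounding and engagingness; the Transformer with memory achieves strong Recall@1 on the automatic metrics and strong Wiki F1 in human evaluation.
Knowledge-conditioned generative models outperform baselines that lack knowledge when knowledge is available; on some metrics the end-to-end variant even outperforms the two-stage variant.
Knowledge supervision and knowledge dropout improve robustness and overall performance, with the two-stage model benefiting from a strong knowledge selection module.
Human evaluation shows that retrieval-based models score higher on engagingness, while knowledge-equipped generative models score higher on Wiki F1 (knowledge overlap with Wikipedia).
The Wizard of Wikipedia dataset (22,311 dialogues, 201,999 turns) supports robust training and evaluation of knowledge-grounded dialogue systems.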
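For readers unfamiliar with the two metrics named in these findings, the sketch below shows one common way to compute them. It is an approximation under stated assumptions, not the paper's evaluation code: whitespace tokenization with lowercasing is assumed, the function names are hypothetical, and the example strings are made up. Wiki F1 is treated here as a unigram F1 between a model utterance and Wikipedia knowledge text; Recall@1 is the fraction of examples whose top-ranked candidate is the gold one.

```python
from collections import Counter


def unigram_f1(prediction, reference):
    """Unigram F1 between two strings (assumes whitespace tokenization)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def recall_at_1(ranked_candidates, gold):
    """Fraction of examples whose first-ranked candidate equals the gold one."""
    if not gold:
        return 0.0
    hits = sum(1 for ranked, g in zip(ranked_candidates, gold)
               if ranked and ranked[0] == g)
    return hits / len(gold)


# Made-up examples for illustration only.
f1 = unigram_f1("the cat", "the dog")          # one shared token out of two
r1 = recall_at_1([["a", "b"], ["c", "d"]], ["a", "d"])  # first hit, second miss
```

In this toy case `f1` is 0.5 (precision and recall are both 1/2) and `r1` is 0.5 (one of two top-ranked candidates is correct).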


This review was generated by AI and reviewed by a human editor.