QUICK REVIEW

[Paper Review] Policy-Driven Neural Response Generation for Knowledge-Grounded Dialogue Systems

Behnam Hedayatnia, Karthik Gopalakrishnan|arXiv (Cornell University)|May 26, 2020

Topic Modeling | Cited by 10

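Summary

This paper proposes a policy-driven neural response generation framework for knowledge-grounded dialogue systems, in which an action plan containing knowledge sentences, dialogue acts, and topic information guides response generation. Conditioning a sequence-to-sequence model on this plan improves the relevance and controllability of responses; in both automated and human evaluation, the sentence-level policy outperforms the turn-level policy and the baseline models.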

ABSTRACT

Open-domain dialogue systems aim to generate relevant, informative and engaging responses. Seq2seq neural response generation approaches do not have explicit mechanisms to control the content or style of the generated response, and frequently result in uninformative utterances. In this paper, we propose using a dialogue policy to plan the content and style of target responses in the form of an action plan, which includes knowledge sentences related to the dialogue context, targeted dialogue acts, topic information, etc. The attributes within the action plan are obtained by automatically annotating the publicly released Topical-Chat dataset. We condition neural response generators on the action plan which is then realized as target utterances at the turn and sentence levels. We also investigate different dialogue policy models to predict an action plan given the dialogue context. Through automated and human evaluation, we measure the appropriateness of the generated responses and check if the generation models indeed learn to realize the given action plans. We demonstrate that a basic dialogue policy that operates at the sentence level generates better responses in comparison to turn level generation as well as baseline models with no action plan. Additionally the basic dialogue policy has the added effect of controllability.

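Research Motivation and Objectives

Address the lack of content and style control in existing sequence-to-sequence dialogue models, which often generate uninformative responses.
Develop a dialogue policy that plans the content and style of a response as an action plan covering knowledge, dialogue acts, and topics.
Condition neural response generators on the action plan to improve response quality at both the sentence level and the turn level.
Evaluate whether the models actually realize the given action plan, and whether the policy design (sentence level vs. turn level) affects performance.
Show, through automated and human evaluation, that response appropriateness and controllability improve.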

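Proposed Method

Automatically annotate the Topical-Chat dataset to extract the action plan attributes: knowledge sentences, dialogue acts, and topic information.
Design dialogue policy models that predict an action plan from the dialogue context, with variants operating at the sentence level and at the turn level.
Condition a sequence-to-sequence neural response generator on the action plan so that the generated response is relevant to the context and follows the planned style.
Use the action plan as a control mechanism that steers generation at the turn (utterance) level and at the sentence level.
Train the generation models, then assess response quality and the degree of plan realization with automated metrics and human evaluation.
Compare the sentence-level and turn-level policy models against baseline models that use no action plan.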
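The conditioning step described above can be illustrated with a minimal sketch. It assumes the action plan is flattened into a text prefix that is prepended to the dialogue history before being fed to the generator. The class names, the marker tokens (`<act>`, `<topic>`, `<knowledge>`, `<context>`, `<turn>`, `<sep>`), and the `build_inputs` helper are all hypothetical and chosen for illustration; the summary does not specify how the paper encodes the plan.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SentencePlan:
    """Plan for one response sentence (field names are illustrative assumptions)."""
    dialogue_act: str
    topic: Optional[str] = None
    knowledge: Optional[str] = None


@dataclass
class ActionPlan:
    """An action plan is an ordered list of per-sentence plans."""
    sentences: List[SentencePlan] = field(default_factory=list)


def serialize_sentence(plan: SentencePlan) -> str:
    """Flatten one sentence plan into a marker-delimited string (hypothetical markers)."""
    parts = [f"<act> {plan.dialogue_act}"]
    if plan.topic:
        parts.append(f"<topic> {plan.topic}")
    if plan.knowledge:
        parts.append(f"<knowledge> {plan.knowledge}")
    return " ".join(parts)


def build_inputs(context: List[str], plan: ActionPlan, level: str = "sentence") -> List[str]:
    """Build generator inputs.

    level="turn":     one input carrying the whole plan; the full turn is generated at once.
    level="sentence": one input per planned sentence; each sentence is generated separately.
    """
    history = " <turn> ".join(context)
    if level == "turn":
        joined = " <sep> ".join(serialize_sentence(s) for s in plan.sentences)
        return [f"{joined} <context> {history}"]
    if level == "sentence":
        return [f"{serialize_sentence(s)} <context> {history}" for s in plan.sentences]
    raise ValueError(f"unknown level: {level!r}")
```

With a two-sentence plan, `build_inputs(context, plan, "sentence")` returns two inputs while `build_inputs(context, plan, "turn")` returns one. The baseline without an action plan corresponds to passing the dialogue history alone.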

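Experimental Results

Research Questions

RQ1: Does a dialogue policy that plans content and style through an action plan improve the quality of generated responses in knowledge-grounded dialogue systems?
RQ2: Does conditioning response generation on a structured action plan yield more appropriate and more informative responses than a baseline sequence-to-sequence model?
RQ3: Is a sentence-level policy more effective than a turn-level policy at guiding response generation and improving response quality?
RQ4: To what extent do the neural response generators learn to realize the attributes specified in the action plan?
RQ5: To what extent does including knowledge, dialogue acts, and topic information in the action plan affect the controllability and relevance of responses?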

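Key Findings

The sentence-level dialogue policy outperforms both turn-level generation and the baseline models without an action plan on response appropriateness.
Introducing an action plan makes responses more informative and more relevant to the context, as confirmed by both automated and human evaluation.
The models are more controllable: generated responses align more closely with the knowledge, dialogue acts, and topics specified in the action plan.
Human evaluation confirms that responses generated from an action plan are judged more natural and more engaging than baseline responses.
The action plan guides response generation effectively at both the sentence level and the turn level, with sentence-level planning performing better.
The generators learn to realize the attributes in the action plan, which indicates that they pick up the control signals supplied by the policy.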
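To make "realizing the action plan" concrete, the sketch below shows one way such a check could be scored. It is an illustration only: this summary does not state which metrics the paper uses, and both function names are hypothetical. The first function assumes a dialogue act tagger has already labeled each generated sentence; the second uses simple word overlap as a rough proxy for knowledge use.

```python
from typing import List


def act_realization_rate(planned: List[str], realized: List[str]) -> float:
    """Fraction of sentences whose tagged dialogue act matches the planned act.

    `realized` is assumed to come from a dialogue act tagger run on the
    generated sentences (the tagger itself is outside this sketch).
    """
    if len(planned) != len(realized):
        raise ValueError("planned and realized must have the same length")
    if not planned:
        return 0.0
    matches = sum(p == r for p, r in zip(planned, realized))
    return matches / len(planned)


def knowledge_overlap(knowledge: str, response: str) -> float:
    """Share of distinct knowledge-sentence words that also appear in the response.

    Whitespace tokenization and lowercasing only; a crude proxy, not a
    faithfulness metric.
    """
    knowledge_words = set(knowledge.lower().split())
    response_words = set(response.lower().split())
    if not knowledge_words:
        return 0.0
    return len(knowledge_words & response_words) / len(knowledge_words)
```

A higher score on either function would suggest closer adherence to the plan, under the stated assumptions.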


This review was generated by AI and checked by a human editor.