QUICK REVIEW

[Paper Review] Zero-Resource Knowledge-Grounded Dialogue Generation

Linxiao Li, Can Xu | arXiv (Cornell University) | Aug 29, 2020

Topic Modeling | References: 76 | Cited by: 50

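One-Sentence Summary

This paper proposes ZRKGC, a zero-resource variational model with two latent variables (a retrieval-based latent knowledge variable and a grounding rate) that grounds dialogue in external knowledge and achieves competitive results without any crowd-sourced knowledge-grounded training data.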

ABSTRACT

While neural conversation models have shown great potentials towards generating informative and engaging responses via introducing external knowledge, learning such a model often requires knowledge-grounded dialogues that are difficult to obtain. To overcome the data challenge and reduce the cost of building a knowledge-grounded dialogue system, we explore the problem under a zero-resource setting by assuming no context-knowledge-response triples are needed for training. To this end, we propose representing the knowledge that bridges a context and a response and the way that the knowledge is expressed as latent variables, and devise a variational approach that can effectively estimate a generation model from a dialogue corpus and a knowledge corpus that are independent with each other. Evaluation results on three benchmarks of knowledge-grounded dialogue generation indicate that our model can achieve comparable performance with state-of-the-art methods that rely on knowledge-grounded dialogues for training, and exhibits a good generalization ability over different topics and different datasets.

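Motivation and Objectives

Enable knowledge-grounded dialogue generation without requiring context-knowledge-response triples for training.
Introduce a dual-latent-variable framework (latent knowledge Zk and grounding rate Za) that bridges the context and the response.
Develop a variational learning method with a retrieval-based posterior so that Zk can be trained efficiently.
Combine knowledge selection with a mutual information loss to make grounding more expressive and training more stable.
Demonstrate generalization across topics and across datasets on three benchmarks.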

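Proposed Method

Formulate p(R|C,K) probabilistically with two latent variables: Zk (the knowledge) and Za (the grounding rate).
Use a retrieval-based posterior q(Zk|C,R) that selects among the top-k knowledge candidates retrieved by a relevance model.
Use UNILM as the backbone generator to model p(R|C,Zk,Za).
Introduce a knowledge selection model to bound the input size, given the limited capacity of the generator.
Add a mutual information loss to encourage Za to capture how knowledge is expressed.
Train with generalized expectation-maximization (the E-step updates q, the M-step updates p), using Gumbel-softmax for differentiable token sampling.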
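
Two ingredients of the list above, the retrieval-based posterior over top-k candidates and Gumbel-softmax sampling, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the candidate ids, the relevance scores, and the temperature are all hypothetical, and a real system would compute these inside an autodiff framework so that gradients can flow through the relaxed sample.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def retrieval_posterior(scores, k):
    """Approximate q(Zk | C, R) on the k best-scoring candidates only.

    `scores` maps a candidate id to a relevance score (hypothetical values
    standing in for the output of a relevance model).
    Returns {candidate id: probability}, summing to 1 over the top-k.
    """
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
    probs = softmax([score for _, score in top])
    return {cid: p for (cid, _), p in zip(top, probs)}


def gumbel_softmax(logits, temperature, rng):
    """Draw one relaxed categorical sample (a probability vector)."""
    # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1);
    # the bounds keep U strictly inside (0, 1).
    noise = [-math.log(-math.log(rng.uniform(1e-9, 1.0 - 1e-9)))
             for _ in logits]
    return softmax([l + g for l, g in zip(logits, noise)], temperature)


# Hypothetical relevance scores for four knowledge sentences.
scores = {"k1": 2.0, "k2": 0.5, "k3": 1.0, "k4": -1.0}
q = retrieval_posterior(scores, k=2)

# Hypothetical logits over a three-token vocabulary.
sample = gumbel_softmax([1.0, 0.2, -0.5], temperature=0.5,
                        rng=random.Random(0))
print(q)
print(sample)
```

Restricting the posterior to a handful of retrieved candidates keeps the expectation over Zk a short sum rather than a sum over the whole knowledge corpus. Lower temperatures push the Gumbel-softmax sample closer to a one-hot vector.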

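Experimental Results

Research Questions

RQ1: Can knowledge-grounded dialogue generation be learned in a zero-resource setting, without context-knowledge-response training triples?
RQ2: Does the dual-latent-variable model (knowledge and grounding rate) improve generation quality and control over how knowledge is used?
RQ3: How does learning with a retrieval-based posterior compare with a fully generative posterior on this task?
RQ4: How do knowledge selection and the mutual information loss affect performance and the controllability of grounding?
RQ5: How well does ZRKGC generalize across topics and datasets compared with state-of-the-art methods?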

Key Findings

Model	Wizard Seen PPL	Wizard Seen F1	Wizard Unseen PPL	Wizard Unseen F1	Topical Freq PPL	Topical Freq F1	Topical Rare PPL	Topical Rare F1	CMU_DoG PPL	CMU_DoG F1
MTASK-RF	65.4	13.1	67.7	12.3	51.3	12.6	51.6	12.5	67.2	10.5
TMN	66.5	15.9	103.6	14.3	30.3	16.5	52.1	14.6	75.2	9.9
ITDD	17.8	16.2	44.8	11.4	21.4	15.8	24.7	14.0	26.0	10.4
SKT	52.0	19.3	81.4	16.1	25.1	17.0	35.6	14.8	41.9	9.6
DRD	19.4	19.3	23.0	17.9	25.9	14.8	28.0	15.1	54.4	10.7
ZRKGC	40.4	18.7	41.5	18.6	44.2	16.6	42.0	16.8	53.5	12.5
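
The table reports perplexity (PPL, lower is better) and F1 (higher is better), but this review does not spell out how either is computed. The sketch below shows one common convention, assumed here rather than taken from the paper: F1 as unigram overlap between a generated response and the reference, and perplexity as the exponential of the mean per-token negative log-likelihood. Whitespace tokenization and the example sentences are illustrative only. The table's F1 values appear to be on a 0-100 scale, whereas this function returns a value in [0, 1].

```python
import math
from collections import Counter


def unigram_f1(hypothesis, reference):
    """Harmonic mean of unigram precision and recall (whitespace tokens)."""
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it occurs in both strings.
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def perplexity(token_log_probs):
    """exp(mean negative log-likelihood), with natural-log probabilities."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


f1 = unigram_f1("the cat sat", "the cat slept there")
ppl = perplexity([math.log(0.25)] * 4)
print(f1, ppl)
```

In the example, two of three hypothesis tokens match (precision 2/3) and two of four reference tokens are covered (recall 1/2), giving F1 = 4/7. A model that assigns probability 0.25 to every token has perplexity 4.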

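ZRKGC achieves competitive F1 on Wizard Seen, Wizard Unseen, Topical Freq, Topical Rare, and CMU_DoG, matching or exceeding most baselines. It has the highest F1 on Wizard Unseen (18.6), Topical Rare (16.8), and CMU_DoG (12.5), although its perplexity is not the lowest on any test set.
ZRKGC generalizes well across topics: on Wizard, its F1 falls only from 18.7 (Seen) to 18.6 (Unseen).
In ablations, the retrieval-based posterior yields a tighter ELBO and higher F1 than a generative-posterior variant.
Knowledge selection and the mutual information loss improve the controllability and stability of grounding.
In human evaluation, ZRKGC's responses are more fluent and coherent than those of competitive baselines, although integrating knowledge remains challenging.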
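
The generalization finding can be checked directly against the results table. The short script below copies the Wizard F1 columns from that table and computes each model's drop from the seen-topic split to the unseen-topic split. The variable names are illustrative.

```python
# (seen-topic F1, unseen-topic F1) on Wizard, copied from the results table.
wizard_f1 = {
    "MTASK-RF": (13.1, 12.3),
    "TMN": (15.9, 14.3),
    "ITDD": (16.2, 11.4),
    "SKT": (19.3, 16.1),
    "DRD": (19.3, 17.9),
    "ZRKGC": (18.7, 18.6),
}

# Round to one decimal to match the table's precision and hide float noise.
drops = {model: round(seen - unseen, 1)
         for model, (seen, unseen) in wizard_f1.items()}
smallest = min(drops, key=drops.get)

for model, drop in sorted(drops.items(), key=lambda kv: kv[1]):
    print(f"{model:9s} F1 drop: {drop:.1f}")
print("Smallest drop:", smallest)
```

On these figures ZRKGC loses 0.1 F1 points between the two splits, the smallest drop among the listed models. ITDD loses the most at 4.8 points.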


This review was generated by AI and reviewed by a human editor.