QUICK REVIEW

[Paper Review] Description-Driven Task-Oriented Dialog Modeling

Jeffrey Zhao, Raghav Gupta|arXiv (Cornell University)|Jan 21, 2022

Speech and dialogue systems | Cited by 33

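One-Sentence Summary

This paper proposes D3ST, a description-driven dialog state tracking model that pairs natural language schema descriptions with an index-picking mechanism to achieve data-efficient training and zero-shot transfer, and performs strongly on the MultiWOZ, SGD, and SGD-X benchmarks.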

ABSTRACT

Task-oriented dialogue (TOD) systems are required to identify key information from conversations for the completion of given tasks. Such information is conventionally specified in terms of intents and slots contained in task-specific ontology or schemata. Since these schemata are designed by system developers, the naming convention for slots and intents is not uniform across tasks, and may not convey their semantics effectively. This can lead to models memorizing arbitrary patterns in data, resulting in suboptimal performance and generalization. In this paper, we propose that schemata should be modified by replacing names or notations entirely with natural language descriptions. We show that a language description-driven system exhibits better understanding of task specifications, higher performance on state tracking, improved data efficiency, and effective zero-shot transfer to unseen tasks. Following this paradigm, we present a simple yet effective Description-Driven Dialog State Tracking (D3ST) model, which relies purely on schema descriptions and an "index-picking" mechanism. We demonstrate the superiority in quality, data efficiency and robustness of our approach as measured on the MultiWOZ (Budzianowski et al., 2018), SGD (Rastogi et al., 2020), and the recent SGD-X (Lee et al., 2021) benchmarks.

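Research Motivation and Goals

Replace conventional slot and intent names or abbreviations with natural language descriptions, so that TOD schemata convey their semantics and models generalize better.
Propose a simple yet effective DST model (D3ST) that relies solely on schema descriptions.
Develop an index-picking mechanism that identifies active schema elements without memorizing arbitrary notations.
Demonstrate stronger performance, data efficiency, and zero-shot transfer on standard TOD benchmarks.
Show that language descriptions are more robust and more data-efficient than abbreviations across datasets and tasks.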

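Proposed Method

Use a seq2seq model (a T5 variant) as the backbone for dialog state tracking.
Prepend the concatenated slot and intent descriptions to the input; the descriptions are randomly re-indexed for every example to prevent memorization.
Express the output as the indices of the active slots and intents (together with slot values), so that all active elements are decoded in a single pass.
Enumerate the possible values of each categorical slot alongside its description to improve categorical prediction accuracy.
Optionally constrain values with slot-specific indices to reduce ambiguity across slots.
Show that natural language descriptions outperform abbreviated descriptions in data efficiency and zero-shot transfer.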
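The input construction and index-based decoding described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`build_d3st_input`, `parse_d3st_output`), the `index=description` layout, the `[states]`/`[intents]` markers, and the `;` separator between predicted slots are all assumptions made for the example, and the T5 model call itself is omitted.

```python
import random


def build_d3st_input(slot_descs, intent_descs, categorical_values, turns, rng):
    """Assemble a description-prefixed input string with shuffled indices.

    All formatting choices here are illustrative assumptions.
    """
    slots = list(slot_descs.items())      # (slot name, description) pairs
    rng.shuffle(slots)                    # fresh random indexing per example
    intents = list(intent_descs.items())
    rng.shuffle(intents)

    parts, slot_index, intent_index = [], {}, {}
    for i, (name, desc) in enumerate(slots):
        item = f"{i}={desc}"
        # Categorical slots list their possible values next to the description.
        for j, value in enumerate(categorical_values.get(name, [])):
            item += f" {i}{chr(ord('a') + j)}) {value}"
        parts.append(item)
        slot_index[str(i)] = name
    for i, (name, desc) in enumerate(intents):
        parts.append(f"i{i}={desc}")
        intent_index[f"i{i}"] = name
    for speaker, text in turns:
        parts.append(f"[{speaker}] {text}")
    return " ".join(parts), slot_index, intent_index


def parse_d3st_output(output, slot_index, intent_index):
    """Map an index-based output string back to slot and intent names."""
    state_part, _, intent_part = output.partition("[intents]")
    state_part = state_part.replace("[states]", "").strip()
    state = {}
    # Assumed separator: ';' between "index=value" pairs.
    for pair in filter(None, (p.strip() for p in state_part.split(";"))):
        idx, _, value = pair.partition("=")
        state[slot_index[idx.strip()]] = value.strip()
    intents = [intent_index[tok] for tok in intent_part.split()]
    return state, intents
```

Because the index-to-name mappings are returned alongside the input string, the model only ever sees indices and descriptions, and the slot names can be restored after decoding. Values are kept verbatim here; resolving an enumerated categorical option back to its value is left out of the sketch.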

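Experimental Results

Research Questions

RQ1: How does D3ST perform on standard TOD benchmarks (MultiWOZ and SGD) when trained on the full data?
RQ2: How does the type of schema description (natural language, abbreviations, random strings) affect model quality and generalization?
RQ3: How data-efficient is D3ST in low-resource and zero-shot settings, and how do different description types affect that efficiency?
RQ4: Is D3ST robust to changes in description wording (SGD-X robustness), and how does description richness affect robustness?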

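Key Findings

D3ST approaches state-of-the-art results on MultiWOZ and SGD across model sizes (Base/Large/XXL).
Language descriptions outperform abbreviations and random strings in all evaluated settings, improving generalization and zero-shot transfer.
D3ST is highly data-efficient: on SGD, the XXL model reaches notable performance with only 0.18% of the training data and approaches full-data performance with 1%.
The model supports zero-shot transfer to unseen domains and tasks, and cross-domain and cross-dataset generalization improves as model size increases.
SGD-X robustness experiments show that language descriptions yield higher average accuracy and lower schema sensitivity (SS(JGA)) than the other description types.
Output decoding is more efficient because all active slots and intents are predicted in a single pass, avoiding slot-by-slot decoding.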
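For readers unfamiliar with the metrics named above, the sketch below shows joint goal accuracy (JGA) as exact-match accuracy over turns, plus a simple spread measure across schema variants. Both functions are illustrative assumptions: benchmark evaluation scripts may match values more leniently than strict equality, and `variant_spread` is a dataset-level coefficient of variation used only as a rough proxy, not the exact SS(JGA) definition from SGD-X.

```python
from statistics import mean, stdev


def joint_goal_accuracy(predicted_states, gold_states):
    """Fraction of turns whose predicted state equals the gold state exactly."""
    if len(predicted_states) != len(gold_states):
        raise ValueError("predicted and gold state lists must be the same length")
    if not gold_states:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted_states, gold_states))
    return correct / len(gold_states)


def variant_spread(jga_per_variant):
    """Coefficient of variation of JGA across schema variants (rough proxy).

    Requires at least two variants; lower values mean less sensitivity
    to how the schema is worded.
    """
    avg = mean(jga_per_variant)
    return stdev(jga_per_variant) / avg if avg > 0 else 0.0
```

A turn counts as correct only if every slot in its state matches, which is why JGA is stricter than per-slot accuracy.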


This review was generated by AI and reviewed by human editors.