QUICK REVIEW

[Paper Review] EmpTransfo: A Multi-head Transformer Architecture for Creating Empathetic Dialog Systems

Rohola Zandie, Mohammad H. Mahoor|arXiv (Cornell University)|Mar 5, 2020

Topic Modeling | References: 25 | Cited by: 42

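One-sentence summary

EmpTransfo is a multi-head Transformer dialog model that uses multi-task learning to integrate emotion, topic, and action context so that it generates empathetic, coherent responses; on DailyDialog it outperforms the baselines on Hit@1 and perplexity.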

ABSTRACT

Understanding emotions and responding accordingly is one of the biggest challenges of dialog systems. This paper presents EmpTransfo, a multi-head Transformer architecture for creating an empathetic dialog system. EmpTransfo utilizes state-of-the-art pre-trained models (e.g., OpenAI-GPT) for language generation, though models with different sizes can be used. We show that utilizing the history of emotions and other metadata can improve the quality of generated conversations by the dialog system. Our experimental results using a challenging language corpus show that the proposed approach outperforms other models in terms of Hit@1 and PPL (Perplexity).

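Motivation and objectives

Motivate the construction of empathetic dialog agents that respond with appropriate emotion.
Study how explicit context signals (emotion, topic, action) affect response quality.
Develop a scalable architecture compatible with different pre-trained language models.
Show that multi-task learning improves both empathy quality and generation performance.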

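Proposed method

Proposes EmpTransfo, a 12-layer Transformer decoder with three parallel prediction heads that predict the next emotion, the next utterance, and the next token.
Initializes from pre-trained OpenAI GPT weights and fine-tunes on DailyDialog together with the emotion, action, and topic embeddings.
Trains with the multi-task objective L_total = c1 L1 + c2 L2 + c3 L3, where L1, L2, and L3 are the language-modeling, next-utterance-prediction, and next-emotion-prediction losses, respectively, and c1, c2, c3 are their weights.
Represents the input by concatenating token, emotion, action, and topic embeddings.
Decodes with nucleus sampling (top-p, p=0.9) at temperature T=0.7 to balance creativity and reliability.
Evaluates on the DailyDialog evaluation set with Hit@1, perplexity (PPL), F1, and BLEU.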
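
The weighted multi-task objective and the nucleus-sampling decoding step listed above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names `total_loss` and `nucleus_sample` are hypothetical, and the default weights c1 = c2 = c3 = 1.0 are placeholders, since this summary does not state the values used in the paper. Only p = 0.9 and T = 0.7 come from the text.

```python
import math
import random


def total_loss(lm_loss, utterance_loss, emotion_loss, c1=1.0, c2=1.0, c3=1.0):
    """Weighted sum L_total = c1*L1 + c2*L2 + c3*L3 (default weights are placeholders)."""
    return c1 * lm_loss + c2 * utterance_loss + c3 * emotion_loss


def nucleus_sample(logits, p=0.9, temperature=0.7, rng=random):
    """Sample one token index using temperature scaling followed by top-p filtering."""
    # Temperature-scaled softmax, shifted by the max logit for numerical stability.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]

    # Keep the smallest set of most-probable tokens whose cumulative mass reaches p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, mass = [], 0.0
    for i in order:
        nucleus.append(i)
        mass += probs[i]
        if mass >= p:
            break

    # Sample inside the nucleus; random.choices normalizes the weights itself.
    weights = [probs[i] for i in nucleus]
    return rng.choices(nucleus, weights=weights, k=1)[0]
```

A temperature below 1 sharpens the distribution before the top-p cut, so fewer tokens survive in the nucleus than with T = 1. In a real model the logits would come from the language-modeling head at each decoding step.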

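Experimental results

Research questions

RQ1: Does incorporating emotion, topic, and action context affect the generation of empathetic responses?
RQ2: Can a multi-head Transformer with auxiliary tasks improve response quality and emotion-prediction accuracy at the same time?
RQ3: Is the EmpTransfo architecture robust across pre-trained language models of different sizes?
RQ4: How do added context features (topic/action) affect the automatic evaluation metrics?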

Main findings

Model	Hit@1 ↑	PPL ↓	F1 ↑	BLEU ↑
Seq2Seq+Attention	9.41	129.3	10.22	5.58
Transformer ranker	17.20	-	26.37	15.79
OpenAI GPT without emotion	75.01	10.19	18.2	3.755
EmpTransfo	77.25	10.63	19.39	3.99
EmpTransfo + topic	76.87	10.23	18.37	4.51
EmpTransfo + action	77.73	9.17	18.86	3.71
EmpTransfo + action + topic	78.47	9.04	17.27	2.45
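
For readers unfamiliar with the first two columns, the sketch below shows the standard definitions of Hit@1 and perplexity. It is an illustration under assumptions: the helper names `hit_at_1` and `perplexity` are hypothetical, the candidate scores are made-up toy values, and the paper's exact evaluation protocol (for example, how many distractor responses each example has) is not given in this summary.

```python
import math


def hit_at_1(candidate_scores, gold_indices):
    """Percentage of examples whose top-scored candidate is the gold response."""
    hits = sum(
        1
        for scores, gold in zip(candidate_scores, gold_indices)
        if max(range(len(scores)), key=lambda i: scores[i]) == gold
    )
    return 100.0 * hits / len(gold_indices)


def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood (natural log) per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Toy data: four examples, three candidate responses each.
toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6], [0.9, 0.05, 0.05]]
toy_gold = [1, 0, 0, 0]
toy_hit = hit_at_1(toy_scores, toy_gold)  # 3 of 4 examples are ranked correctly
```

Higher Hit@1 means the model ranks the true response first more often; lower perplexity means the model assigns higher probability to the reference tokens.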

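Every EmpTransfo variant beats the OpenAI GPT baseline on Hit@1 (76.87 to 78.47 vs. 75.01). On PPL, only the variants with action context beat it (9.17 and 9.04 vs. 10.19); base EmpTransfo (10.63) and EmpTransfo + topic (10.23) are slightly worse.
Adding action context, alone or together with topic, gives higher Hit@1 and lower PPL than base EmpTransfo; adding topic alone lowers PPL (10.23 vs. 10.63) but also lowers Hit@1 (76.87 vs. 77.25).
EmpTransfo with emotion, action, and topic features achieves the best Hit@1 (78.47) and PPL (9.04) of the tested configurations, though it has the lowest F1 (17.27) and BLEU (2.45) among the EmpTransfo variants.
OpenAI GPT without emotion is already strong on Hit@1 (75.01), but leaves room for improvement on the other metrics.
Emotion prediction for the next utterance reaches Precision 81.35, Recall 72.37, and F1 76.59, outperforming previous baselines.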
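
As a quick consistency check on the emotion-prediction numbers, F1 can be recomputed as the harmonic mean of the reported precision and recall. This assumes the reported F1 is derived from exactly those two aggregate values, which the summary does not confirm (a per-class average would differ slightly); `f1_score` is a hypothetical helper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values as reported in this summary, in percent.
emotion_f1 = f1_score(81.35, 72.37)
print(f"{emotion_f1:.2f}")
```

The recomputed value agrees with the reported 76.59 to within rounding of the inputs.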

This review was generated by AI and reviewed by a human editor.