QUICK REVIEW

[Paper Review] BERT for Joint Intent Classification and Slot Filling

Chen Qian, Zhuo Zhu | arXiv (Cornell University) | Feb 28, 2019

Topic Modeling | References: 20 | Cited by: 412

ONE-SENTENCE SUMMARY

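This paper proposes a BERT-based joint intent classification and slot filling model that uses BERT's contextual representations to improve generalization on low-resource NLU tasks. The model achieves state-of-the-art results on the Snips and ATIS datasets, improving intent accuracy by up to 3.4 points (ATIS), slot F1 by up to 8.2 points (Snips), and sentence-level semantic frame accuracy by up to 22.9% relative (Snips).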

ABSTRACT

Intent classification and slot filling are two essential tasks for natural language understanding. They often suffer from small-scale human-labeled training data, resulting in poor generalization capability, especially for rare words. Recently a new language representation model, BERT (Bidirectional Encoder Representations from Transformers), facilitates pre-training deep bidirectional representations on large-scale unlabeled corpora, and has created state-of-the-art models for a wide variety of natural language processing tasks after simple fine-tuning. However, there has not been much effort on exploring BERT for natural language understanding. In this work, we propose a joint intent classification and slot filling model based on BERT. Experimental results demonstrate that our proposed model achieves significant improvement on intent classification accuracy, slot filling F1, and sentence-level semantic frame accuracy on several public benchmark datasets, compared to the attention-based recurrent neural network models and slot-gated models.

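MOTIVATION AND OBJECTIVES

Address the poor generalization of NLU models caused by limited human-labeled training data.
Explore the effectiveness of BERT pre-training for joint intent classification and slot filling.
Improve performance in low-resource and rare-word settings through contextual representation learning.
Show that jointly modeling the intent and slot tasks improves overall semantic parsing accuracy.
Evaluate the model on diverse real-world benchmarks, Snips and ATIS, to verify its robustness.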

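PROPOSED METHOD

Fine-tune the uncased BERT-Base model end to end on the joint intent classification and slot labeling task.
Classify the intent from the [CLS] token representation through a softmax layer: $y^i = \mathrm{softmax}(W^i h_1 + b^i)$, where $h_1$ is the hidden state of [CLS].
Label slots from the hidden state of each word's first sub-token: $y^s_n = \mathrm{softmax}(W^s h_n + b^s)$, where $h_n$ is the hidden state of the first sub-token of word $x_n$.
Formulate the joint objective as $p(y^i, y^s \mid x) = p(y^i \mid x) \prod_{n=1}^{N} p(y^s_n \mid x)$ and maximize this conditional likelihood by minimizing the cross-entropy loss.
Add a CRF layer on top of BERT to model label dependencies in slot filling and improve sequence-level consistency.
Start from standard BERT pre-training on BooksCorpus and Wikipedia, then fine-tune on the task with Adam and dropout.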
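
The two softmax heads and the joint objective described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the hidden states are hand-written toy vectors rather than real BERT outputs, the tokenization of the example utterance is hypothetical, and names such as `joint_nll` and `first_subword_idx` are assumptions introduced here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(W, h, b):
    """Compute W h + b, where each row of W scores one label."""
    return [sum(w * x for w, x in zip(row, h)) + bias
            for row, bias in zip(W, b)]

def joint_nll(hidden, first_subword_idx, W_i, b_i, W_s, b_s,
              intent_gold, slots_gold):
    """Negative log of p(y^i|x) * prod_n p(y^s_n|x) for one utterance.

    hidden: per-token encoder outputs; hidden[0] plays the role of [CLS].
    first_subword_idx: position of the first sub-token of each word.
    """
    # Intent head: softmax over the [CLS] representation h_1.
    p_intent = softmax(linear(W_i, hidden[0], b_i))
    loss = -math.log(p_intent[intent_gold])
    # Slot head: one softmax per word, read from its first sub-token only.
    for pos, gold in zip(first_subword_idx, slots_gold):
        p_slot = softmax(linear(W_s, hidden[pos], b_s))
        loss -= math.log(p_slot[gold])
    return loss

# Toy utterance "play bohemian rhapsody", hypothetically tokenized as
# [CLS] play bo ##hemian rhapsody [SEP]; words start at positions 1, 2, 4.
hidden = [[0.5, -0.2], [0.1, 0.9], [0.7, 0.3],
          [0.6, 0.2], [-0.4, 0.8], [0.0, 0.0]]
first_subword_idx = [1, 2, 4]
W_i, b_i = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]              # 2 toy intents
W_s, b_s = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], [0.0] * 3   # 3 toy slot labels

loss = joint_nll(hidden, first_subword_idx, W_i, b_i, W_s, b_s,
                 intent_gold=0, slots_gold=[1, 0, 1])
print(round(loss, 4))
```

Because the joint probability is a product, its negative log is simply the intent cross-entropy plus the sum of per-word slot cross-entropies, which is why one loss can train both heads at once. Position 3 (`##hemian`) is never read by the slot head, matching the first-sub-token rule.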

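EXPERIMENTAL RESULTS

Research Questions

RQ1: Does BERT pre-training significantly improve the generalization of intent classification and slot filling when labeled data is limited?
RQ2: Does jointly modeling the intent and slot tasks outperform modeling them independently?
RQ3: How does the BERT-based joint model compare with previous state-of-the-art RNN and attention-based models on benchmark datasets?
RQ4: How much does pre-training BERT on out-of-domain text such as Wikipedia help zero-shot or few-shot generalization to rare phrases?
RQ5: Does adding a CRF layer further improve slot labeling by capturing label dependencies?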
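
To make RQ5 concrete, the sketch below shows how a CRF-style decoder can use label-transition scores to rule out inconsistent tag sequences. It is a generic Viterbi decoder written for illustration, not the paper's CRF layer: the label set (O, B, I), the emission scores, and the transition penalty are all made-up assumptions.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence.

    emissions[t][k]: score of label k at position t (e.g. slot logits).
    transitions[j][k]: score of moving from label j to label k.
    """
    n_labels = len(emissions[0])
    score = list(emissions[0])   # best score of any path ending in each label
    backptr = []                 # backptr[t][k]: best previous label for k
    for emit in emissions[1:]:
        new_score, ptr = [], []
        for k in range(n_labels):
            best_j = max(range(n_labels),
                         key=lambda j: score[j] + transitions[j][k])
            new_score.append(score[best_j] + transitions[best_j][k] + emit[k])
            ptr.append(best_j)
        score = new_score
        backptr.append(ptr)
    # Trace the best path backwards from the best final label.
    path = [max(range(n_labels), key=lambda k: score[k])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

# Hypothetical labels: 0 = O, 1 = B-slot, 2 = I-slot.
emissions = [[2.0, 0.0, 0.0],    # word 1: clearly O
             [0.0, 1.0, 1.2]]    # word 2: I-slot scores slightly above B-slot
no_transitions = [[0.0] * 3 for _ in range(3)]
penalize_o_to_i = [[0.0, 0.0, -10.0],   # O -> I is an invalid BIO transition
                   [0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]]

print(viterbi(emissions, no_transitions))    # [0, 2]: O followed by I (invalid)
print(viterbi(emissions, penalize_o_to_i))   # [0, 1]: O followed by B
```

With all transition scores at zero the decoder reduces to a per-position argmax, which is what independent softmax outputs give; the transition penalty is what lets the sequence-level decoder prefer a consistent labeling.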

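Key Findings

On the Snips dataset, the joint BERT model reaches 98.6% intent classification accuracy, 1.6 points above the previous state of the art (97.0%).
For slot filling on Snips, the model reaches 97.0% F1, 8.2 points above the slot-gated model (88.8%).
Sentence-level semantic frame accuracy on Snips rises to 92.8% from the previous best of 75.5%, a 22.9% relative improvement (17.3 points absolute).
On ATIS, the model reaches 97.5% intent accuracy (up from 94.1%), 96.1% slot F1 (up from 95.2%), and 88.2% frame accuracy (up from 82.6%).
A joint BERT model fine-tuned for only one epoch already outperforms all prior models, which suggests strong data efficiency.
A case study shows that, thanks to pre-training on Wikipedia, BERT correctly identifies rare phrases such as 'mother joan of the angels' as movie names, demonstrating better generalization.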
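
The findings mix absolute gains (percentage points) with a relative gain, and sentence-level frame accuracy is stricter than either per-task metric. The sketch below reproduces the 22.9% relative figure from the reported Snips numbers and shows one plausible way to compute frame accuracy. The function names and the two toy utterances are assumptions for illustration, not taken from the paper's evaluation code.

```python
def relative_improvement(old, new):
    """Relative gain of `new` over `old`, in percent."""
    return (new - old) / old * 100.0

def frame_accuracy(pred, gold):
    """Fraction of utterances whose intent AND every slot label match."""
    correct = sum(1 for p, g in zip(pred, gold)
                  if p["intent"] == g["intent"] and p["slots"] == g["slots"])
    return correct / len(gold)

# Snips sentence-level frame accuracy reported above: 75.5% -> 92.8%.
print(round(relative_improvement(75.5, 92.8), 1))   # 22.9

# Toy data with hypothetical intent and slot labels.
gold = [
    {"intent": "PlayMusic", "slots": ["O", "B-track", "I-track"]},
    {"intent": "SearchMovie", "slots": ["O", "B-movie"]},
]
pred = [
    {"intent": "PlayMusic", "slots": ["O", "B-track", "I-track"]},
    {"intent": "SearchMovie", "slots": ["O", "O"]},   # one wrong slot label
]
print(frame_accuracy(pred, gold))   # 0.5
```

A single wrong slot label makes the whole second utterance count as incorrect even though its intent is right, which is why frame accuracy sits well below intent accuracy and slot F1 for every model reported above.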

This review was generated by AI and reviewed by a human editor.