QUICK REVIEW

[Paper Review] ERNIE: Enhanced Language Representation with Informative Entities

Zhengyan Zhang, Xu Han|arXiv (Cornell University)|May 17, 2019

Topic Modeling|References 54|Cited by 133

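One-Sentence Summary

ERNIE pre-trains on both text and knowledge graphs, using a dedicated knowledge module that injects informative entity embeddings together with a denoising entity auto-encoder objective. It improves on knowledge-driven tasks while remaining competitive on general NLP tasks.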

ABSTRACT

Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks. However, the existing pre-trained language models rarely consider incorporating knowledge graphs (KGs), which can provide rich structured knowledge facts for better language understanding. We argue that informative entities in KGs can enhance language representation with external knowledge. In this paper, we utilize both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE), which can take full advantage of lexical, syntactic, and knowledge information simultaneously. The experimental results have demonstrated that ERNIE achieves significant improvements on various knowledge-driven tasks, and meanwhile is comparable with the state-of-the-art model BERT on other common NLP tasks. The source code of this paper can be obtained from https://github.com/thunlp/ERNIE.

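Motivation and Objectives

Motivate integrating external knowledge from knowledge graphs to improve language understanding beyond what plain text alone provides.
Design a two-module architecture that fuses textual and knowledge information into richer representations.
Propose the denoising entity auto-encoder (dEA) pre-training task to align text with knowledge graph entities.
Demonstrate improvements on knowledge-driven tasks such as entity typing and relation classification while staying competitive on standard NLP benchmarks.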

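Proposed Method

Two stacked modules, a textual encoder (T-Encoder) and a knowledgeable encoder (K-Encoder), fuse token information with entity information.
Entity representations are pre-trained with TransE and aligned to the text by recognizing named entity mentions and linking them to knowledge graph entities.
A novel pre-training objective (dEA) randomly masks token-entity alignments and trains the model to predict the correct entities from the knowledge graph embeddings.
BERT's masked language modeling (MLM) and next sentence prediction (NSP) objectives are retained to capture lexical and syntactic information.
An information fusion layer inside each K-Encoder aggregator combines token embeddings and entity embeddings through learned transformations.
Fine-tuning for downstream tasks uses task-specific input formats, including marker tokens that highlight entity mentions for relation classification.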
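
The information fusion step listed above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the names `fuse`, `init_params`, `W_t`, `W_e`, `W_t_out`, `W_e_out`, the toy dimensions, the random initialization, and the GELU activation are all assumptions made for the example, as is the way tokens without an aligned entity are handled.

```python
import math
import random


def gelu(x):
    # tanh approximation of the GELU activation (assumed activation for this sketch)
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def matvec(matrix, vector):
    # Multiply a row-major matrix (list of rows) by a vector
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these during pre-training
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def init_params(token_dim, entity_dim, hidden_dim, rng):
    # Hypothetical parameter layout for one fusion layer
    return {
        "W_t": random_matrix(hidden_dim, token_dim, rng),
        "W_e": random_matrix(hidden_dim, entity_dim, rng),
        "b": [0.0] * hidden_dim,
        "W_t_out": random_matrix(token_dim, hidden_dim, rng),
        "b_t": [0.0] * token_dim,
        "W_e_out": random_matrix(entity_dim, hidden_dim, rng),
        "b_e": [0.0] * entity_dim,
    }


def fuse(token_vec, entity_vec, params):
    """Fuse one token vector with its aligned entity vector (or None).

    hidden     = gelu(W_t * token + W_e * entity + b)
    new_token  = gelu(W_t_out * hidden + b_t)
    new_entity = gelu(W_e_out * hidden + b_e)
    """
    pre = [x + b for x, b in zip(matvec(params["W_t"], token_vec), params["b"])]
    if entity_vec is not None:
        # Only tokens aligned to an entity receive the knowledge term
        pre = [x + y for x, y in zip(pre, matvec(params["W_e"], entity_vec))]
    hidden = [gelu(x) for x in pre]
    new_token = [gelu(x + b) for x, b in zip(matvec(params["W_t_out"], hidden), params["b_t"])]
    if entity_vec is None:
        return new_token, None
    new_entity = [gelu(x + b) for x, b in zip(matvec(params["W_e_out"], hidden), params["b_e"])]
    return new_token, new_entity


rng = random.Random(0)
params = init_params(token_dim=4, entity_dim=3, hidden_dim=5, rng=rng)
token_out, entity_out = fuse([0.1, -0.2, 0.3, 0.0], [0.5, 0.1, -0.4], params)
print(len(token_out), len(entity_out))
```

The point of the sketch is the shape of the computation: token and entity vectors, which may have different dimensions, are projected into one shared hidden vector and then projected back into two separate outputs, so each output carries information from both inputs.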

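Experimental Results

Research Questions

RQ1: Does incorporating informative entities derived from KGs improve performance on knowledge-driven NLP tasks?
RQ2: Do a dedicated knowledge fusion mechanism and dEA pre-training align textual and entity information better than pre-training on text alone?
RQ3: How does ERNIE perform on entity typing and relation classification compared with BERT and task-specific baselines?
RQ4: How does ERNIE affect performance on standard NLP benchmarks such as GLUE?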

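Key Findings

ERNIE clearly outperforms BERT on entity typing and on relation classification (e.g. FewRel and TACRED), demonstrating the benefit of informative entities.
On Open Entity, ERNIE improves both precision and recall over BERT, indicating better use of KG knowledge for typing.
ERNIE achieves a higher F1 than BERT on FewRel and also outperforms BERT on TACRED's micro-averaged metrics.
GLUE results show ERNIE on par with BERT on the larger datasets, with some instability on the smaller ones.
Ablation analysis shows that both the informative entities and dEA pre-training contribute to the performance gains.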
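
For readers unfamiliar with the micro-averaged precision, recall, and F1 mentioned in the findings above, here is a small sketch of how such scores are commonly computed for multi-label predictions. The function name `micro_prf` and the toy labels are hypothetical and do not come from the paper or its datasets; benchmark-specific conventions (for example, ignoring a negative class) are not modeled.

```python
def micro_prf(gold_sets, pred_sets):
    """Micro-averaged precision, recall, and F1 over per-example label sets."""
    # Pool true positives and label counts across all examples before dividing
    true_pos = sum(len(gold & pred) for gold, pred in zip(gold_sets, pred_sets))
    n_pred = sum(len(pred) for pred in pred_sets)
    n_gold = sum(len(gold) for gold in gold_sets)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data for illustration only
gold = [{"person", "artist"}, {"organization"}, {"place"}]
pred = [{"person"}, {"organization", "group"}, {"event"}]
precision, recall, f1 = micro_prf(gold, pred)
print(precision, recall, f1)  # prints 0.5 0.5 0.5
```

Because counts are pooled before dividing, frequent labels weigh more heavily than rare ones, which is the main difference from macro-averaging.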

This review was generated by AI and reviewed by human editors.