QUICK REVIEW

[Paper Review] Enriching Pre-trained Language Model with Entity Information for Relation Classification

Shanchan Wu, Yifan He|arXiv (Cornell University)|May 20, 2019

Topic Modeling | References: 16 | Cited by: 30

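One-Sentence Summary

This paper proposes R-BERT, a relation classification model that augments BERT by inserting special marker tokens around the two target entities and combining the entities' contextual embeddings with the sentence representation. It reports a macro-F1 of 89.25 on the SemEval-2010 Task 8 dataset, a new state of the art at the time of publication, and captures both sentence-level semantics and entity-specific information better than earlier methods.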
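Because the headline number is a macro-averaged F1, a small worked example may help readers interpret it. The sketch below computes a plain macro-F1 (the unweighted mean of per-label F1 scores) over a chosen label set. The function name `macro_f1` and the toy labels are illustrative assumptions, and the official SemEval-2010 Task 8 scorer applies additional conventions that this sketch does not reproduce.

```python
from collections import Counter


def macro_f1(gold, pred, labels):
    """Unweighted mean of per-label F1 over `labels` (hypothetical helper)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    scores = []
    for label in labels:
        precision_den = tp[label] + fp[label]
        recall_den = tp[label] + fn[label]
        precision = tp[label] / precision_den if precision_den else 0.0
        recall = tp[label] / recall_den if recall_den else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, invented for illustration only.
gold = ["Cause-Effect", "Cause-Effect", "Component-Whole", "Other"]
pred = ["Cause-Effect", "Component-Whole", "Component-Whole", "Other"]

# Average only over the two relation labels of interest in this toy example.
score = macro_f1(gold, pred, labels=["Cause-Effect", "Component-Whole"])
print(round(score, 4))
```

Macro averaging gives every relation type the same weight regardless of how many examples it has, which is why a model cannot reach a high score by doing well only on frequent relations.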

ABSTRACT

Relation classification is an important NLP task to extract relations between entities. The state-of-the-art methods for relation classification are primarily based on Convolutional or Recurrent Neural Networks. Recently, the pre-trained BERT model achieves very successful results in many NLP classification / sequence labeling tasks. Relation classification differs from those tasks in that it relies on information of both the sentence and the two target entities. In this paper, we propose a model that both leverages the pre-trained BERT language model and incorporates information from the target entities to tackle the relation classification task. We locate the target entities and transfer the information through the pre-trained architecture and incorporate the corresponding encoding of the two entities. We achieve significant improvement over the state-of-the-art method on the SemEval-2010 task 8 relational dataset.

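Research Motivation and Goals

Improve relation classification by integrating entity-level information into pre-trained language models such as BERT.
Address the limitation that standard BERT does not explicitly capture how the sentence context relates to the specific target entities.
Develop a method that exploits pre-trained representations while explicitly encoding the location and features of the target entities.
Achieve state-of-the-art performance on the SemEval-2010 Task 8 relation classification benchmark.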

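Proposed Method

Insert the special marker '$' immediately before and after the first target entity and '#' immediately before and after the second, so that BERT can locate the entities in the input sequence.
Encode the full marked sequence with BERT to obtain contextualized representations.
Average the final hidden states of each entity's tokens, pass these averages and the [CLS] hidden state each through a tanh activation and a fully connected layer, and concatenate the three resulting vectors.
Feed the concatenated vector (sentence plus entity representations) into a final fully connected layer with softmax to predict the relation.
Fine-tune the entire model end to end with the standard cross-entropy loss.
Apply dropout to all added layers and train with the Adam optimizer at an initial learning rate of 2e-5.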
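The input preparation and feature construction described in the steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`insert_markers`, `mean_pool`, `build_features`) are hypothetical, the "hidden states" are toy numbers standing in for BERT outputs, each entity is assumed to be represented by the mean of the hidden states of its tokens, and all learned layers (activations, fully connected layers, softmax) are omitted.

```python
def insert_markers(tokens, e1_span, e2_span):
    """Wrap entity 1 in '$' and entity 2 in '#', and prepend [CLS].

    Spans are (start, end) token indices with an inclusive end. This sketch
    assumes entity 1 precedes entity 2 and that the spans do not overlap.
    Returns the marked tokens and the shifted spans of the entity tokens
    themselves (markers excluded).
    """
    (s1, t1), (s2, t2) = e1_span, e2_span
    assert s1 <= t1 < s2 <= t2, "sketch assumes e1 comes before e2"
    marked = (
        ["[CLS]"] + tokens[:s1]
        + ["$"] + tokens[s1:t1 + 1] + ["$"]
        + tokens[t1 + 1:s2]
        + ["#"] + tokens[s2:t2 + 1] + ["#"]
        + tokens[t2 + 1:]
    )
    new_e1 = (s1 + 2, t1 + 2)  # shifted by [CLS] and the opening '$'
    new_e2 = (s2 + 4, t2 + 4)  # shifted by [CLS], both '$', and the opening '#'
    return marked, new_e1, new_e2


def mean_pool(hidden, span):
    """Average the hidden vectors of the tokens inside an inclusive span."""
    start, end = span
    rows = hidden[start:end + 1]
    return [sum(col) / len(rows) for col in zip(*rows)]


def build_features(hidden, e1_span, e2_span):
    """Concatenate the [CLS] vector with the two pooled entity vectors."""
    return hidden[0] + mean_pool(hidden, e1_span) + mean_pool(hidden, e2_span)


# Toy sentence; entity 1 is "kitchen" (index 1), entity 2 is "house" (index 9).
tokens = "The kitchen is the last renovated part of the house .".split()
marked, e1, e2 = insert_markers(tokens, (1, 1), (9, 9))

# Stand-in for BERT output: one 2-dimensional vector per marked token.
hidden = [[float(i), float(2 * i)] for i in range(len(marked))]
features = build_features(hidden, e1, e2)

print(" ".join(marked))
print(features)
```

In a real model the `hidden` list would come from the final layer of a fine-tuned BERT encoder, and `features` would pass through the added classification layers before the cross-entropy loss is computed.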

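Experimental Results

Research Questions

RQ1: Does explicitly injecting entity information into a pre-trained language model improve relation classification?
RQ2: How do special markers around the entities affect the model's ability to locate and represent the target entities?
RQ3: How much do entity-specific representations contribute to the final classification beyond the sentence-level encoding?
RQ4: Does the proposed method outperform existing state-of-the-art models on standard benchmarks such as SemEval-2010 Task 8?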

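Key Findings

R-BERT reaches a macro-F1 of 89.25 on the SemEval-2010 Task 8 dataset, surpassing all prior methods.
In the ablation study, removing both the special markers and the entity representations lowers F1 to 81.09, showing that the two components together account for a large share of the gain.
Removing only the special markers (BERT-NO-SEP) lowers F1 to 87.98, indicating that marker-based entity localization contributes to performance.
Removing only the entity representations (BERT-NO-ENT) gives an F1 of 87.99, confirming that entity-specific features add value beyond the sentence encoding.
The model outperforms the earlier Entity Attention Bi-LSTM model (F1 of 85.2) by about 4 points.
Overall, the results indicate that combining sentence-level context with explicit entity representations yields better relation classification.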
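To make the ablation numbers easier to compare, the short sketch below computes how many F1 points each variant loses relative to the full model. It uses only the scores quoted in the findings above; the dictionary keys are descriptive labels chosen for this illustration.

```python
full_f1 = 89.25  # R-BERT, as reported above

# F1 scores of the ablated variants and the earlier baseline, from the findings.
variants = {
    "no markers, no entity representations": 81.09,
    "BERT-NO-SEP (no markers)": 87.98,
    "BERT-NO-ENT (no entity representations)": 87.99,
    "Entity Attention Bi-LSTM": 85.2,
}

# Drop in F1 points relative to the full model, rounded to two decimals.
drops = {name: round(full_f1 - f1, 2) for name, f1 in variants.items()}

for name, drop in drops.items():
    print(f"{name}: -{drop:.2f} F1")
```

Removing either component alone costs roughly 1.3 points, while removing both costs about 8.2 points, so each component compensates for much of the loss when the other is absent.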


This review was generated by AI and reviewed by human editors.