QUICK REVIEW

[Paper Review] Global Pointer: Novel Efficient Span-based Approach for Named Entity Recognition

Jianlin Su, Murtadha Ahmed|arXiv (Cornell University)|Aug 5, 2022

Topic Modeling | Cited by 65

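One-Sentence Summary

Global Pointer (GP) is a span-based NER model that uses multiplicative attention over relative positions to score span boundaries and entity types jointly. It also offers an efficient variant that reduces the parameter count and a general loss function for handling class imbalance. It achieves state-of-the-art or competitive results on flat and nested NER datasets while lowering training and inference cost.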

ABSTRACT

Named entity recognition (NER) task aims at identifying entities from a piece of text that belong to predefined semantic types such as person, location, organization, etc. The state-of-the-art solutions for flat entities NER commonly suffer from capturing the fine-grained semantic information in underlying texts. The existing span-based approaches overcome this limitation, but the computation time is still a concern. In this work, we propose a novel span-based NER framework, namely Global Pointer (GP), that leverages the relative positions through a multiplicative attention mechanism. The ultimate goal is to enable a global view that considers the beginning and the end positions to predict the entity. To this end, we design two modules to identify the head and the tail of a given entity to enable the inconsistency between the training and inference processes. Moreover, we introduce a novel classification loss function to address the imbalance label problem. In terms of parameters, we introduce a simple but effective approximate method to reduce the training parameters. We extensively evaluate GP on various benchmark datasets. Our extensive experiments demonstrate that GP can outperform the existing solution. Moreover, the experimental results show the efficacy of the introduced loss function compared to softmax and entropy alternatives.

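Motivation and Objectives

Motivate effective span-based NER that captures boundary information and handles nested entities.
Propose Global Pointer, which exploits relative position information through a multiplicative attention mechanism.
Address training-inference inconsistency and label imbalance with a dedicated loss function and an efficient, parameter-reduced variant.
Demonstrate the effectiveness and efficiency of GP on diverse benchmark datasets.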

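Proposed Method

Compute token representations with a pretrained language model such as BERT.
Build span representations from the start and end indices, using two feed-forward projections (a query and a key) per entity type.
Score each span with s_alpha(i,j) = q_{i,alpha}^T k_{j,alpha}, combined with RoPE relative position encoding.
Introduce Efficient Global Pointer, which curbs parameter growth by sharing the extraction parameters and adding a lightweight classification term.
Propose a general multi-label loss, inspired by circle loss, that addresses class imbalance in NER and is simplified by using a threshold.
Provide an approximate parameter-reduction variant that maintains performance with fewer parameters.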
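
The scoring step listed above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the helper names (rope, project, span_scores, decode), the bias-free projections, the rotation base of 10000, the masking of spans whose end precedes their start, and the random stand-in for encoder outputs are all assumptions made for illustration. It scores spans for a single entity type; a full model would repeat this per type.

```python
import math
import random


def rope(vec, pos, base=10000.0):
    """Rotate each consecutive pair of `vec` by a position-dependent angle.

    Assumes len(vec) is even. After this rotation, the dot product of a
    query at position i and a key at position j depends only on j - i.
    """
    d = len(vec)
    out = []
    for m in range(d // 2):
        theta = pos * base ** (-2.0 * m / d)
        x, y = vec[2 * m], vec[2 * m + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(weight, h):
    """Bias-free linear map: one output value per row of `weight`."""
    return [dot(row, h) for row in weight]


def span_scores(hidden, w_q, w_k):
    """Return s[i][j] = rope(q_i, i) . rope(k_j, j) for one entity type.

    Spans that end before they start (j < i) are masked with -inf
    (an assumption of this sketch).
    """
    q = [rope(project(w_q, h), i) for i, h in enumerate(hidden)]
    k = [rope(project(w_k, h), j) for j, h in enumerate(hidden)]
    n = len(hidden)
    return [
        [dot(q[i], k[j]) if j >= i else -math.inf for j in range(n)]
        for i in range(n)
    ]


def decode(scores, threshold=0.0):
    """Keep every (start, end) pair whose score exceeds the threshold."""
    return [
        (i, j)
        for i, row in enumerate(scores)
        for j, s in enumerate(row)
        if s > threshold
    ]


def random_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
seq_len, hidden_size, head_size = 5, 4, 8
hidden = random_matrix(seq_len, hidden_size)  # hypothetical encoder outputs
w_q = random_matrix(head_size, hidden_size)   # query projection for one type
w_k = random_matrix(head_size, hidden_size)   # key projection for one type
scores = span_scores(hidden, w_q, w_k)
print(decode(scores))
```

Because the weights here are random, the decoded spans are meaningless; the point is only the shape of the computation: one n-by-n score matrix per entity type, with the start index on the rows and the end index on the columns.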

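Experimental Results

Research Questions

RQ1: Can Global Pointer achieve a higher Macro-F1 than strong baselines on flat and nested NER datasets?
RQ2: Does adding relative position encoding (RoPE) improve span-based NER performance?
RQ3: Is the proposed general imbalance loss more effective for NER than the standard softmax/cross-entropy loss?
RQ4: Can Efficient Global Pointer reduce the number of trainable parameters without sacrificing accuracy?
RQ5: How efficient is GP in training and inference across different datasets?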

Key Findings

Method	People's Daily	CLUENER	CMeEE	CoNLL04	GENIA
BERT-CRF	95.46	78.70	64.39	85.46	73.02
PFN (Yan et al.)	94.00	79.29	63.68	87.43	74.31
Global Pointer	95.51	79.44	65.98	88.57	74.64

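Global Pointer achieves a higher Macro-F1 than both baselines on every evaluated dataset.
On the more challenging CLUENER and CMeEE datasets, GP outperforms BERT-CRF by 0.74 and 1.59 points, respectively (79.44 vs. 78.70 and 65.98 vs. 64.39).
Efficient Global Pointer maintains competitive performance with fewer parameters, particularly on the harder datasets.
Ablation studies show that RoPE relative position encoding, and the proposed imbalance loss compared with BCE, each bring clear improvements.
On larger datasets, GP trains and runs inference faster than BERT-CRF.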
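
The imbalance loss that the ablation compares against BCE can be sketched as follows. This is a hedged illustration, not the paper's code: it assumes the loss takes the form log(1 + sum over negatives of e^s) + log(1 + sum over positives of e^(-s)), where the added 1 plays the role of a fixed threshold at score 0. The function names and the toy scores are hypothetical, and the BCE function is included only as a point of comparison.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def multilabel_threshold_loss(scores, labels):
    """log(1 + sum_neg e^s) + log(1 + sum_pos e^-s).

    The constant 1 equals e^0, so it is handled by appending a score of 0:
    negatives are pushed below 0 and positives above 0.
    """
    neg = [s for s, y in zip(scores, labels) if y == 0]
    pos = [-s for s, y in zip(scores, labels) if y == 1]
    return logsumexp(neg + [0.0]) + logsumexp(pos + [0.0])


def bce_loss(scores, labels):
    """Mean sigmoid binary cross-entropy on raw scores, for comparison."""
    total = 0.0
    for s, y in zip(scores, labels):
        # softplus(s) - y * s equals the per-item BCE on logit s
        total += logsumexp([0.0, s]) - y * s
    return total / len(scores)


# Toy candidate-span scores: one true entity among many easy negatives.
scores = [5.0] + [-10.0] * 1000
labels = [1] + [0] * 1000
print(multilabel_threshold_loss(scores, labels))
print(bce_loss(scores, labels))
```

In this form the positive and negative candidates contribute through two separate log-sum-exp terms rather than through an average over all candidate spans, which is one way to read the claim that the loss is less sensitive to the large number of negative spans. Decoding then keeps spans whose score is above 0.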

This review was generated by AI and reviewed by human editors.