QUICK REVIEW

[Paper Review] Template-free Prompt Tuning for Few-shot NER

Ruotian Ma, Xin Zhou|arXiv (Cornell University)|Sep 28, 2021

Topic Modeling|Cited by 27

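One-sentence summary

This paper proposes EntLM, a template-free prompt tuning method that recasts NER as an entity-oriented LM task, enabling efficient single-pass decoding and improving few-shot performance without adding new parameters.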

ABSTRACT

Prompt-based methods have been successfully applied in sentence-level few-shot learning tasks, mostly owing to the sophisticated design of templates and label words. However, when applied to token-level labeling tasks such as NER, it would be time-consuming to enumerate the template queries over all potential entity spans. In this work, we propose a more elegant method to reformulate NER tasks as LM problems without any templates. Specifically, we discard the template construction process while maintaining the word prediction paradigm of pre-training models to predict a class-related pivot word (or label word) at the entity position. Meanwhile, we also explore principled ways to automatically search for appropriate label words that the pre-trained models can easily adapt to. While avoiding complicated template-based process, the proposed LM objective also reduces the gap between different objectives used in pre-training and fine-tuning, thus it can better benefit the few-shot performance. Experimental results demonstrate the effectiveness of the proposed method over bert-tagger and template-based method under few-shot setting. Moreover, the decoding speed of the proposed method is up to 1930.12 times faster than the template-based method.

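Motivation and objectives

Improve few-shot NER in settings where template search is costly and span enumeration is infeasible.
Propose Entity-oriented LM (EntLM) fine-tuning, which predicts label words at entity positions without any template.
Study label word engineering methods for identifying suitable discrete or virtual label words.
Show that EntLM narrows the gap between the pre-training and fine-tuning objectives, which improves few-shot performance.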

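Proposed method

Formulate NER as an LM task through the EntLM objective: at each entity position the model predicts a class-related label word in place of the original token.
Reuse the pre-trained LM head; fine-tuning introduces no new parameters.
Develop label word engineering methods that cover both discrete words and virtual prototypes.
Explore the data distribution, the LM output distribution, and their combination for selecting label words; optionally use lexicon-derived annotations.
Obtain all entity labels in a single decoding pass, without enumerating spans.
Optionally combine EntLM with Struct-based decoding, applying a Viterbi decoder to further improve performance.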
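
The target construction and single-pass decoding listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, tags are class-level (no B-/I- prefixes), label words are chosen purely by frequency in the labeled data (the simplest data-distribution strategy), and a plain list of predicted words stands in for the per-position output of the LM head.

```python
from collections import Counter, defaultdict


def select_label_words(sentences, tag_sequences):
    """For each entity class, pick the word that most often carries that class.

    Assumption: frequency in the labeled data is the only selection signal.
    """
    counts = defaultdict(Counter)
    for tokens, tags in zip(sentences, tag_sequences):
        for token, tag in zip(tokens, tags):
            if tag != "O":
                counts[tag][token] += 1
    return {tag: counter.most_common(1)[0][0] for tag, counter in counts.items()}


def build_entlm_targets(tokens, tags, label_words):
    """Build the LM target: entity tokens become label words, others are copied."""
    return [
        label_words[tag] if tag != "O" else token
        for token, tag in zip(tokens, tags)
    ]


def decode_labels(predicted_words, label_words):
    """Map one pass of per-position predictions back to class labels."""
    word_to_tag = {word: tag for tag, word in label_words.items()}
    return [word_to_tag.get(word, "O") for word in predicted_words]


# Toy few-shot data (hypothetical).
train_sentences = [
    ["Obama", "visited", "Paris"],
    ["Obama", "met", "Merkel", "in", "Berlin"],
    ["Paris", "is", "lovely"],
]
train_tags = [
    ["PER", "O", "LOC"],
    ["PER", "O", "PER", "O", "LOC"],
    ["LOC", "O", "O"],
]

label_words = select_label_words(train_sentences, train_tags)
targets = build_entlm_targets(
    ["Merkel", "flew", "to", "Berlin"], ["PER", "O", "O", "LOC"], label_words
)
labels = decode_labels(targets, label_words)
```

In this sketch every position is labeled from a single sequence of predictions, so no span is ever enumerated. In an actual model the `predicted_words` list would come from the argmax of the pre-trained LM head at each input position.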

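Experimental results

Research questions

RQ1: In the few-shot setting, can NER be effectively reformulated as an LM objective without templates?
RQ2: Which label word strategies (discrete vs. virtual; data-driven vs. LM-driven) best support EntLM in low-resource settings?
RQ3: How does EntLM compare with template-based prompt methods and standard fine-tuning in the few-shot setting?
RQ4: Does EntLM keep decoding efficient compared with template-based methods?
RQ5: How do lexicon quality and domain-adaptive pre-training affect EntLM performance?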

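Key findings

Across all few-shot settings, EntLM outperforms BERT-tagger and template-based NER on CoNLL03, OntoNotes 5.0, and MIT-Movie.
EntLM is more stable than the baselines (smaller deviation), especially in the 5-shot setting.
Decoding with EntLM is far faster than with the template-based method (up to 1930.12x).
Label word engineering with the combined Data+LM+Virtual strategy gives robust performance even when the lexicon is small.
Further domain-specific MLM pre-training on unlabeled data substantially improves EntLM, and the gain is larger than for classifier-based fine-tuning.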
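
The decoding-speed gap reported above is easier to picture by counting queries. The sketch below is only a back-of-the-envelope count under assumptions, not a reproduction of the paper's timing: it assumes a template-based method issues one query per (span, class) pair, optionally capped at a maximum span length, while EntLM needs one forward pass per sentence. The function names are hypothetical, and the measured 1930.12x figure also depends on factors this count ignores.

```python
def count_spans(num_tokens, max_span_len=None):
    """Number of contiguous spans in a sentence, optionally length-capped."""
    limit = num_tokens if max_span_len is None else min(max_span_len, num_tokens)
    return sum(num_tokens - length + 1 for length in range(1, limit + 1))


def template_query_count(num_tokens, num_classes, max_span_len=None):
    """Queries needed if every candidate span is scored against every class."""
    return count_spans(num_tokens, max_span_len) * num_classes


# A 10-token sentence with 4 entity classes (hypothetical numbers).
uncapped = template_query_count(10, 4)
capped = template_query_count(10, 4, max_span_len=3)
entlm_passes = 1  # one forward pass labels every position
```

Without a cap the span count grows quadratically with sentence length, which is why enumerating template queries becomes the bottleneck on longer inputs.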

This review was generated by AI and reviewed by a human editor.