QUICK REVIEW

[Paper Review] CAN-NER: Convolutional Attention Network for Chinese Named Entity Recognition

Yuying Zhu, Guoxin Wang|arXiv (Cornell University)|Apr 3, 2019

Topic Modeling | References: 43 | Citations: 68

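One-Line Summary

CAN-NER performs Chinese NER without word embeddings or external lexicons by combining a character-based CNN that has a local convolutional attention layer with a BiGRU-CRF that has a global self-attention layer, and it achieves state-of-the-art results across multiple domains.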

ABSTRACT

Named entity recognition (NER) in Chinese is essential but difficult because of the lack of natural delimiters. Therefore, Chinese Word Segmentation (CWS) is usually considered as the first step for Chinese NER. However, models based on word-level embeddings and lexicon features often suffer from segmentation errors and out-of-vocabulary (OOV) words. In this paper, we investigate a Convolutional Attention Network called CAN for Chinese NER, which consists of a character-based convolutional neural network (CNN) with local-attention layer and a gated recurrent unit (GRU) with global self-attention layer to capture the information from adjacent characters and sentence contexts. Also, compared to other models, not depending on any external resources like lexicons and employing small size of char embeddings make our model more practical. Extensive experimental results show that our approach outperforms state-of-the-art methods without word embedding and external lexicon resources on different domain datasets including Weibo, MSRA and Chinese Resume NER dataset.

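Motivation and Objectives

Make the case for robust Chinese NER that does not rely on word segmentation, word embeddings, or lexicons.
Develop a character-level model that captures both local context and long-range dependencies.
Integrate a convolutional attention mechanism that strengthens the modeling of local relations between characters.
Incorporate a global self-attention layer to model sentence-level context.
Demonstrate effectiveness across domains without external resources.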

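Proposed Method

Use BiGRU-CRF as the main sequence labeling framework.
Add a convolutional attention layer that encodes the local context around each character within a window.
Concatenate segmentation information (BMES) with character embeddings as input to the CNN.
Compute local attention weights within each window to form the hidden representation.
Apply a global self-attention layer to the BiGRU outputs to capture long-range dependencies.
Place a CRF layer on top of the combined BiGRU and global attention outputs for decoding.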
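
The data flow in the steps above can be illustrated with a small pure-Python sketch. It is not the authors' implementation: it has no learned weights, it scores attention with a plain dot product (an assumption made for brevity), it skips the BiGRU and the CRF entirely, and it treats "combined" as concatenation. All function names and the example segmentation are hypothetical. The only point is to show how window-restricted local attention differs from sentence-wide self-attention.

```python
import math
import random

BMES = {"B": 0, "M": 1, "E": 2, "S": 3}

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

def build_inputs(char_vecs, seg_tags):
    """Concatenate each character embedding with a one-hot BMES tag."""
    inputs = []
    for vec, tag in zip(char_vecs, seg_tags):
        one_hot = [0.0] * len(BMES)
        one_hot[BMES[tag]] = 1.0
        inputs.append(list(vec) + one_hot)
    return inputs

def local_attention(inputs, window=3):
    """Each position attends only over the characters inside its window."""
    half = window // 2
    out = []
    for j, center in enumerate(inputs):
        lo, hi = max(0, j - half), min(len(inputs), j + half + 1)
        neighbours = inputs[lo:hi]
        weights = softmax([dot(center, n) for n in neighbours])
        out.append(weighted_sum(weights, neighbours))
    return out

def global_self_attention(hidden):
    """Each position attends over the whole sentence (scaled dot product)."""
    scale = math.sqrt(len(hidden[0]))
    out = []
    for query in hidden:
        weights = softmax([dot(query, key) / scale for key in hidden])
        out.append(weighted_sum(weights, hidden))
    return out

def encode(char_vecs, seg_tags, window=3):
    inputs = build_inputs(char_vecs, seg_tags)
    local = local_attention(inputs, window)
    # A real model would run a BiGRU here; this sketch passes `local` through.
    hidden = local
    context = global_self_attention(hidden)
    # Concatenate each hidden state with its global context vector;
    # a CRF layer (omitted) would decode tags from these features.
    return [h + c for h, c in zip(hidden, context)]

random.seed(0)
sentence = "南京市长江大桥"
seg_tags = ["B", "M", "E", "B", "E", "B", "E"]  # hypothetical segmentation
char_vecs = [[random.uniform(-1, 1) for _ in range(8)] for _ in sentence]
features = encode(char_vecs, seg_tags)
print(len(features), len(features[0]))  # 7 characters, 2 * (8 + 4) = 24 features each
```

In this sketch the window size bounds how far local attention can look, while the global layer lets every character weigh every other character in the sentence, which is the division of labor the method list describes.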

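Experimental Results

Research Questions

RQ1: Can a fully character-based model achieve competitive NER performance across multiple domains without word embeddings or lexicons?
RQ2: Does the local convolutional attention mechanism improve modeling of interactions between adjacent characters compared with a standard CNN?
RQ3: Does the global self-attention layer effectively capture long-range dependencies within a sentence and improve Chinese NER performance?
RQ4: How does the proposed CAN-NER perform against state-of-the-art models on the Weibo, MSRA, Chinese Resume, and OntoNotes datasets without external resources?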

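Key Findings

CAN-NER outperforms the baselines and achieves state-of-the-art results among character-based models on several datasets.
Convolutional attention yields a notable improvement over standard CNN features by better capturing local relations between characters.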
Global self-attention on BiGRU outputs helps model long-range sentence context beyond the capabilities of vanilla BiGRU-CRF, improving F1 scores.
The model operates without external word embeddings or lexicon resources, offering a more practical NER solution.
Results show strong performance on Weibo and Chinese Resume datasets, with competitive outcomes on MSRA and OntoNotes datasets.

This review was generated by AI and checked by a human editor.