QUICK REVIEW

[Paper Review] BERT-based Ranking for Biomedical Entity Normalization

Zongcheng Ji, Qiang Wei | arXiv (Cornell University) | Aug 9, 2019

Biomedical Text Mining and Ontologies | References: 32 | Citations: 93

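One-Line Summary

This paper fine-tunes the pre-trained BERT, BioBERT, and ClinicalBERT models for biomedical entity normalization and reports state-of-the-art results on three datasets, with accuracy up to 1.17% higher than previous methods.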

ABSTRACT

Developing high-performance entity normalization algorithms that can alleviate the term variation problem is of great interest to the biomedical community. Although deep learning-based methods have been successfully applied to biomedical entity normalization, they often depend on traditional context-independent word embeddings. Bidirectional Encoder Representations from Transformers (BERT), BERT for Biomedical Text Mining (BioBERT) and BERT for Clinical Text Mining (ClinicalBERT) were recently introduced to pre-train contextualized word representation models using bidirectional Transformers, advancing the state-of-the-art for many natural language processing tasks. In this study, we proposed an entity normalization architecture by fine-tuning the pre-trained BERT / BioBERT / ClinicalBERT models and conducted extensive experiments to evaluate the effectiveness of the pre-trained models for biomedical entity normalization using three different types of datasets. Our experimental results show that the best fine-tuned models consistently outperformed previous methods and advanced the state-of-the-art for biomedical entity normalization, with up to 1.17% increase in accuracy.

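Motivation and Objectives

Address the term variation problem in biomedical entity normalization.
Examine how effective pre-trained contextualized representations are for the normalization task.
Evaluate BERT variants on multiple biomedical datasets and establish whether they improve performance.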

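Proposed Method

Fine-tune the pre-trained BERT, BioBERT, and ClinicalBERT models for the biomedical entity normalization task.
Measure the performance gain relative to previous normalization methods.
Run extensive experiments on three different types of datasets to assess generalization.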
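
As the paper's title suggests, normalization can be framed as ranking: each mention is paired with candidate concept names, every pair is scored, and the top-scoring concept is returned. The sketch below illustrates only that framing. The mini ontology, the concept IDs, and `overlap_score` are hypothetical; in particular, `overlap_score` is a token-overlap placeholder standing in for a fine-tuned BERT pair scorer, and `format_pair` assumes BERT's standard sentence-pair input layout rather than anything this review confirms about the authors' implementation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical mini ontology: concept ID -> canonical name (illustrative only).
ONTOLOGY: Dict[str, str] = {
    "C001": "myocardial infarction",
    "C002": "renal failure",
    "C003": "type 2 diabetes mellitus",
}


def format_pair(mention: str, candidate: str) -> str:
    """Render a (mention, candidate) pair in BERT's sentence-pair layout."""
    return f"[CLS] {mention} [SEP] {candidate} [SEP]"


def overlap_score(mention: str, candidate: str) -> float:
    """Placeholder scorer: Jaccard overlap of lowercased tokens.

    A fine-tuned BERT model would replace this function; it is here only
    so the ranking loop can run without downloading any model.
    """
    a = set(mention.lower().split())
    b = set(candidate.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(
    mention: str,
    ontology: Dict[str, str],
    score: Callable[[str, str], float] = overlap_score,
) -> List[Tuple[str, float]]:
    """Score every candidate concept and return them best-first."""
    scored = [(cid, score(mention, name)) for cid, name in ontology.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(format_pair("heart attack", "myocardial infarction"))
    print(rank_candidates("acute myocardial infarction", ONTOLOGY)[0])
```

Note that the placeholder gives "heart attack" a score of zero against "myocardial infarction" because the two share no tokens. That is the term variation problem in miniature, and it is the gap a learned contextual scorer is meant to close.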

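Experimental Results

Research Questions

RQ1: Can fine-tuned BERT-based models outperform existing biomedical entity normalization methods across diverse datasets?
RQ2: Which pre-trained BERT variant (BERT, BioBERT, or ClinicalBERT) achieves the best normalization performance?
RQ3: Across the datasets, how large is the accuracy gain from fine-tuning relative to previous methods?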

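Key Findings

The fine-tuned BERT, BioBERT, and ClinicalBERT models consistently outperform previous methods.
The best fine-tuned models achieve state-of-the-art accuracy on biomedical entity normalization.
The reported improvement over previous methods reaches up to 1.17% in accuracy across the datasets.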
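
To make the accuracy comparison concrete, here is a minimal sketch of how such a gain could be computed. All labels and predictions below are made up for illustration and are not results from the paper. The review does not say whether the 1.17% figure is an absolute or a relative difference, so the sketch computes both.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of mentions whose predicted concept ID matches the gold ID."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


def absolute_gain_points(new_acc: float, old_acc: float) -> float:
    """Absolute difference, expressed in percentage points."""
    return (new_acc - old_acc) * 100


def relative_gain_percent(new_acc: float, old_acc: float) -> float:
    """Difference relative to the baseline accuracy, in percent."""
    return (new_acc - old_acc) / old_acc * 100


if __name__ == "__main__":
    # Invented toy data: four mentions with hypothetical concept IDs.
    gold = ["C001", "C002", "C003", "C001"]
    baseline = ["C001", "C003", "C003", "C002"]    # 2 of 4 correct
    fine_tuned = ["C001", "C002", "C003", "C002"]  # 3 of 4 correct

    old_acc = accuracy(baseline, gold)
    new_acc = accuracy(fine_tuned, gold)
    print(old_acc, new_acc)
    print(absolute_gain_points(new_acc, old_acc))
    print(relative_gain_percent(new_acc, old_acc))
```

On this toy data the two measures differ by a factor of two, which is why it matters which one a reported gain refers to.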

This review was generated by AI and checked by a human editor.