QUICK REVIEW

[Paper Review] Medical mT5: An Open-Source Multilingual Text-to-Text LLM for The Medical Domain

Iker García-Ferrero, Rodrigo Agerri|arXiv (Cornell University)|Apr 11, 2024

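Topic Modeling | Citations: 10

One-Line Summary

This paper presents Medical mT5, the first open-source multilingual text-to-text LLM for the medical domain. It is trained on a large multilingual medical corpus and evaluated on tasks in English, Spanish, French, and Italian. It achieves strong results on multilingual sequence labeling and is competitive with state-of-the-art models on English QA.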

ABSTRACT

Research on language technology for the development of medical applications is currently a hot topic in Natural Language Understanding and Generation. Thus, a number of large language models (LLMs) have recently been adapted to the medical domain, so that they can be used as a tool for mediating in human-AI interaction. While these LLMs display competitive performance on automated medical texts benchmarks, they have been pre-trained and evaluated with a focus on a single language (English mostly). This is particularly true of text-to-text models, which typically require large amounts of domain-specific pre-training data, often not easily accessible for many languages. In this paper, we address these shortcomings by compiling, to the best of our knowledge, the largest multilingual corpus for the medical domain in four languages, namely English, French, Italian and Spanish. This new corpus has been used to train Medical mT5, the first open-source text-to-text multilingual model for the medical domain. Additionally, we present two new evaluation benchmarks for all four languages with the aim of facilitating multilingual research in this domain. A comprehensive evaluation shows that Medical mT5 outperforms both encoders and similarly sized text-to-text models for the Spanish, French, and Italian benchmarks, while being competitive with current state-of-the-art LLMs in English.

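Motivation and Objectives

Addresses the shortage of open-source multilingual medical LLMs by building a large multilingual medical corpus and training a text-to-text model on it.
Provides benchmarks in Spanish, French, and Italian in addition to English to promote multilingual medical NLP research.
Shows that continued pre-training on domain-specific data improves performance in non-English languages.
Demonstrates the model's effectiveness in multi-task and zero-shot cross-lingual settings.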

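Proposed Method

Builds a large multilingual medical corpus (English, Spanish, French, Italian) of approximately 3 billion words from publicly available sources.
Continues pre-training existing mT5 checkpoints on the medical corpus to produce Medical-mT5-large (738M parameters) and Medical-mT5-xl (3B parameters).
Pre-training follows the span-corruption objective and self-supervised setup of the original mT5 work; the sequence length is limited by the available hardware.
Introduces two new multilingual evaluation datasets that extend coverage to Spanish, French, and Italian: Argument Mining and Abstractive Question Answering.
Frames every task as a text-to-text problem and applies constrained decoding so that the output preserves the input words and contains only valid annotations.
Compares against baselines including mT5, SciFive, Flan-T5, and encoder-only models in both monolingual and multilingual settings.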
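The text-to-text framing of sequence labeling described above can be illustrated with a minimal sketch. It assumes whitespace-tokenized input, non-overlapping spans, and an inline tag format such as `<Disease> ... </Disease>`. The label set, the function names, and the tag format are hypothetical choices for illustration and are not taken from the paper. The second function is only a post-hoc check of the two properties that constrained decoding is meant to guarantee (input words preserved, annotations valid); it does not constrain generation itself.

```python
import re

# Hypothetical label set for illustration; real datasets define their own.
LABELS = {"Disease", "Claim", "Premise"}
TAG = re.compile(r"^<(/?)(\w+)>$")


def to_tagged_text(tokens, spans):
    """Linearize (start, end, label) spans into a tagged target string.

    Spans use token indices with an exclusive end and must not overlap.
    """
    opens = {start: label for start, _end, label in spans}
    closes = {end: label for _start, end, label in spans}
    out = []
    for i, tok in enumerate(tokens):
        if i in opens:
            out.append(f"<{opens[i]}>")
        out.append(tok)
        if i + 1 in closes:
            out.append(f"</{closes[i + 1]}>")
    return " ".join(out)


def is_valid_output(tokens, output):
    """Return True if `output` keeps the input words unchanged and
    contains only well-formed, non-nested tags from LABELS."""
    words, open_label = [], None
    for piece in output.split():
        m = TAG.match(piece)
        if m is None:
            words.append(piece)  # ordinary word copied from the input
            continue
        closing, label = m.group(1) == "/", m.group(2)
        if label not in LABELS:
            return False  # unknown annotation type
        if closing:
            if open_label != label:
                return False  # closing tag without a matching opener
            open_label = None
        else:
            if open_label is not None:
                return False  # nested tags are not allowed in this sketch
            open_label = label
    return open_label is None and words == tokens


tokens = "Patient with dilated cardiomyopathy".split()
target = to_tagged_text(tokens, [(2, 4, "Disease")])
print(target)
print(is_valid_output(tokens, target))
```

In this sketch a model output that rewrites a word or leaves a tag unclosed fails the check. A constrained decoder would instead rule out such outputs during generation by restricting which tokens may be produced at each step.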

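Experimental Results

Research Questions

RQ1: Does a multilingual text-to-text model trained on medical data perform well on sequence labeling and QA tasks in English, Spanish, French, and Italian?
RQ2: Does domain-specific pre-training on medical data improve performance in multi-task and zero-shot cross-lingual settings for non-English languages?
RQ3: What are the challenges and limitations of evaluating multilingual medical text generation, and how does Medical mT5 compare with strong baselines in English and the other languages?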

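Key Findings

Medical mT5 outperforms similarly sized text-to-text baselines on sequence labeling benchmarks in Spanish, French, and Italian.
Medical mT5 is competitive with state-of-the-art English models on medical text-to-text tasks.
In zero-shot cross-lingual transfer, Medical mT5 and its xl variant achieve strong results on non-English languages when fine-tuned on English data only.
Multi-task fine-tuning yields the highest overall performance in Spanish, French, and Italian compared with single-task settings.
The larger Medical-mT5-xl may overfit in single-task settings but excels in multi-task and cross-lingual settings.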

This review was generated by AI and checked by a human editor.