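[Paper Explained] SciFive: a text-to-text transformer model for biomedical literature
SciFive is a T5-based, domain-specific model pretrained on large-scale biomedical corpora (C4, PubMed, PMC). It achieves state-of-the-art or near state-of-the-art results on biomedical NLP tasks, including named entity recognition, relation extraction, natural language inference, and especially question answering; it also performs strongly on longer text-generation tasks.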
In this report, we introduce SciFive, a domain-specific T5 model that has been pre-trained on large biomedical corpora. Our model outperforms the current SOTA methods (i.e. BERT, BioBERT, Base T5) on tasks in named entity recognition, relation extraction, natural language inference, and question-answering. We show that text-generation methods have significant potential in a broad array of biomedical NLP tasks, particularly those requiring longer, more complex outputs. Our results support the exploration of more difficult text generation tasks and the development of new methods in this area.
Research Motivation and Objectives
- Motivate the need for language models trained on dense biomedical language to support literature mining and analysis.
- Propose SciFive, a domain-adapted T5 model pretrained on biomedical corpora to enable text-to-text biomedical tasks.
- Demonstrate SciFive's performance advantages on NER, RE, NLI, document classification, and QA tasks over prior BERT-based and T5-based baselines.
Proposed Method
- Adopt the T5 sequence-to-sequence framework and retain its architecture and pretraining objectives (span-based masking) to enable text generation tasks.
- Pretrain SciFive from the base T5 weights on combinations of biomedical corpora (C4, PubMed abstracts, PMC full text) for up to 1.2M steps.
- Represent all tasks as text-to-text problems with a task-specific prompt token for multi-task fine-tuning.
- Use SentencePiece tokenization to build a subword vocabulary suitable for biomedical text.
- Fine-tune SciFive on five biomedical NLP task categories (NER, RE, NLI, document classification, QA) in both multi-task and single-task settings.
- Evaluate on benchmark datasets and compare against SOTA methods (BioBERT, BlueBERT, BERT, T5).
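The text-to-text framing and the span-based masking objective listed above can be illustrated with a minimal sketch. It assumes whitespace tokens instead of SentencePiece subwords and takes the masked spans as explicit input rather than sampling them; the `<extra_id_N>` sentinel naming follows the public T5 convention. The prefix string `ncbi_ner` and the helper names are hypothetical and not taken from the paper.

```python
from typing import List, Tuple


def add_task_prefix(task: str, text: str) -> str:
    """Prepend a task-specific prompt so one model can serve several tasks.

    The prefix strings are illustrative; the prompts used by SciFive
    are not given in this summary.
    """
    return f"{task}: {text}"


def span_corrupt(tokens: List[str], spans: List[Tuple[int, int]]) -> Tuple[str, str]:
    """Build a T5-style (input, target) pair from masked token spans.

    `spans` holds half-open (start, end) index pairs that are sorted and
    non-overlapping. Each span is replaced by one sentinel in the input;
    the target lists each sentinel followed by the tokens it hides and
    ends with one extra closing sentinel.
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[cursor:start])  # keep the unmasked tokens
        inputs.append(sentinel)              # hide the span in the input
        targets.append(sentinel)
        targets.extend(tokens[start:end])    # reveal the span in the target
        cursor = end
    inputs.extend(tokens[cursor:])
    targets.append(f"<extra_id_{len(spans)}>")
    return " ".join(inputs), " ".join(targets)


if __name__ == "__main__":
    sentence = "Aspirin inhibits platelet aggregation in patients"
    print(add_task_prefix("ncbi_ner", sentence))
    corrupted, target = span_corrupt(sentence.split(), [(1, 2), (4, 6)])
    print(corrupted)
    print(target)
```

During fine-tuning, the prefixed string would be the model input and the expected answer (for example, the tagged entities) the target string, so every task shares the same sequence-to-sequence interface.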
Experimental Results
Research Questions
- RQ1: Can a unified text-to-text transformer trained on biomedical corpora outperform BERT-based models on standard biomedical NLP tasks?
- RQ2: Does SciFive provide competitive or superior results on longer-output generation tasks such as QA and summarization compared with prior models?
- RQ3: What is the impact of different biomedical corpora (C4, PubMed, PMC) on SciFive's performance across tasks?
- RQ4: Is multi-task fine-tuning beneficial for NER and related biomedical tasks when using a text-to-text framework?
- RQ5: How does SciFive perform on BioASQ question answering under lenient accuracy evaluation compared with BioBERT and T5?
Key Findings
- SciFive achieves state-of-the-art results on 3 of 7 NER tasks, 2 of 2 RE tasks, and 1 of 1 NLI task.
- SciFive achieves state-of-the-art results on all three BioASQ QA tasks under lenient accuracy in expert assessment.
- SciFive shows strong performance on QA, often outperforming BioBERT and competing with or surpassing T5 and other baselines on generation-heavy tasks.
- SciFive provides near-SOTA performance on the HoC document classification task, indicating competitive document-level classification alongside generation capabilities.
- The PubMed+PMC corpus configuration did not universally outperform other corpus combinations, indicating the need for further study of optimal biomedical corpora mix.
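For readers unfamiliar with the lenient accuracy metric mentioned in the BioASQ results, the sketch below shows one common way to compute it. It assumes the usual BioASQ factoid convention: each question has a ranked list of candidate answers, strict accuracy counts a question as correct only when the top candidate matches a gold answer, and lenient accuracy counts it as correct when any of the top five candidates matches. The function names and the simple case-insensitive matching are assumptions for illustration; the paper's expert assessment may have judged answers differently.

```python
from typing import Sequence, Tuple


def _normalize(answer: str) -> str:
    """Lowercase and collapse whitespace before comparing answers."""
    return " ".join(answer.lower().split())


def strict_and_lenient_accuracy(
    predictions: Sequence[Sequence[str]],
    gold: Sequence[Sequence[str]],
    top_k: int = 5,
) -> Tuple[float, float]:
    """Return (strict, lenient) accuracy over a set of factoid questions.

    predictions[i] is the ranked candidate list for question i;
    gold[i] is the list of acceptable answers (including synonyms).
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0, 0.0
    strict_hits = 0
    lenient_hits = 0
    for candidates, answers in zip(predictions, gold):
        accepted = {_normalize(a) for a in answers}
        ranked = [_normalize(c) for c in candidates[:top_k]]
        if ranked and ranked[0] in accepted:
            strict_hits += 1   # top-ranked candidate is correct
        if any(c in accepted for c in ranked):
            lenient_hits += 1  # any of the top-k candidates is correct
    total = len(gold)
    return strict_hits / total, lenient_hits / total


if __name__ == "__main__":
    preds = [["BRCA1", "TP53"], ["insulin"], ["EGFR", "KRAS"]]
    golds = [["TP53"], ["Insulin"], ["ALK"]]
    print(strict_and_lenient_accuracy(preds, golds))
```

Lenient accuracy is never lower than strict accuracy, which is worth keeping in mind when comparing the BioASQ results against systems reported under the strict metric.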
This review was generated by AI and reviewed by a human editor.