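[Paper Digest] A Federated and Parameter-Efficient Framework for Large Language Model Training in Medicine
The paper introduces Fed-MedLoRA, a federated, parameter-efficient framework that adapts large language models to medical tasks through low-rank adapters and adaptive aggregation, and evaluates it on clinical information extraction across multiple cohorts against several baselines.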
Large language models (LLMs) have demonstrated strong performance on medical benchmarks, including question answering and diagnosis. To enable their use in clinical settings, LLMs are typically further adapted through continued pretraining or post-training using clinical data. However, most medical LLMs are trained on data from a single institution, which limits their generalizability and safety in heterogeneous systems. Federated learning (FL) is a promising solution for enabling collaborative model development across healthcare institutions. Yet applying FL to LLMs in medicine remains fundamentally limited. First, conventional FL requires transmitting the full model during each communication round, which becomes impractical for multi-billion-parameter LLMs given the limited computational resources. Second, many FL algorithms implicitly assume data homogeneity, whereas real-world clinical data are highly heterogeneous across patients, diseases, and institutional practices. We introduce a model-agnostic and parameter-efficient federated learning framework for adapting LLMs to medical applications. Fed-MedLoRA transmits only low-rank adapter parameters, reducing communication and computation overhead, while Fed-MedLoRA+ further incorporates adaptive, data-aware aggregation to improve convergence under cross-site heterogeneity. We apply the framework to clinical information extraction (IE), which transforms patient narratives into structured medical entities and relations. Accuracy was assessed across five patient cohorts through comparisons with BERT models and with LLaMA-3, DeepSeek-R1, and GPT-4o models. Evaluation settings included (1) in-domain training and testing, (2) external validation on independent cohorts, and (3) a low-resource new-site adaptation scenario using real-world clinical notes from the Yale New Haven Health System.
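To make the communication argument concrete, the following is a rough sketch of how many values a site would send for one weight matrix with and without a low-rank adapter. It assumes the standard LoRA parameterization, in which the update to a `d_out x d_in` weight is factored into two matrices of rank `r`. The layer size (4096) and rank (8) are illustrative assumptions, not figures reported in the paper.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Number of values in a dense d_out x d_in weight matrix."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Number of values in a rank-`rank` adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


# Hypothetical layer size and rank, chosen only for illustration.
d_in = d_out = 4096
rank = 8

full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)

print(f"full matrix:  {full} values")
print(f"LoRA adapter: {lora} values")
print(f"reduction:    {full // lora}x fewer values per round for this layer")
```

Under these assumed sizes the adapter carries 65,536 values against 16,777,216 for the full matrix, a 256-fold reduction for that single layer. The actual savings in the paper depend on its chosen rank and on which layers receive adapters.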
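Research Motivation and Objectives
- Motivate the need to train medical LLMs across institutions so that they generalize well and remain safe.
- Propose a federated, parameter-efficient framework (Fed-MedLoRA) for adapting LLMs to medical applications.
- Improve convergence under data heterogeneity through adaptive, data-aware aggregation (Fed-MedLoRA+).
- Demonstrate applicability to clinical information extraction and compare against strong baselines.
- Evaluate in in-domain, external-validation, and low-resource new-site adaptation scenarios.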
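Proposed Method
- Transmit only low-rank adapter parameters to reduce communication and computation overhead (Fed-MedLoRA).
- Add adaptive, data-aware aggregation to improve convergence under cross-site heterogeneity (Fed-MedLoRA+).
- Apply the framework to clinical information extraction, which turns patient narratives into structured medical entities and relations.
- Compare against BERT models, LLaMA-3, DeepSeek-R1, and GPT-4o across five patient cohorts.
- Evaluate under in-domain training and testing, external validation on independent cohorts, and low-resource new-site adaptation using real-world clinical notes from the Yale New Haven Health System.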
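The server-side step of such a framework can be sketched as a weighted average over the adapter parameters each site uploads. This digest does not spell out the adaptive, data-aware rule used by Fed-MedLoRA+, so the sketch below substitutes plain sample-count weighting (FedAvg-style) as a stand-in. The function name `aggregate_adapters`, the parameter name `lora_A`, and the site sizes are all hypothetical.

```python
from typing import Dict, List

# Each adapter maps a parameter name to its flattened values.
Adapter = Dict[str, List[float]]


def aggregate_adapters(site_adapters: List[Adapter],
                       site_weights: List[float]) -> Adapter:
    """Weighted average of per-site adapter parameters.

    Assumption: weights are per-site sample counts. This is a stand-in,
    not the paper's adaptive, data-aware aggregation rule.
    """
    if not site_adapters:
        raise ValueError("at least one site adapter is required")
    if len(site_adapters) != len(site_weights):
        raise ValueError("one weight per site is required")
    total = sum(site_weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Normalize so the weights sum to 1.
    norm = [w / total for w in site_weights]

    merged: Adapter = {}
    for name, reference in site_adapters[0].items():
        merged[name] = [
            sum(w * adapter[name][i] for w, adapter in zip(norm, site_adapters))
            for i in range(len(reference))
        ]
    return merged


# Two hypothetical sites holding 300 and 100 clinical notes.
site_a = {"lora_A": [1.0, 2.0]}
site_b = {"lora_A": [5.0, 6.0]}
merged = aggregate_adapters([site_a, site_b], [300, 100])
print(merged)  # {'lora_A': [2.0, 3.0]}
```

Only the adapter values cross the network here; the frozen base model never leaves a site. One caveat of this simple scheme: averaging the two low-rank factors separately is not the same as averaging their product, which is one reason more careful aggregation rules exist.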
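Experimental Results
Research Questions
- RQ1: Can federated learning combined with parameter-efficient adapters effectively train medical LLMs across multiple institutions?
- RQ2: How do Fed-MedLoRA and Fed-MedLoRA+ perform on clinical information extraction relative to strong baselines?
- RQ3: Does adaptive, data-aware aggregation improve convergence under cross-site heterogeneity?
- RQ4: How does the framework perform in a low-resource adaptation scenario on real-world clinical notes?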
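Key Findings
- Fed-MedLoRA and Fed-MedLoRA+ achieve accuracy competitive with the baselines on clinical information extraction across five cohorts.
- Thanks to adaptive aggregation, Fed-MedLoRA+ improves convergence under cross-site heterogeneity.
- The framework is evaluated in three settings: in-domain training and testing, external validation on independent cohorts, and low-resource new-site adaptation.
- Comparisons cover BERT models, LLaMA-3, DeepSeek-R1, and GPT-4o.
- The low-resource adaptation experiments use real-world clinical notes from the Yale New Haven Health System.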
This digest was generated by AI and reviewed by a human editor.