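[Paper Summary] Large Language Model Distilling Medication Recommendation Model
LEADER performs medication recommendation with a fine-tuned large language model (LLM) equipped with a classification output layer, then transfers its semantic capability to a compact student model through feature-level knowledge distillation and profile alignment, yielding effective recommendations for both single-visit and multi-visit patients.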
The recommendation of medication is a vital aspect of intelligent healthcare systems, as it involves prescribing the most suitable drugs based on a patient's specific health needs. Unfortunately, many sophisticated models currently in use tend to overlook the nuanced semantics of medical data, while only relying heavily on identities. Furthermore, these models face significant challenges in handling cases involving patients who are visiting the hospital for the first time, as they lack prior prescription histories to draw upon. To tackle these issues, we harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs). Our research aims to transform existing medication recommendation methodologies using LLMs. In this paper, we introduce a novel approach called Large Language Model Distilling Medication Recommendation (LEADER). We begin by creating appropriate prompt templates that enable LLMs to suggest medications effectively. However, the straightforward integration of LLMs into recommender systems leads to an out-of-corpus issue specific to drugs. We handle it by adapting the LLMs with a novel output layer and a refined tuning loss function. Although LLM-based models exhibit remarkable capabilities, they are plagued by high computational costs during inference, which is impractical for the healthcare sector. To mitigate this, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model. Extensive experiments conducted on two real-world datasets, MIMIC-III and MIMIC-IV, demonstrate that our proposed model not only delivers effective results but also is efficient. To ease the reproducibility of our experiments, we release the implementation code online.
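Research Motivation and Objectives
- Address the weak semantic understanding of existing medication recommendation systems and their limitations on single-visit patients.
- Leverage the input-agnostic nature of LLMs to improve medication recommendation.
- Develop an efficient distillation framework that transfers LLM knowledge to a smaller, inference-friendly model.
- Ensure reproducibility by releasing the LEADER code publicly.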
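Proposed Method
- Design prompt templates that convert electronic health record (EHR) data into natural language for input to the LLM.
- Modify the LLM by adding a classification output layer and a supervised fine-tuning loss (binary cross-entropy, BCE) so that it outputs medication probabilities.
- Apply LoRA-based fine-tuning so that only lightweight parameters are updated.
- Introduce two-stage distillation: (i) train the teacher model LEADER(T); (ii) train the student model LEADER(S) with a BCE loss, a feature-level KD loss, and a profile alignment loss.
- Build three encoders for diagnoses, procedures, and medications; a shared visit encoder; and a final two-layer projection that predicts medications.
- Implement a profile alignment strategy that uses a contrastive loss to align profile embeddings with medication representations.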
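The student objective in step (ii) combines three terms. The following is a minimal sketch in plain Python of how such a combination could look, not the authors' implementation. The MSE form of the feature-level KD term, the in-batch InfoNCE form of the alignment term, the temperature, and the weights `alpha` and `beta` are all assumptions made for illustration; the summary does not specify them. Teacher features are assumed to be already projected to the student's dimension.

```python
import math


def bce_loss(probs, labels):
    """Mean binary cross-entropy over the medication vocabulary for one visit."""
    eps = 1e-12
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def feature_kd_loss(student_hidden, teacher_hidden):
    """Assumed form: mean squared error between student and teacher features."""
    squared = [(s - t) ** 2 for s, t in zip(student_hidden, teacher_hidden)]
    return sum(squared) / len(squared)


def _cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / max(norm, 1e-12)


def alignment_loss(profile_embs, med_embs, temperature=0.1):
    """Assumed form: in-batch InfoNCE, where the i-th profile embedding
    should be closest to the i-th medication representation."""
    n = len(profile_embs)
    total = 0.0
    for i in range(n):
        logits = [_cosine(profile_embs[i], med_embs[j]) / temperature
                  for j in range(n)]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denom - logits[i]
    return total / n


def student_loss(probs, labels, student_hidden, teacher_hidden,
                 profile_embs, med_embs, alpha=0.5, beta=0.1):
    """Weighted sum of the three terms; alpha and beta are hypothetical."""
    return (bce_loss(probs, labels)
            + alpha * feature_kd_loss(student_hidden, teacher_hidden)
            + beta * alignment_loss(profile_embs, med_embs))
```

In this sketch the BCE term supervises the multi-label medication prediction, the KD term pulls the student's hidden representation toward the teacher's, and the alignment term is small only when each profile embedding matches its own medication representation better than the others in the batch.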
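Experimental Results
Research Questions
- RQ1: How does LEADER compare with state-of-the-art medication recommendation models and other LLM-based recommenders?
- RQ2: Does each individual design component of LEADER contribute to the performance gain?
- RQ3: How do knowledge distillation and profile alignment affect LEADER's performance?
- RQ4: Can the student model achieve high efficiency while maintaining accuracy?
Key Findings
- LEADER(T) achieves the highest PRAUC, Jaccard, and F1 scores among the compared baselines on both MIMIC-III and MIMIC-IV, across the overall, multi-visit, and single-visit settings.
- LEADER(S) outperforms several baselines and some LLM-based methods, thanks to feature-level knowledge distillation and profile alignment.
- Ablation studies show that removing KD or alignment degrades performance, confirming the effectiveness of each component.
- Single-visit performance benefits markedly from the LLM's semantic understanding and profile alignment, enabling robust recommendations even without prior prescription history.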
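For readers unfamiliar with the set-based metrics named above, here is a minimal sketch of how Jaccard and F1 are commonly computed for a single visit in medication recommendation, before averaging over visits. It uses the standard definitions and is not taken from the LEADER code; the drug identifiers are hypothetical. PRAUC is omitted because it requires ranked probabilities rather than a predicted set.

```python
def jaccard(predicted, actual):
    """Intersection over union of the predicted and prescribed drug sets."""
    predicted, actual = set(predicted), set(actual)
    union = predicted | actual
    return len(predicted & actual) / len(union) if union else 0.0


def f1(predicted, actual):
    """Harmonic mean of precision and recall over the drug sets."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical visit: two of three predicted drugs were actually prescribed.
predicted = {"drug_a", "drug_b", "drug_c"}
actual = {"drug_b", "drug_c", "drug_d"}
print(jaccard(predicted, actual))  # 2 shared / 4 in the union = 0.5
print(f1(predicted, actual))       # precision = recall = 2/3, so F1 = 2/3
```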
This summary was generated by AI and reviewed by a human editor.