QUICK REVIEW

[Paper Review] Explainable artificial intelligence model to predict acute critical illness from electronic health records

Simon Meyer Lauritsen, Mads Ruben Burgdorff Kristensen|arXiv (Cornell University)|Dec 3, 2019

Machine Learning in Healthcare | References: 55 | Citations: 23

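One-Sentence Summary

This paper presents xAI-EWS, an explainable AI model that predicts acute critical illness from electronic health records (EHRs), using a temporal convolutional network (TCN) for prediction and deep Taylor decomposition (DTD) for explanation. The system achieves high predictive performance (AUROC of up to 0.91, for acute kidney injury, AKI) and gives clinicians transparent, per-instance explanations that identify the EHR variables driving each prediction, which is intended to increase trust and clinical utility.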

ABSTRACT

We developed an explainable artificial intelligence (AI) early warning score (xAI-EWS) system for early detection of acute critical illness. While maintaining a high predictive performance, our system explains to the clinician on which relevant electronic health records (EHRs) data the prediction is grounded. Acute critical illness is often preceded by deterioration of routinely measured clinical parameters, e.g., blood pressure and heart rate. Early clinical prediction is typically based on manually calculated screening metrics that simply weigh these parameters, such as Early Warning Scores (EWS). The predictive performance of EWSs yields a tradeoff between sensitivity and specificity that can lead to negative outcomes for the patient. Previous work on EHR-trained AI systems offers promising results with high levels of predictive performance in relation to the early, real-time prediction of acute critical illness. However, without insight into the complex decisions by such a system, clinical translation is hindered. In this letter, we present our xAI-EWS system, which potentiates clinical translation by accompanying a prediction with information on the EHR data explaining it.

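Motivation and Objectives

Develop an explainable AI system that predicts acute critical illness earlier and more accurately than traditional early warning scores.
Address clinicians' distrust of black-box AI models by providing interpretable, instance-level explanations of model predictions.
Use deep learning to integrate time-series EHR data effectively while preserving explainability, in support of real-time clinical decision-making.
Validate the model on three acute critical illnesses, sepsis, acute kidney injury (AKI), and acute lung injury (ALI), using real-world Danish EHR data.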

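Proposed Method

The xAI-EWS system uses a temporal convolutional network (TCN) to process sequential EHR data and outputs the risk of acute critical illness as a probability between 0 and 1.
Deep Taylor decomposition (DTD) is applied to explain the TCN's predictions by assigning relevance scores to input EHR variables such as blood pressure and heart rate.
The model is trained on a multicenter Danish EHR dataset (2012–2017) covering 163,050 admissions of 66,288 unique patients.
Model evaluation uses five-fold cross-validation, with AUROC and AUPRC as performance metrics.
Emergency medicine specialists manually inspected the explanations to verify their clinical relevance and plausibility.
The baselines are MEWS, SOFA, and GB-Vital, which are compared with xAI-EWS on predictive performance and explainability.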
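The TCN step described above can be illustrated with a minimal sketch of its core building block, a causal dilated 1D convolution. This is not the authors' implementation: the function names, kernel weights, dilations, and the single-channel input are all assumptions chosen for illustration, and a real TCN would use learned multi-channel filters, residual connections, and a deep learning framework. The point of the sketch is only that the output at time step t depends on inputs at or before t, and that the final activation can be squashed into a 0–1 risk score.

```python
import math
from typing import List, Sequence


def causal_dilated_conv1d(x: Sequence[float], kernel: Sequence[float],
                          dilation: int = 1) -> List[float]:
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Indices before the start of the series are treated as zero padding,
    so y[t] never looks at values later than x[t].
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def relu(values: Sequence[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def tiny_tcn(x: Sequence[float], kernels: Sequence[Sequence[float]],
             dilations: Sequence[int]) -> List[float]:
    """Stack causal dilated convolutions, each followed by a ReLU."""
    h = list(x)
    for kernel, d in zip(kernels, dilations):
        h = relu(causal_dilated_conv1d(h, kernel, d))
    return h


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of past time steps (including the current one) seen by the top layer."""
    return 1 + (kernel_size - 1) * sum(dilations)


def risk_from_last_step(hidden: Sequence[float], weight: float = 1.0,
                        bias: float = 0.0) -> float:
    """Map the most recent hidden activation to a 0-1 score with a sigmoid."""
    z = weight * hidden[-1] + bias
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Hypothetical, already-normalized measurements of one vital sign over time.
    series = [1.0, 2.0, 3.0, 4.0, 5.0]
    # Made-up weights; a trained model would learn these.
    kernels = [[0.5, 0.5], [1.0, -0.5], [0.25, 0.75]]
    dilations = [1, 2, 4]
    hidden = tiny_tcn(series, kernels, dilations)
    print("receptive field:", receptive_field(2, dilations))
    print("risk score:", risk_from_last_step(hidden))
```

Doubling the dilation at each layer lets the receptive field grow quickly with depth while keeping every layer causal, which is what makes this family of models suitable for real-time scoring on a growing patient record.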

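Experimental Results

Research Questions

RQ1: Can an explainable AI model outperform traditional early warning scores in predicting acute critical illness from EHR data?
RQ2: To what extent can deep Taylor decomposition provide clinically meaningful, interpretable explanations for complex deep learning predictions in critical care?
RQ3: How does the xAI-EWS system perform across multiple acute critical illnesses (sepsis, AKI, and ALI) in a real-world EHR setting?
RQ4: Does combining temporal modeling with explainable AI increase clinical trust in, and adoption of, AI for real-time patient monitoring?
RQ5: What trade-off exists between predictive performance and explainability in EHR-based prediction of critical illness?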

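Key Findings

The xAI-EWS model reaches an AUROC of 0.91 for predicting acute kidney injury (AKI).
For sepsis prediction, the model reaches an AUROC of 0.88, markedly outperforming baselines such as MEWS and SOFA.
The GB-Vital baseline has lower predictive performance than xAI-EWS, particularly for early detection of critical illness.
Explanations validated by clinicians show that the model correctly picks up key physiological trends, such as falling blood pressure and rising heart rate, as predictors.
Applying DTD yields instance-specific attribution of each prediction to the relevant EHR variables, improving clinical interpretability.
The model maintains high performance across all three conditions (sepsis, AKI, ALI), supporting its generalization across multiple acute critical illnesses.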
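The instance-level attribution mentioned in the findings can be sketched with one relevance propagation rule from the deep Taylor decomposition framework, the z+ rule, which applies to layers whose inputs are non-negative (for example, ReLU activations). This review does not say which propagation rules the authors used at each layer, so the choice of the z+ rule, the single dense layer, the feature names, and all numbers below are assumptions for illustration only. The rule redistributes each output neuron's relevance to the inputs in proportion to their positive contributions, so total relevance is conserved.

```python
from typing import List, Sequence, Tuple


def zplus_relevance(x: Sequence[float],
                    weights: Sequence[Sequence[float]],
                    relevance_out: Sequence[float]) -> List[float]:
    """Propagate relevance through one dense layer with the z+ rule.

    x:             non-negative input activations, length n_in
    weights:       weights[i][j] connects input i to output j
    relevance_out: relevance assigned to each output neuron, length n_out

    R_i = sum_j (x_i * w_ij^+ / sum_i' x_i' * w_i'j^+) * R_j,
    where w^+ = max(w, 0).
    """
    n_in = len(x)
    relevance_in = [0.0] * n_in
    for j, r_j in enumerate(relevance_out):
        contrib = [x[i] * max(weights[i][j], 0.0) for i in range(n_in)]
        total = sum(contrib)
        if total == 0.0:
            # No positive contribution reaches this neuron; nothing to redistribute.
            continue
        for i in range(n_in):
            relevance_in[i] += contrib[i] / total * r_j
    return relevance_in


def rank_features(names: Sequence[str],
                  relevance: Sequence[float]) -> List[Tuple[str, float]]:
    """Sort input variables from most to least relevant."""
    return sorted(zip(names, relevance), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical feature names and made-up, non-negative input values.
    names = ["systolic_bp", "heart_rate", "temperature"]
    x = [2.0, 1.0, 0.0]
    # Made-up weights for a layer with three inputs and two outputs.
    weights = [[1.0, -1.0],
               [1.0, 2.0],
               [3.0, 1.0]]
    # Relevance arriving from the layer above (made up, sums to 1).
    relevance_out = [0.6, 0.4]
    relevance_in = zplus_relevance(x, weights, relevance_out)
    for name, score in rank_features(names, relevance_in):
        print(name, round(score, 3))
```

In a full model this step would be repeated layer by layer from the output back to the input time series, which gives a relevance score for each variable at each time step of a single patient's record.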

This review was generated by AI and reviewed by human editors.