[Paper Review] FADL: Federated-Autonomous Deep Learning for Distributed Electronic Health Record

Dianbo Liu, Timothy A. Miller|arXiv (Cornell University)|Nov 28, 2018
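Machine Learning in Healthcare | References: 16 | Cited by: 64
One-Sentence Summary

This paper introduces Federated-Autonomous Deep Learning (FADL), a distributed learning method that balances global and local training to predict ICU mortality without moving patient data.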

ABSTRACT

Electronic health record (EHR) data are collected by individual institutions and often stored in silos across locations. Getting access to these data is difficult and slow due to security, privacy, regulatory, and operational issues. We show, using ICU data from 58 different hospitals, that machine learning models to predict patient mortality can be trained efficiently without moving health data out of their silos using a distributed machine learning strategy. We propose a new method, called Federated-Autonomous Deep Learning (FADL), that trains part of the model using all data sources in a distributed manner and other parts using data from specific data sources. We observed that FADL outperforms the traditional federated learning strategy and conclude that the balance between global and local training is an important factor to consider when designing distributed machine learning methods, especially in healthcare.

Research Motivation and Objectives

  • Motivate learning from EHR data stored in silos across institutions without data transfer.
  • Develop a distributed learning framework that preserves privacy while leveraging global and local information.
  • Evaluate FADL against centralized and traditional federated learning on multi-hospital ICU mortality data.

Proposed Method

  • Use ICU data from the eICU database covering 58 hospitals and 126,489 admissions, with 1,400 potential drug features encoded as binary inputs.
  • Implement a three-layer neural network (500, 100, 1) with ReLU hidden layers and sigmoid output, trained with cross-entropy loss and L2 regularization.
  • Compare three training strategies: centralized learning, original federated learning, and Federated-Autonomous Deep Learning (FADL).
  • In FADL, train the first half of the network globally across all sources, then locally train the second half for each data source to specialize per hospital.
  • In the federated learning baseline, train identically parameterized models locally at each hospital and aggregate their parameters weighted by sample size.
  • Evaluate models using AUCROC and AUCPR on a test set derived from the distributed data.
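
The difference between the federated baseline and FADL in the list above comes down to which parameters are shared across hospitals. The following is a minimal sketch of only the parameter-combination step, not the authors' implementation: the dictionary layout, the layer names, and the function names are hypothetical, and the local training that would happen between rounds is omitted.

```python
from typing import Dict, List

# Hypothetical layout: layer name -> flattened list of weights.
Params = Dict[str, List[float]]


def weighted_average(local_params: List[Params], sample_counts: List[int],
                     layers: List[str]) -> Params:
    """Average the named layers across sites, weighting each site by its sample count."""
    total = sum(sample_counts)
    averaged: Params = {}
    for name in layers:
        size = len(local_params[0][name])
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(local_params, sample_counts)) / total
            for i in range(size)
        ]
    return averaged


def federated_round(local_params: List[Params], sample_counts: List[int]) -> List[Params]:
    """Baseline: every layer is averaged, so all sites receive the same model."""
    layers = list(local_params[0].keys())
    global_params = weighted_average(local_params, sample_counts, layers)
    return [{k: list(v) for k, v in global_params.items()} for _ in local_params]


def fadl_style_round(local_params: List[Params], sample_counts: List[int],
                     shared_layers: List[str]) -> List[Params]:
    """FADL-style: only the shared (early) layers are averaged; later layers stay site-specific."""
    shared = weighted_average(local_params, sample_counts, shared_layers)
    return [{**p, **{k: list(v) for k, v in shared.items()}} for p in local_params]


# Toy example with two hospitals holding 1 and 3 samples (made-up numbers).
sites = [
    {"hidden1": [1.0, 2.0], "output": [10.0]},
    {"hidden1": [3.0, 4.0], "output": [20.0]},
]
counts = [1, 3]
fed = federated_round(sites, counts)
fadl = fadl_style_round(sites, counts, shared_layers=["hidden1"])
```

In the baseline every hospital ends up with identical parameters, while in the FADL-style round each hospital keeps its own `output` layer and shares only `hidden1`. Which layers count as the "first half" of the network is an assumption here; the summary does not specify the exact split.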

Experimental Results

Research Questions

  • RQ1: Does FADL improve predictive performance over traditional federated learning while maintaining privacy by not moving data?
  • RQ2: Can balancing global (shared) and local (hospital-specific) training yield similar or better accuracy compared to centralized models?
  • RQ3: How does FADL performance compare to centralized training and standard federated learning on ICU mortality prediction?

Key Findings

  • Centralized learning achieved AUCROC 0.79 and AUCPR 0.21.
  • Original federated learning achieved AUCROC 0.75 and AUCPR 0.16.
  • FADL achieved AUCROC 0.79 and AUCPR 0.23.
  • FADL matches centralized learning on AUCROC (0.79), exceeds it on AUCPR (0.23 vs. 0.21), and outperforms original federated learning on both metrics.
  • The balance between global and local training is identified as a key factor in distributed healthcare ML design.
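
The two metrics reported above can behave very differently when the positive class (here, in-ICU death) is rare. The sketch below shows one common way to compute them in plain Python. It is an illustration only: the summary does not state which AUCPR estimator the authors used, so average precision is an assumption, and the labels and scores are made-up toy values.

```python
from typing import List


def auc_roc(labels: List[int], scores: List[float]) -> float:
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: List[int], scores: List[float]) -> float:
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(labels)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each rank where a positive is retrieved
    return total / n_pos


# Toy data: 2 positives among 8 cases (made-up values, not from the paper).
toy_labels = [0, 0, 1, 0, 1, 0, 0, 0]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.3, 0.05]
toy_auc_roc = auc_roc(toy_labels, toy_scores)
toy_auc_pr = average_precision(toy_labels, toy_scores)
```

For a classifier that ranks at random, AUCROC is about 0.5 while AUCPR is roughly the fraction of positive cases, which is why AUCPR values can look low in absolute terms on imbalanced data even when AUCROC is high.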

This review was generated by AI and reviewed by human editors.