QUICK REVIEW

[Paper Review] FADL: Federated-Autonomous Deep Learning for Distributed Electronic Health Record

Dianbo Liu, Timothy A. Miller|arXiv (Cornell University)|Nov 28, 2018

Machine Learning in Healthcare | 16 references | 64 citations

TL;DR

The paper introduces Federated-Autonomous Deep Learning (FADL), a distributed learning method that balances global and local training to predict ICU mortality without moving patient data.

ABSTRACT

Electronic health record (EHR) data is collected by individual institutions and often stored across locations in silos. Getting access to these data is difficult and slow due to security, privacy, regulatory, and operational issues. We show, using ICU data from 58 different hospitals, that machine learning models to predict patient mortality can be trained efficiently without moving health data out of their silos using a distributed machine learning strategy. We propose a new method, called Federated-Autonomous Deep Learning (FADL), that trains part of the model using all data sources in a distributed manner and other parts using data from specific data sources. We observed that FADL outperforms the traditional federated learning strategy and conclude that the balance between global and local training is an important factor to consider when designing distributed machine learning methods, especially in healthcare.

Motivation & Objective

Enable learning from EHR data stored in silos across institutions without transferring the data.
Develop a distributed learning framework that preserves privacy while leveraging global and local information.
Evaluate FADL against centralized and traditional federated learning on multi-hospital ICU mortality data.

Proposed method

Use eICU data from 58 hospitals with 1,264,89 admissions and 1,400 potential drug features as binary inputs.
Implement a three-layer neural network (500, 100, 1) with ReLU hidden layers and sigmoid output, trained with cross-entropy loss and L2 regularization.
Compare three training strategies: centralized learning, original federated learning, and Federated-Autonomous Deep Learning (FADL).
In FADL, train the first half of the network globally across all sources, then locally train the second half for each data source to specialize per hospital.
In the federated learning baseline, train identically parameterized models locally at each hospital and aggregate their parameters, weighted by sample size.
Evaluate models using AUCROC and AUCPR on a test set derived from the distributed data.
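
The difference between the federated baseline and FADL comes down to which parameters are shared across hospitals. The following is a minimal Python sketch of the aggregation step only, under stated assumptions: the layer names (hidden1, hidden2, output), the flattened-list parameter layout, the toy values, and the choice of hidden1 as the globally shared part are hypothetical and not taken from the paper. Local gradient training is omitted.

```python
from typing import Dict, List

# Hypothetical layout: layer name -> flattened list of weights.
Params = Dict[str, List[float]]


def weighted_average(models: List[Params], sizes: List[int],
                     layers: List[str]) -> Params:
    """Average the named layers across sites, weighting each site by sample count."""
    total = sum(sizes)
    averaged: Params = {}
    for name in layers:
        length = len(models[0][name])
        averaged[name] = [
            sum(m[name][i] * n for m, n in zip(models, sizes)) / total
            for i in range(length)
        ]
    return averaged


def federated_round(models: List[Params], sizes: List[int]) -> List[Params]:
    """Baseline: every layer is averaged, so all sites receive the same model."""
    shared = weighted_average(models, sizes, list(models[0].keys()))
    return [{k: list(v) for k, v in shared.items()} for _ in models]


def fadl_round(models: List[Params], sizes: List[int],
               global_layers: List[str]) -> List[Params]:
    """FADL-style: only `global_layers` are averaged; the rest stay site-specific."""
    shared = weighted_average(models, sizes, global_layers)
    result = []
    for m in models:
        merged = {k: list(v) for k, v in m.items()}  # keep local layers as they are
        merged.update({k: list(v) for k, v in shared.items()})
        result.append(merged)
    return result


# Two toy "hospitals" with different sample sizes (illustrative values only).
site_a = {"hidden1": [1.0, 2.0], "hidden2": [0.5], "output": [0.25]}
site_b = {"hidden1": [3.0, 4.0], "hidden2": [1.5], "output": [0.75]}
sizes = [100, 300]

baseline = federated_round([site_a, site_b], sizes)
fadl = fadl_round([site_a, site_b], sizes, global_layers=["hidden1"])

print(baseline[0])  # identical at every site
print(fadl[0])      # shared hidden1, site-specific hidden2 and output
print(fadl[1])
```

In this sketch the baseline leaves every hospital with the same parameters, while the FADL-style round shares only the early layer and lets each hospital keep its own later layers. Where exactly to draw that boundary is a design choice that this sketch leaves as a parameter.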

Experimental results

Research questions

RQ1: Does FADL improve predictive performance over traditional federated learning while maintaining privacy by not moving data?
RQ2: Can balancing global (shared) and local (hospital-specific) training yield similar or better accuracy compared to centralized models?
RQ3: How does FADL performance compare to centralized training and standard federated learning on ICU mortality prediction?

Key findings

Centralized learning achieved AUCROC 0.79 and AUCPR 0.21.
Original federated learning achieved AUCROC 0.75 and AUCPR 0.16.
FADL achieved AUCROC 0.79 and AUCPR 0.23.
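FADL matches centralized learning on AUCROC (0.79 vs. 0.79) and exceeds it on AUCPR (0.23 vs. 0.21), while outperforming original federated learning on both metrics (AUCROC 0.75, AUCPR 0.16).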
The balance between global and local training is identified as a key factor in distributed healthcare ML design.
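
To make the two reported metrics concrete, here is a small Python sketch of how AUCROC and AUCPR can be computed from labels and predicted scores. It is an illustration under assumptions, not the paper's evaluation code: AUCPR is approximated by average precision (the paper may have used a different interpolation), scores are assumed to be distinct for that metric, and the labels and scores are made-up toy values.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (assumes distinct scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos


# Toy example: 2 positives out of 8 cases (illustrative values only).
labels = [0, 0, 1, 0, 1, 0, 0, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.3, 0.05]

print(auc_roc(labels, scores))            # 0.75
print(average_precision(labels, scores))  # 0.5
```

A random ranking scores about 0.5 on AUCROC, whereas its expected AUCPR is roughly the fraction of positive cases. That is why AUCPR values can be much lower than AUCROC values on imbalanced outcomes, and why the two metrics are usually read together.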

This review was created by AI and reviewed by human editors.