QUICK REVIEW

[Paper Review] Reading Race: AI Recognises Patient's Racial Identity In Medical Images

Imon Banerjee, Ananth Reddy Bhimireddy|arXiv (Cornell University)|Jul 21, 2021

Artificial Intelligence in Healthcare and Education | References: 13 | Cited by: 31

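One-Sentence Summary

The paper shows that deep learning models can predict patients' self-reported race from medical images across multiple imaging modalities, that this result holds under external validation, and that it creates a risk for model deployments in radiology.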

ABSTRACT

Background: In medical imaging, prior studies have demonstrated disparate AI performance by race, yet there is no known correlation for race on medical imaging that would be obvious to the human expert interpreting the images. Methods: Using private and public datasets we evaluate: A) performance quantification of deep learning models to detect race from medical images, including the ability of these models to generalize to external environments and across multiple imaging modalities, B) assessment of possible confounding anatomic and phenotype population features, such as disease distribution and body habitus as predictors of race, and C) investigation into the underlying mechanism by which AI models can recognize race. Findings: Standard deep learning models can be trained to predict race from medical images with high performance across multiple imaging modalities. Our findings hold under external validation conditions, as well as when models are optimized to perform clinically motivated tasks. We demonstrate this detection is not due to trivial proxies or imaging-related surrogate covariates for race, such as underlying disease distribution. Finally, we show that performance persists over all anatomical regions and frequency spectrum of the images suggesting that mitigation efforts will be challenging and demand further study. Interpretation: We emphasize that model ability to predict self-reported race is itself not the issue of importance. However, our findings that AI can trivially predict self-reported race -- even from corrupted, cropped, and noised medical images -- in a setting where clinical experts cannot, creates an enormous risk for all model deployments in medical imaging: if an AI model secretly used its knowledge of self-reported race to misclassify all Black patients, radiologists would not be able to tell using the same data the model has access to.

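Motivation and Objectives

Evaluate racial bias in AI performance on medical imaging.
Quantify how well deep learning models detect race from medical images.
Assess generalization to external datasets and across imaging modalities.
Examine whether race detection depends on confounding anatomic or phenotypic features.
Investigate the underlying mechanism by which AI models recognize race.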

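Proposed Approach

Train standard deep learning models on private and public medical imaging datasets to predict race.
Evaluate performance across multiple imaging modalities and on external datasets.
Test confounders such as disease distribution and body habitus as predictors of race.
Evaluate race prediction when models are optimized for clinically motivated tasks.
Investigate the mechanism by which AI models recognize race, checking that it is not explained by trivial proxy variables.
Evaluate robustness to corrupted, cropped, and noised images, across anatomical regions and across the image frequency spectrum.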
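The frequency-spectrum robustness check in the last item can be pictured with a small sketch. This is not the authors' code: it assumes a simple radial low-pass/high-pass mask in the Fourier domain, and the names `radial_frequency_mask`, `frequency_filter`, and the `cutoff` value are hypothetical choices for illustration. A random array stands in for a grayscale image.

```python
import numpy as np

def radial_frequency_mask(shape, cutoff):
    """Boolean mask, True within `cutoff` pixels of the centred zero frequency."""
    rows, cols = shape
    y, x = np.ogrid[:rows, :cols]
    # After fftshift, the zero-frequency term sits at index (rows // 2, cols // 2).
    dist = np.hypot(y - rows // 2, x - cols // 2)
    return dist <= cutoff

def frequency_filter(image, cutoff, keep="low"):
    """Keep only the low or the high spatial frequencies of a 2-D image."""
    if keep not in ("low", "high"):
        raise ValueError("keep must be 'low' or 'high'")
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    mask = radial_frequency_mask(image.shape, cutoff)
    if keep == "high":
        mask = ~mask
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return filtered.real

# Assumption: a random 64x64 array stands in for a real radiograph.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
low = frequency_filter(image, cutoff=8, keep="low")    # blurred, coarse structure
high = frequency_filter(image, cutoff=8, keep="high")  # edges and fine texture
```

Under these assumptions, a robustness experiment would feed `low` or `high` versions of each image to a trained classifier for a range of `cutoff` values and track how prediction performance changes. The two filtered images sum back to the original, so each cutoff splits the image information cleanly into two bands.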

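Experimental Results

Research Questions

RQ1: Can deep learning models accurately predict patients' self-reported race from medical images across modalities?
RQ2: Do the models generalize to external environments beyond the training data?
RQ3: Are model predictions driven by confounding anatomic or phenotypic features rather than by race itself?
RQ4: Beyond imaging proxies, what mechanisms allow AI to recognize race from medical images?
RQ5: How persistent is the ability to recognize race across anatomical regions and the image frequency spectrum?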

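Key Findings

Standard deep learning models can be trained to predict race from medical images with high performance across multiple imaging modalities.
The findings hold under external validation and when models are optimized for clinically motivated tasks.
Detection is not due to trivial proxies or imaging-related surrogate covariates for race, such as underlying disease distribution.
Performance persists across all anatomical regions and across the image frequency spectrum, suggesting that mitigation will be challenging.
The study highlights a major deployment risk: if a model secretly used race information, radiologists could not detect this from the same data the model has access to.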

This review was generated by AI and checked by a human editor.