QUICK REVIEW

[Paper Review] Automatic Detection of Coronavirus Disease (COVID-19) in X-ray and CT Images: A Machine Learning-Based Approach

Sara Hosseinzadeh Kassani, Peyman Hosseinzadeh Kassasni|arXiv (Cornell University)|Apr 22, 2020

COVID-19 diagnosis using AI|References: 41|Cited by: 62

ONE-SENTENCE SUMMARY

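This paper evaluates a set of transfer-learning-based deep CNN feature extractors, combined with traditional ML classifiers, for detecting COVID-19 in chest X-ray and CT images, reaching up to 99% accuracy on a public dataset.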

ABSTRACT

The newly identified Coronavirus pneumonia, subsequently termed COVID-19, is highly transmittable and pathogenic with no clinically approved antiviral drug or vaccine available for treatment. The most common symptoms of COVID-19 are dry cough, sore throat, and fever. Symptoms can progress to a severe form of pneumonia with critical complications, including septic shock, pulmonary edema, acute respiratory distress syndrome and multi-organ failure. While medical imaging is not currently recommended in Canada for primary diagnosis of COVID-19, computer-aided diagnosis systems could assist in the early detection of COVID-19 abnormalities and help to monitor the progression of the disease, potentially reducing mortality rates. In this study, we compare popular deep learning-based feature extraction frameworks for automatic COVID-19 classification. To obtain the most accurate feature, which is an essential component of learning, MobileNet, DenseNet, Xception, ResNet, InceptionV3, InceptionResNetV2, VGGNet, NASNet were chosen amongst a pool of deep convolutional neural networks. The extracted features were then fed into several machine learning classifiers to classify subjects as either a case of COVID-19 or a control. This approach avoided task-specific data pre-processing methods to support a better generalization ability for unseen data. The performance of the proposed method was validated on a publicly available COVID-19 dataset of chest X-ray and CT images. The DenseNet121 feature extractor with Bagging tree classifier achieved the best performance with 99% classification accuracy. The second-best learner was a hybrid of a ResNet50 feature extractor trained by LightGBM with an accuracy of 98%.

MOTIVATION AND OBJECTIVES

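Demonstrate a general-purpose approach to feature extraction that needs no hand-crafted features, using deep CNNs to classify COVID-19 in X-ray and CT images.
Assess whether transferring knowledge from CNNs pretrained on large-scale data improves detection when COVID-19 data is limited.
Avoid extensive pre-processing to promote generalization across heterogeneous imaging sources.
Identify the combination of CNN feature extractor and ML classifier that maximizes accuracy and efficiency.
Provide a web-based computer-aided diagnosis tool for rapid screening of suspected cases.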

PROPOSED METHOD

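Use pretrained CNN architectures (MobileNet, DenseNet, Xception, InceptionV3, InceptionResNetV2, ResNet, VGGNet, NASNet) as feature extractors (transfer learning), encoding each image as a low-dimensional feature vector.
Train several traditional ML classifiers (decision tree, random forest, XGBoost, AdaBoost, Bagging, LightGBM) on the CNN-derived features.
Evaluate with 10-fold cross-validation on a public dataset of chest X-ray and CT images; weights are initialized from ImageNet.
Apply a minimal two-step pre-processing: resizing to a uniform size and image normalization (ImageNet mean subtraction and min-max normalization).
Report accuracy, precision, recall, and F1 score to compare the CNN+ML combinations.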
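
The four metrics named in the last step can be computed from the counts of true/false positives and negatives. The following is a minimal sketch in plain Python, not the authors' evaluation code; the function name `binary_metrics` and the example labels (1 = COVID-19, 0 = control) are made up for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("need at least one prediction")
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = COVID-19, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy example there are 3 true positives, 3 true negatives, 1 false positive and 1 false negative, so all four metrics come out to 0.75.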

EXPERIMENTAL RESULTS

Research Questions

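RQ1: How well do feature representations extracted by deep CNNs, combined with ML classifiers, detect COVID-19 versus healthy controls in X-ray and CT images?
RQ2: Which CNN architectures and ML classifiers achieve the highest accuracy and reliability on this task?
RQ3: Do minimal pre-processing and transfer learning provide robust generalization across heterogeneous imaging sources?
RQ4: For practical CAD deployment, what is the computation-time trade-off between feature extraction and classifier training?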

Key Findings

DT	RF	XGBoost	AdaBoost	Bagging	LightGBM
83.00 ± 0.26	93.00 ± 0.23	95.00 ± 0.16	80.00 ± 0.17	96.00 ± 0.11	82.00 ± 0.28
92.00 ± 0.15	90.00 ± 0.21	94.00 ± 0.16	92.00 ± 0.19	99.00 ± 0.07	96.00 ± 0.11
84.00 ± 0.26	90.00 ± 0.24	90.00 ± 0.18	87.00 ± 0.25	96.00 ± 0.11	87.00 ± 0.17
95.00 ± 0.17	90.00 ± 0.19	96.00 ± 0.11	93.00 ± 0.20	96.00 ± 0.11	96.00 ± 0.11
82.00 ± 0.22	84.00 ± 0.29	88.00 ± 0.15	80.00 ± 0.12	95.00 ± 0.12	84.00 ± 0.16
84.00 ± 0.31	93.00 ± 0.16	93.00 ± 0.19	87.00 ± 0.33	94.00 ± 0.12	88.00 ± 0.21
89.00 ± 0.17	90.00 ± 0.15	93.00 ± 0.16	94.00 ± 0.12	93.00 ± 0.16	98.00 ± 0.09
93.00 ± 0.12	92.00 ± 0.16	93.00 ± 0.16	94.00 ± 0.17	91.00 ± 0.22	93.00 ± 0.20
90.00 ± 0.19	91.00 ± 0.19	88.00 ± 0.19	90.00 ± 0.19	90.00 ± 0.19	85.00 ± 0.19
82.00 ± 0.23	88.00 ± 0.19	89.00 ± 0.17	81.00 ± 0.23	93.00 ± 0.19	82.00 ± 0.26
87.00 ± 0.17	88.00 ± 0.22	94.00 ± 0.19	87.00 ± 0.17	93.00 ± 0.19	89.00 ± 0.17
86.00 ± 0.26	?	?	?	?	?
87.00 ± 0.12	96.00 ± 0.11	92.00 ± 0.19	90.00 ± 0.18	95.00 ± 0.12	88.00 ± 0.10
79.00 ± 0.32	89.00 ± 0.24	89.00 ± 0.28	76.00 ± 0.32	95.00 ± 0.12	78.00 ± 0.26
90.00 ± 0.27	86.00 ± 0.26	93.00 ± 0.16	89.00 ± 0.20	96.00 ± 0.11	88.00 ± 0.28
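
Each table entry has the form "score ± spread". Given the 10-fold cross-validation described in the method, one plausible reading is the mean and standard deviation over per-fold scores. That reading is an assumption, as is the choice of population standard deviation below. A minimal sketch with hypothetical helper names (`k_fold_indices`, `summarize`) and made-up fold scores:

```python
import statistics


def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous test folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        test_idx = list(range(start, start + size))
        train_idx = [j for j in range(n_samples)
                     if j < start or j >= start + size]
        folds.append((train_idx, test_idx))
        start += size
    return folds


def summarize(scores):
    """Format per-fold scores as 'mean ± std' (population std, an assumption)."""
    mean = statistics.mean(scores)
    std = statistics.pstdev(scores)
    return f"{mean:.2f} ± {std:.2f}"


folds = k_fold_indices(25, k=10)
# Made-up per-fold accuracies, purely for illustration.
fold_scores = [1.0, 1.0, 0.9, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(summarize(fold_scores))  # 0.99 ± 0.03
```

In practice the samples would be shuffled (and usually stratified by class) before splitting. The contiguous split here only keeps the sketch short.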

The best result uses DenseNet121 features with the Bagging classifier, at 99.00% (±0.09) accuracy.
A close second is ResNet50 features paired with LightGBM, at 98.00% (±0.09) accuracy.
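With the Bagging classifier, MobileNet and InceptionV3 features achieved the highest precision, recall, and F1 score (all 99.00%).
Overall, deep CNN features (DenseNet121/DenseNet201/MobileNet/Xception/InceptionV3) paired with Bagging or XGBoost classifiers outperform several other CNN+ML combinations on the dataset used.
Feature-extraction and training times indicate that the approach is faster than training a very deep CNN from scratch, which suggests near-real-time inference may be achievable for CAD systems.
A web-based detection tool has been implemented to simulate a clinical screening workflow, although clinical validation is still required.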
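
Bagging, the classifier behind the top results above, trains several base learners on bootstrap resamples of the training set and combines their predictions by majority vote. The sketch below illustrates that idea in plain Python with one-feature threshold stumps as base learners. It is not the authors' implementation: the base learner, the number of estimators, and the two-dimensional toy vectors standing in for CNN features are all assumptions.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Pick the (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        # Sentinel below the minimum allows a constant prediction;
        # the rest are midpoints between consecutive distinct values.
        thresholds = [values[0] - 1.0] + [
            (a + b) / 2 for a, b in zip(values, values[1:])
        ]
        for thr in thresholds:
            for pol in (1, -1):
                errors = sum(
                    (1 if pol * (row[f] - thr) > 0 else 0) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, thr, pol)
    return best[1:]


def predict_stump(stump, row):
    f, thr, pol = stump
    return 1 if pol * (row[f] - thr) > 0 else 0


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


# Toy feature vectors (stand-ins for CNN features); 1 = COVID-19, 0 = control.
X = [[0.10, 0.5], [0.15, 0.2], [0.20, 0.9], [0.25, 0.4], [0.30, 0.7],
     [0.70, 0.3], [0.75, 0.8], [0.80, 0.1], [0.85, 0.6], [0.90, 0.5]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
model = fit_bagging(X, y)
print([predict_bagging(model, row) for row in X])
```

The first feature separates the two classes cleanly in this toy data, so the ensemble is expected to recover the training labels. Real CNN feature vectors are far higher-dimensional and would normally be paired with full decision trees.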

This review was generated by AI and reviewed by a human editor.