[Paper Interpretation] Deep Convolutional Neural Networks to Diagnose COVID-19 and other Pneumonia Diseases from Posteroanterior Chest X-Rays
This paper evaluates several CNN architectures (VGG16/19, InceptionResNetV2, InceptionV3, Xception) trained on PA chest X-rays to classify images as COVID-19, No Finding, or Other Pneumonia; VGG16 performs best.
The article explores different deep convolutional neural network architectures trained and tested on posteroanterior chest X-rays of 327 patients who are healthy (152 patients), diagnosed with COVID-19 (125), or diagnosed with other types of pneumonia (48). In particular, the paper looks at VGG16 and VGG19, InceptionResNetV2 and InceptionV3, and Xception, each followed by a flat multi-layer perceptron and a final 30% dropout. The best-performing network is VGG16 with a final $30$% dropout trained over 3 classes (COVID-19, No Finding, Other Pneumonia). It has an internal cross-validated accuracy of $93.9(\pm3.4)$%, a COVID-19 sensitivity of $87.7(-1.9,+2)$%, and a No Finding sensitivity of $96.8(\pm0.8)$%. The respective external cross-validated values are $84.1(\pm13.5)$%, $87.7(-1.9,+2)$%, and $96.8(\pm0.8)$%. The models were optimized with Adam at a learning rate of 1e-4 and a categorical cross-entropy loss. It is hoped that, once this research is put into practice in hospitals, healthcare professionals will be able, in the medium to long term, to use machine learning tools to diagnose possible pneumonia and, if it is detected, to determine whether it is linked to a COVID-19 infection. This would allow new COVID-19 clusters to be detected after the end of possible "stop-and-go" lockdowns, which are expected until a vaccine is found and widely available. Furthermore, in the short term, it is hoped that practitioners can compare the diagnosis from the deep convolutional neural networks with available RT-PCR test results and, if the two disagree, perform a Computed Tomography scan, which is more accurate in showing COVID-19 pneumonia.
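Research Motivation and Objectives
- Assess the feasibility of using deep CNNs to diagnose COVID-19 and other pneumonia from PA chest X-rays.
- Compare the performance of several architectures (VGG16, VGG19, InceptionResNetV2, InceptionV3, Xception) on a three-class task.
- Report cross-validated performance metrics and discuss use in real hospitals and its limitations.
Proposed Method
- Apply transfer learning by initializing the networks with ImageNet pre-trained weights.
- Train all models for 200 epochs using the Adam optimizer with a learning rate of 0.0001.
- Apply a final 30% dropout and a categorical cross-entropy loss.
- Use stratified 5-fold cross-validation to report internal and external test performance.
- Preprocess the PA chest X-rays by resizing them to 182x182 and applying data augmentation (rotation, translation, zoom, flipping, channel shift).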
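The stratified 5-fold split listed above can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function name `stratified_k_fold`, the random seed, and the round-robin dealing strategy are assumptions made for illustration. Only the per-class counts (152, 125, 48) come from the abstract.

```python
import random
from collections import Counter, defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Split sample indices into k folds that each keep roughly the class mix of `labels`."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a near-equal share of it.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Per-class counts as quoted in the abstract; the label order is arbitrary.
labels = ["No Finding"] * 152 + ["COVID-19"] * 125 + ["Other Pneumonia"] * 48
folds = stratified_k_fold(labels, k=5)

for fold_number, test_indices in enumerate(folds, start=1):
    held_out = set(test_indices)
    # Everything outside the held-out fold is used for training in this round.
    train_indices = [i for i in range(len(labels)) if i not in held_out]
    test_mix = Counter(labels[i] for i in test_indices)
    print(fold_number, len(train_indices), len(test_indices), dict(test_mix))
```

Each fold serves once as the held-out set while the other four are used for training, so every image is evaluated exactly once. Stratifying matters here because the Other Pneumonia class is small, and an unstratified split could leave a fold with very few of its examples.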
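Experimental Results
Research Questions
- RQ1: Can CNNs trained on PA chest X-rays distinguish COVID-19 from No Finding and other pneumonia?
- RQ2: Which CNN architecture achieves the best cross-validated performance on this three-class task?
- RQ3: What sensitivity does each architecture reach when detecting COVID-19 and No Finding?
- RQ4: How does external test performance compare with internal cross-validation?
- RQ5: What are the limitations of using PA X-rays and these models in a hospital setting?
Main Findings
- VGG16 achieved the best overall performance among the evaluated models on both the internal and external datasets.
- VGG16 internal 5-fold cross-validation accuracy: 93.9% (±3.4); COVID-19 recall: 87.7% (-1.9, +2.0); No Finding recall: 96.8% (±0.8).
- VGG16 external test accuracy: 84.1% (±13.5); COVID-19 recall: 87.7% (-1.9, +2.0); No Finding recall: 96.8% (±0.8).
- InceptionResNetV2 and InceptionV3 showed moderate performance, with lower COVID-19 sensitivity and larger variance; InceptionResNetV2's COVID-19 recall was roughly 70.8–71.0% (internal/external).
- VGG19 performed similarly to VGG16, but with slightly smaller confidence gains and at higher cost.
- On this dataset, Xception underperformed the VGG-based models.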
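The accuracy and recall (sensitivity) figures above are per-fold metrics aggregated over the cross-validation folds. The sketch below shows one way such numbers could be computed from confusion matrices. The three matrices are hypothetical values invented for illustration, not the paper's data, and the symmetric mean ± sample standard deviation summary is an assumption: the abstract reports an asymmetric interval for COVID-19 sensitivity, so the authors may have used a different dispersion measure.

```python
from statistics import mean, stdev

CLASSES = ["COVID-19", "No Finding", "Other Pneumonia"]


def accuracy(matrix):
    """Fraction of all samples on the diagonal (rows = true class, columns = predicted)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def sensitivity(matrix, class_index):
    """Recall for one class: true positives divided by all samples truly in that class."""
    row = matrix[class_index]
    return row[class_index] / sum(row)


def summarize(values):
    """Mean and sample standard deviation across folds."""
    return mean(values), stdev(values)


# Hypothetical per-fold confusion matrices, NOT results from the paper.
fold_matrices = [
    [[22, 1, 2], [0, 30, 1], [1, 1, 8]],
    [[21, 2, 2], [1, 29, 0], [2, 0, 7]],
    [[23, 1, 1], [0, 30, 0], [1, 1, 8]],
]

acc_mean, acc_std = summarize([accuracy(m) for m in fold_matrices])
print(f"accuracy: {100 * acc_mean:.1f}% (±{100 * acc_std:.1f})")

for class_index, name in enumerate(CLASSES):
    values = [sensitivity(m, class_index) for m in fold_matrices]
    class_mean, class_std = summarize(values)
    print(f"{name} sensitivity: {100 * class_mean:.1f}% (±{100 * class_std:.1f})")

covid_mean, covid_std = summarize([sensitivity(m, 0) for m in fold_matrices])
```

Reporting per-class sensitivity alongside accuracy is useful on an imbalanced dataset like this one, since a model can score well on accuracy while still missing a meaningful share of COVID-19 cases.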
This interpretation was generated by AI and reviewed by a human editor.