QUICK REVIEW

[Paper Review] Classification of Histopathological Biopsy Images Using Ensemble of Deep Learning Networks

Sara Hosseinzadeh Kassani, Peyman Hosseinzadeh Kassani|arXiv (Cornell University)|Sep 26, 2019

AI in cancer detection | References: 38 | Cited by: 60

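One-Sentence Summary

This paper proposes a three-path ensemble of pre-trained CNNs (VGG19, MobileNet, and DenseNet201) that uses transfer learning for binary classification of breast histopathology images, reporting accuracies between 83.10% and 98.13% on four public datasets.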

ABSTRACT

Breast cancer is one of the leading causes of death across the world in women. Early diagnosis of this type of cancer is critical for treatment and patient care. Computer-aided detection (CAD) systems using convolutional neural networks (CNN) could assist in the classification of abnormalities. In this study, we proposed an ensemble deep learning-based approach for automatic binary classification of breast histology images. The proposed ensemble model adapts three pre-trained CNNs, namely VGG19, MobileNet, and DenseNet. The ensemble model is used for the feature representation and extraction steps. The extracted features are then fed into a multi-layer perceptron classifier to carry out the classification task. Various pre-processing and CNN tuning techniques such as stain-normalization, data augmentation, hyperparameter tuning, and fine-tuning are used to train the model. The proposed method is validated on four publicly available benchmark datasets, i.e., ICIAR, BreakHis, PatchCamelyon, and Bioimaging. The proposed multi-model ensemble method obtains better predictions than single classifiers and machine learning algorithms with accuracies of 98.13%, 95.00%, 94.64% and 83.10% for BreakHis, ICIAR, PatchCamelyon and Bioimaging datasets, respectively.

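Motivation and Objectives

Address the challenge of heterogeneous histopathology image data drawn from multiple sources.
Develop a robust binary classifier for benign versus malignant breast tissue images without relying on hand-crafted features.
Use transfer learning and data augmentation to improve generalization across datasets.
Evaluate the ensemble against single CNNs and traditional machine learning methods on public benchmark datasets.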

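Proposed Method

Three-path ensemble architecture combining VGG19, MobileNetV2, and DenseNet201 for feature extraction.
Transfer learning with CNNs pre-trained on ImageNet, fine-tuned to output two classes.
The final layers are flattened and concatenated into a multi-view feature vector, which is fed into a multi-layer perceptron classifier.
Pre-processing includes stain normalization (Macenko), image normalization, and data augmentation (flipping, rotation, scaling, etc.).
Training details: images resized to 224×224, batch size 32, 1000 epochs, Adam optimizer, dropout 0.5, and a fully connected layer with 256 hidden neurons.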
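The augmentation and fusion steps listed above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the pre-trained backbones are replaced by random stand-in arrays with small, hypothetical shapes, the augmentation covers only flips and 90-degree rotations, and the ReLU hidden activation and softmax output are assumptions. Only the 224×224 input size, the 256 hidden neurons, the 0.5 dropout rate, and the two output classes come from the summary.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible


def augment(image, rng):
    """Randomly flip an H x W x C image and rotate it by a multiple of 90 degrees."""
    if rng.random() < 0.5:
        image = np.fliplr(image)  # horizontal flip
    if rng.random() < 0.5:
        image = np.flipud(image)  # vertical flip
    return np.rot90(image, k=int(rng.integers(0, 4)))  # rotates the first two axes


def fuse_features(branch_outputs):
    """Flatten each branch output per sample and concatenate along the feature axis."""
    flat = [f.reshape(f.shape[0], -1) for f in branch_outputs]
    return np.concatenate(flat, axis=1)


def mlp_head(x, w1, b1, w2, b2, drop_rate=0.5, training=False, rng=None):
    """One hidden layer with dropout, then a two-class softmax."""
    hidden = np.maximum(x @ w1 + b1, 0.0)  # ReLU is an assumption, not from the source
    if training:
        keep = rng.random(hidden.shape) >= drop_rate
        hidden = hidden * keep / (1.0 - drop_rate)  # inverted dropout
    logits = hidden @ w2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


# A random stand-in for one 224x224 RGB patch.
image = rng.random((224, 224, 3))
augmented = augment(image, rng)

# Hypothetical stand-ins for the three backbone outputs (batch of 4).
# Real VGG19 / MobileNetV2 / DenseNet201 feature maps are much larger.
batch = 4
branch_outputs = [
    rng.standard_normal((batch, 2, 2, 8)),
    rng.standard_normal((batch, 2, 2, 16)),
    rng.standard_normal((batch, 2, 2, 24)),
]
fused = fuse_features(branch_outputs)  # shape (4, 192)

hidden_units, num_classes = 256, 2  # 256 hidden neurons, two classes (from the summary)
w1 = rng.standard_normal((fused.shape[1], hidden_units)) * 0.01
b1 = np.zeros(hidden_units)
w2 = rng.standard_normal((hidden_units, num_classes)) * 0.01
b2 = np.zeros(num_classes)

probs = mlp_head(fused, w1, b1, w2, b2)  # inference mode, dropout disabled
print(fused.shape, probs.shape)
```

In a real pipeline the stand-in arrays would be replaced by the outputs of the fine-tuned backbones, and the head's weights would be learned rather than randomly initialized.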

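Experimental Results

Research Questions

RQ1: Does a three-path CNN ensemble improve binary classification accuracy on breast histopathology images over single CNNs and traditional ML methods across multiple datasets?
RQ2: How does transfer learning with stain normalization and data augmentation affect generalization to heterogeneous datasets (BreakHis, ICIAR, PatchCamelyon, Bioimaging)?
RQ3: How do the accuracy and other metrics of the ensemble compare with those of the individual architectures on public benchmarks?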

Key Findings

Dataset	Accuracy	Precision	Recall	F-score
BreakHis	98.13%	98.75%	98.54%	98.64%
PatchCamelyon*	94.64%	95.70%	95.27%	95.50%
ICIAR	95.00%	95.91%	94.00%	94.94%
Bioimaging	83.10%	92.60%	71.42%	80.64%
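As a quick consistency check on the table, the F-score column can be recomputed from the precision and recall columns. The sketch below assumes the reported F-score is the standard F1 (the harmonic mean of precision and recall), which the summary does not state explicitly; the numbers are copied from the table, and the names `reported` and `f_score` are illustrative.

```python
# (precision, recall, reported F-score), all in percent, copied from the table.
reported = {
    "BreakHis": (98.75, 98.54, 98.64),
    "PatchCamelyon": (95.70, 95.27, 95.50),
    "ICIAR": (95.91, 94.00, 94.94),
    "Bioimaging": (92.60, 71.42, 80.64),
}


def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for every dataset and measure the gap to the reported value.
recomputed = {name: f_score(p, r) for name, (p, r, _) in reported.items()}
max_gap = max(abs(recomputed[name] - reported[name][2]) for name in reported)

for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.2f}, reported {reported[name][2]:.2f}")
print(f"largest gap: {max_gap:.3f} percentage points")
```

Under that assumption the recomputed values agree with the table to within 0.02 percentage points, which is consistent with rounding. The low Bioimaging F-score follows from its 71.42% recall rather than from its precision.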

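On BreakHis, the ensemble achieves 98.13% accuracy, 98.75% precision, 98.54% recall, and a 98.64% F-score.
On PatchCamelyon*, the ensemble achieves 94.64% accuracy, 95.70% precision, 95.27% recall, and a 95.50% F-score.
On ICIAR, the ensemble achieves 95.00% accuracy, 95.91% precision, 94.00% recall, and a 94.94% F-score.
On Bioimaging, the ensemble achieves 83.10% accuracy, 92.60% precision, 71.42% recall, and an 80.64% F-score.
The individual CNNs (VGG19, MobileNetV2, DenseNet201) and other state-of-the-art CNNs generally did not outperform the ensemble on BreakHis, ICIAR, and PatchCamelyon*; Bioimaging remains challenging, with lower overall performance.
Compared with several machine learning models (decision tree, random forest, XGBoost, AdaBoost, Bagging), the ensemble performs better on most datasets, although some methods in the literature achieve higher accuracy on specific datasets (e.g., Pratiher & Chattoraj 2019 on BreakHis).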


This review was generated by AI and reviewed by human editors.