QUICK REVIEW

[Paper Review] High-Resolution Breast Cancer Screening with Multi-View Deep Convolutional Neural Networks

Krzysztof J. Geras, Stacey Wolfson|arXiv (Cornell University)|Mar 21, 2017

AI in cancer detection | References: 3 | Cited by: 195

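One-Sentence Summary

The authors develop a multi-view deep convolutional network (MV-DCN) that processes four high-resolution mammography views to predict BI-RADS categories, demonstrating the importance of training set size and image resolution and achieving competitive performance in a reader study.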

ABSTRACT

Advances in deep learning for natural images have prompted a surge of interest in applying similar techniques to medical images. The majority of the initial attempts focused on replacing the input of a deep convolutional neural network with a medical image, which does not take into consideration the fundamental differences between these two types of images. Specifically, fine details are necessary for detection in medical images, unlike in natural images where coarse structures matter most. This difference makes it inadequate to use the existing network architectures developed for natural images, because they work on heavily downscaled images to reduce the memory requirements. This hides details necessary to make accurate predictions. Additionally, a single exam in medical imaging often comes with a set of views which must be fused in order to reach a correct conclusion. In our work, we propose to use a multi-view deep convolutional neural network that handles a set of high-resolution medical images. We evaluate it on large-scale mammography-based breast cancer screening (BI-RADS prediction) using 886,000 images. We focus on investigating the impact of the training set size and image size on the prediction accuracy. Our results highlight that performance increases with the size of training set, and that the best performance can only be achieved using the original resolution. In the reader study, performed on a random subset of the test set, we confirmed the efficacy of our model, which achieved performance comparable to a committee of radiologists when presented with the same data.

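Motivation and Objectives

Investigate how to apply deep learning to full-resolution, multi-view mammography without downscaling the images.
Assess how training set size affects BI-RADS prediction performance.
Assess how input image resolution affects model accuracy.
Visualize the model's decisions and compare its performance with that of radiologists in a reader study.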

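Proposed Method

Build an MV-DCN with four view-specific columns (L-CC, R-CC, L-MLO, R-MLO) whose view representations are concatenated for the final classification.
Process each view with a dedicated stack of convolution and pooling layers, and apply global average pooling before concatenation.
Train end to end with weights shared between the left and right columns of each view type (CC and MLO), using data augmentation, input noise, and dropout; optimize with Adam on large high-resolution inputs (2600x2000).
Downsample aggressively in the early layers to make high-resolution input tractable, and average the final feature maps before concatenation to reduce dimensionality.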
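
The fusion step described above (global average pooling per view, then concatenation) can be sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the names `global_average_pool` and `fuse_views`, the nested-list feature-map layout, and the view ordering are hypothetical, and the convolutional columns that would produce the feature maps are omitted.

```python
from typing import Dict, List

# Hypothetical layout: one feature map is channels x height x width.
FeatureMap = List[List[List[float]]]

# Assumed ordering of the four mammography views.
VIEW_ORDER = ["L-CC", "R-CC", "L-MLO", "R-MLO"]


def global_average_pool(fmap: FeatureMap) -> List[float]:
    """Collapse each channel's H x W grid to a single mean value."""
    pooled = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def fuse_views(view_features: Dict[str, FeatureMap]) -> List[float]:
    """Pool each view's feature map, then concatenate in a fixed order."""
    fused: List[float] = []
    for view in VIEW_ORDER:
        fused.extend(global_average_pool(view_features[view]))
    return fused
```

With C channels per view, the fused vector has 4C entries regardless of the spatial size of the feature maps, so the classifier input stays small even when the raw images are very large.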

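Experimental Results

Research Questions

RQ1: Does keeping high-resolution inputs improve BI-RADS prediction compared with downscaling?
RQ2: How does the amount of training data affect MV-DCN performance on BI-RADS classification?
RQ3: How does input resolution affect prediction accuracy across BI-RADS categories?
RQ4: How does the MV-DCN compare with radiologists, and with an ensemble of radiologists and the MV-DCN?
RQ5: Can confidence (prediction entropy) be used to identify a subset of high-accuracy predictions?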
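
The confidence measure named in RQ5 can be illustrated with a short sketch: compute the Shannon entropy of each predicted class distribution and keep the lowest-entropy cases. The function names and the `keep_fraction` parameter are hypothetical, and the selection rule is an assumption for illustration; the summary does not say how the high-confidence subset was chosen.

```python
import math
from typing import List, Sequence


def prediction_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def high_confidence_indices(
    predictions: Sequence[Sequence[float]], keep_fraction: float
) -> List[int]:
    """Indices of the lowest-entropy predictions (hypothetical selection rule)."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: prediction_entropy(predictions[i]),
    )
    n_keep = max(1, int(len(predictions) * keep_fraction))
    return ranked[:n_keep]
```

A one-hot prediction has entropy 0 and a uniform prediction over K classes has entropy log K, so sorting by entropy orders cases from most to least confident.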

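Key Findings

Performance improves with training set size: macAUC rises as the fraction of training data used grows from 1% to 100%.
Downscaling the input degrades performance; full-resolution input gives the best results (macAUC is higher for full-size input than for scaled input).
Higher-confidence (lower-entropy) predictions are more accurate: HC-macAUC on the high-confidence subset exceeds macAUC on the full set.
In the reader study, macAUC was 0.704 for the radiologists, 0.688 for the MV-DCN, and 0.735 for the ensemble of radiologists and the MV-DCN.
Given the same four-view data, the MV-DCN performs comparably to a committee of radiologists on BI-RADS prediction.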
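
The findings above are reported in macAUC, which this summary does not define. A minimal sketch follows, assuming macAUC means the unweighted mean of one-vs-rest AUCs over the BI-RADS classes; the function names are hypothetical, and the pairwise AUC computation is a simple quadratic-time version chosen for clarity.

```python
from typing import List, Sequence


def binary_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auc(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Unweighted mean of one-vs-rest AUCs (assumed meaning of macAUC)."""
    num_classes = len(probs[0])
    aucs: List[float] = []
    for c in range(num_classes):
        class_scores = [row[c] for row in probs]
        is_class = [y == c for y in labels]
        aucs.append(binary_auc(class_scores, is_class))
    return sum(aucs) / num_classes
```

Because each class contributes equally to the mean, a rare class counts as much as a common one under this assumed definition.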


This review was generated by AI and reviewed by human editors.