QUICK REVIEW

[Paper Review] Transfusion: Understanding Transfer Learning for Medical Imaging

Maithra Raghu, Chiyuan Zhang|arXiv (Cornell University)|Feb 14, 2019

COVID-19 diagnosis using AI | References: 40 | Cited by: 642

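One-sentence summary

Transfer learning from ImageNet yields only limited gains on two large medical imaging tasks; smaller, lightweight models perform comparably, meaningful feature reuse is confined to the lowest layers, and the weight scaling inherited from pretrained weights speeds up convergence.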

ABSTRACT

Transfer learning from natural image datasets, particularly ImageNet, using standard large models and corresponding pretrained weights has become a de-facto method for deep learning applications to medical imaging. However, there are fundamental differences in data sizes, features and task specifications between natural image classification and the target medical tasks, and there is little understanding of the effects of transfer. In this paper, we explore properties of transfer learning for medical imaging. A performance evaluation on two large scale medical imaging tasks shows that surprisingly, transfer offers little benefit to performance, and simple, lightweight models can perform comparably to ImageNet architectures. Investigating the learned representations and features, we find that some of the differences from transfer learning are due to the over-parametrization of standard models rather than sophisticated feature reuse. We isolate where useful feature reuse occurs, and outline the implications for more efficient model exploration. We also explore feature independent benefits of transfer arising from weight scalings.

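Motivation and objectives

Motivate and evaluate the effectiveness of transfer learning from natural images to medical imaging tasks.
Compare standard ImageNet architectures with smaller, lightweight models on two large medical datasets.
Analyze the learned representations to understand feature reuse and identify where transfer helps.
Study how pretrained weights benefit convergence independently of feature reuse.
Propose hybrid transfer strategies that balance performance and computational efficiency.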

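Proposed approach

Evaluate several architectures (ResNet-50, Inception-v3, and a family of lightweight CNNs called CBR) under both random initialization and ImageNet pretraining.
Use two large medical datasets (retinal fundus photographs and CheXpert chest X-rays) and measure AUC-ROC on each task.
Analyze hidden representations with SVCCA to assess representational similarity before and after fine-tuning.
Run weight transfusion experiments that reuse a subset of the pretrained weights and redesign the top of the network.
Examine the feature-independent benefits of transfer through weight scaling (Mean Var initialization) to study convergence speed.
Visualize the filters of the first few layers to see how pretrained features adapt during training.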
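
The Mean Var initialization mentioned above can be sketched as follows. In the paper, each layer is re-sampled i.i.d. from a Gaussian whose mean and variance match that layer's pretrained weights, so the weight scale is inherited but the learned features are not. This is a minimal standard-library sketch under stated assumptions, not the authors' code: the function name `mean_var_init`, the flat-list weight layout, and the toy "pretrained" layers are all hypothetical.

```python
import random
import statistics


def mean_var_init(pretrained, seed=0):
    """Resample each layer i.i.d. from a Gaussian that matches the layer's
    pretrained mean and standard deviation (hypothetical helper)."""
    rng = random.Random(seed)
    fresh = {}
    for name, weights in pretrained.items():
        mu = statistics.fmean(weights)      # per-layer mean of pretrained weights
        sigma = statistics.pstdev(weights)  # per-layer population std deviation
        # Same number of weights as the pretrained layer, but no shared features.
        fresh[name] = [rng.gauss(mu, sigma) for _ in weights]
    return fresh


# Toy stand-in for pretrained layers, flattened to lists of floats (assumption).
_toy = random.Random(1)
pretrained = {
    "conv1": [_toy.gauss(0.0, 0.20) for _ in range(5000)],
    "fc": [_toy.gauss(0.05, 0.01) for _ in range(5000)],
}
fresh = mean_var_init(pretrained)
```

Because only two scalars per layer are carried over, any convergence speed-up observed with such an initialization cannot be attributed to reused features, which is what makes it useful as a control.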

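Experimental results

Research questions

RQ1: Does transfer learning from ImageNet improve performance on medical imaging tasks compared with random initialization?
RQ2: Can lightweight architectures match or exceed ImageNet-style architectures on medical tasks?
RQ3: Where in the network does reuse of transferred features occur in medical imaging models?
RQ4: Do pretrained weights affect convergence speed independently of feature reuse?
RQ5: Can hybrid transfer strategies retain the benefits while allowing more flexible model design?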

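Key findings

Transfer learning provides limited performance gains on both medical tasks and across architectures.
Smaller, simpler CNNs (CBR) perform comparably to standard ImageNet models on the Retina and CheXpert tasks.
ImageNet top-5 accuracy does not predict performance on the medical tasks.
Representation analysis shows that large models change less during training and that feature reuse is largely confined to the lowest layers.
There are feature-independent benefits of transfer that come from weight scaling and speed up convergence (Mean Var initialization).
Hybrid approaches (reusing only the lowest layers, optionally with a slimmed-down redesign of the upper layers, or using synthetic conv1 features) can match the performance of full transfer learning while preserving flexibility.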
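
The simplest hybrid strategy in the list above, reusing only the lowest layers, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the paper's implementation: the function name `transfuse`, the layer names, the flat-list weight layout, and the Gaussian re-initialization with standard deviation 0.01 are all hypothetical choices.

```python
import random


def transfuse(pretrained, layer_order, num_reused, seed=0, init_std=0.01):
    """Copy the first `num_reused` layers from `pretrained` and randomly
    re-initialize every layer above them (hypothetical helper)."""
    rng = random.Random(seed)
    params = {}
    for depth, name in enumerate(layer_order):
        if depth < num_reused:
            # Lowest layers: reuse the pretrained weights (copied, not aliased).
            params[name] = list(pretrained[name])
        else:
            # Higher layers: fresh Gaussian weights; the scheme is an assumption.
            params[name] = [rng.gauss(0.0, init_std) for _ in pretrained[name]]
    return params


# Toy network, ordered from input to output (assumption).
layer_order = ["conv1", "block1", "block2", "head"]
pretrained = {name: [0.5] * 8 for name in layer_order}
params = transfuse(pretrained, layer_order, num_reused=2)
```

Since the layers above the reuse point are initialized from scratch, they no longer need to match the pretrained architecture, which is where the added design flexibility comes from.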


This review was generated by AI and reviewed by a human editor.