QUICK REVIEW

[Paper Review] STU-Net: Scalable and Transferable Medical Image Segmentation Models Empowered by Large-Scale Supervised Pre-training

Ziyan Huang, Haoyu Wang|arXiv (Cornell University)|Apr 13, 2023
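COVID-19 diagnosis using AI | Cited by 49
One-Sentence Summary

STU-Net introduces a family of scalable U-Net variants with up to 1.4B parameters, pre-trains them on TotalSegmentator, and shows strong transferability on 14 downstream datasets under direct inference and on 3 datasets under fine-tuning.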

ABSTRACT

Large-scale models pre-trained on large-scale datasets have profoundly advanced the development of deep learning. However, the state-of-the-art models for medical image segmentation are still small-scale, with their parameters only in the tens of millions. Further scaling them up to higher orders of magnitude is rarely explored. An overarching goal of exploring large-scale models is to train them on large-scale medical segmentation datasets for better transfer capacities. In this work, we design a series of Scalable and Transferable U-Net (STU-Net) models, with parameter sizes ranging from 14 million to 1.4 billion. Notably, the 1.4B STU-Net is the largest medical image segmentation model to date. Our STU-Net is based on nnU-Net framework due to its popularity and impressive performance. We first refine the default convolutional blocks in nnU-Net to make them scalable. Then, we empirically evaluate different scaling combinations of network depth and width, discovering that it is optimal to scale model depth and width together. We train our scalable STU-Net models on a large-scale TotalSegmentator dataset and find that increasing model size brings a stronger performance gain. This observation reveals that a large model is promising in medical image segmentation. Furthermore, we evaluate the transferability of our model on 14 downstream datasets for direct inference and 3 datasets for further fine-tuning, covering various modalities and segmentation targets. We observe good performance of our pre-trained model in both direct inference and fine-tuning. The code and pre-trained models are available at https://github.com/Ziyan-Huang/STU-Net.

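Motivation and Objectives

  • Advance scalable, transferable medical image segmentation models that can handle multiple modalities and multiple segmentation targets.
  • Develop the STU-Net variants by refining nnU-Net for better scalability and transferability.
  • Pre-train on a large-scale medical segmentation dataset to improve transfer to downstream tasks.
  • Evaluate transferability under both direct inference and fine-tuning across different datasets and modalities.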

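Proposed Method

  • Refine the nnU-Net blocks with residual connections so that deeper architectures can be trained.
  • Replace transposed-convolution upsampling with weight-free interpolation followed by a 1x1x1 convolution to improve transferability.
  • Fix architectural hyperparameters (e.g., number of stages, isotropic kernels) so the model remains transferable across tasks.
  • Scale depth and width jointly in a compound fashion, yielding STU-Net-S, STU-Net-B, STU-Net-L, and STU-Net-H with progressively larger parameter counts.
  • Pre-train on the TotalSegmentator CT dataset (104 anatomical structures, 1204 volumes) for 4000 epochs, with mirroring augmentation.
  • Fine-tune on, or run direct inference on, downstream datasets, with channel adaptation where needed.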
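The parameter arithmetic behind two of these design choices can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the STU-Net implementation: the helper names are hypothetical, the channel counts (64, 32, 96) are arbitrary, the transposed convolution is assumed to use a kernel equal to its stride, and each block is simplified to two 3x3x3 convolutions at constant width. It shows (a) that the weight count of a transposed-convolution upsampler depends on the stride, whereas interpolation plus a 1x1x1 convolution does not, and (b) that scaling depth by d and width by w multiplies convolutional parameters by roughly d * w^2.

```python
# Hypothetical helpers for illustration only; not taken from the STU-Net code base.

def conv3d_params(c_in, c_out, kernel, bias=True):
    """Learnable parameters of a 3D convolution with a cubic kernel."""
    return c_in * c_out * kernel ** 3 + (c_out if bias else 0)


def transposed_upsample_params(c_in, c_out, stride):
    """Transposed 3D conv whose kernel equals its stride (assumed setup)."""
    kz, ky, kx = stride
    return c_in * c_out * kz * ky * kx + c_out


def interp_upsample_params(c_in, c_out):
    """Weight-free interpolation followed by a 1x1x1 convolution."""
    return conv3d_params(c_in, c_out, kernel=1)


def stage_params(width, blocks_per_stage):
    """One stage, simplified to blocks of two 3x3x3 convs at constant width."""
    return blocks_per_stage * 2 * conv3d_params(width, width, kernel=3)


# (a) Upsampling: the transposed-conv weight count changes with the stride,
# so weights trained with one stride do not fit a layer built for another.
iso = transposed_upsample_params(64, 32, (2, 2, 2))
aniso = transposed_upsample_params(64, 32, (1, 2, 2))
interp = interp_upsample_params(64, 32)  # independent of the stride
print(iso, aniso, interp)

# (b) Compound scaling: 3x depth and 3x width give roughly 3 * 3**2 = 27x parameters.
base = stage_params(width=32, blocks_per_stage=1)
scaled = stage_params(width=96, blocks_per_stage=3)
ratio = scaled / base
print(round(ratio, 2))
```

Because the 1x1x1 convolution has the same weight shape whatever the upsampling factor, a pre-trained decoder can be reused on a task with different voxel spacing without reshaping any layer, which is the transferability argument made in the list above.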

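Experimental Results

Research Questions

  • RQ1: Can STU-Net achieve scalable performance gains on large-scale medical segmentation data by jointly scaling depth and width?
  • RQ2: Does replacing task-specific upsampling with weight-free interpolation improve transferability across modalities and tasks?
  • RQ3: How does large-scale supervised pre-training on TotalSegmentator affect transfer performance on diverse downstream datasets?
  • RQ4: How does transfer performance differ between direct inference and fine-tuning across multiple CT/MR/PET datasets?

Key Findings

  • STU-Net-H (3x depth, 3x width) reaches 1.4B parameters and achieves the highest mean Dice similarity coefficient (DSC) over the TotalSegmentator classes.
  • STU-Net-B surpasses nnU-Net and SwinUNETR-B in mean DSC on TotalSegmentator, and the gain grows when scaling up to STU-Net-L and STU-Net-H.
  • The pre-trained STU-Net models transfer well to 14 downstream CT datasets under direct inference, with larger models generally reaching a higher mean DSC.
  • Fine-tuning STU-Net-H-ft on three downstream datasets (including AutoPET) yields the best mean DSC, surpassing the baselines.
  • The architectural refinements (residual blocks, weight-free upsampling) together with compound scaling consistently outperform nnU-Net variants at comparable computational cost.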
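Since every finding above is reported as a mean DSC, a minimal sketch of that metric may help. For one label, DSC = 2|P ∩ T| / (|P| + |T|), where P and T are the predicted and reference voxel sets. The function names are hypothetical, the inputs are flattened lists of integer labels, and skipping labels that are absent from both prediction and reference is an assumed convention rather than the paper's documented evaluation protocol.

```python
# Hypothetical helpers for illustration; not the paper's evaluation code.

def dice_score(pred, target, label):
    """DSC for one label over two equally long sequences of voxel labels."""
    pred_mask = [p == label for p in pred]
    target_mask = [t == label for t in target]
    intersection = sum(p and t for p, t in zip(pred_mask, target_mask))
    denom = sum(pred_mask) + sum(target_mask)
    if denom == 0:
        return None  # label absent from both volumes (assumed: skip it)
    return 2.0 * intersection / denom


def mean_dice(pred, target, labels):
    """Average the per-label DSC over the labels that could be scored."""
    scores = [dice_score(pred, target, label) for label in labels]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else float("nan")


# Toy volume flattened to six voxels; 0 is background, 1 and 2 are structures.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 0]
print(dice_score(pred, target, 1))      # 2 * 1 / (2 + 1)
print(dice_score(pred, target, 2))      # 2 * 2 / (2 + 3)
print(mean_dice(pred, target, [1, 2]))
```

Averaging per label rather than pooling all voxels gives small structures the same weight as large ones, which matters for a 104-class dataset such as TotalSegmentator.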

This review was generated by AI and reviewed by a human editor.