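[Paper Explained] SAM.MD: Zero-shot medical image segmentation capabilities of the Segment Anything Model
This paper evaluates SAM's zero-shot medical image segmentation performance on abdominal CT data using point and bounding box prompts, compares it with nnU-Net baselines, and discusses its potential for interactive semi-automatic segmentation.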
Foundation models have taken over natural language processing and image generation domains due to the flexibility of prompting. With the recent introduction of the Segment Anything Model (SAM), this prompt-driven paradigm has entered image segmentation with a hitherto unexplored abundance of capabilities. The purpose of this paper is to conduct an initial evaluation of the out-of-the-box zero-shot capabilities of SAM for medical image segmentation, by evaluating its performance on an abdominal CT organ segmentation task, via point or bounding box based prompting. We show that SAM generalizes well to CT data, making it a potential catalyst for the advancement of semi-automatic segmentation tools for clinicians. We believe that this foundation model, while not reaching state-of-the-art segmentation performance in our investigations, can serve as a highly potent starting point for further adaptations of such models to the intricacies of the medical domain.
Keywords: medical image segmentation, SAM, foundation models, zero-shot learning
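Research Motivation and Objectives
- Evaluate the out-of-the-box zero-shot segmentation capability of the Segment Anything Model (SAM) on medical CT data.
- Study how different visual prompts (points and bounding boxes) affect SAM's segmentation accuracy.
- Compare SAM's zero-shot results with a strong fully automatic baseline (nnU-Net) on a multi-organ abdominal CT dataset.
- Determine how robust bounding box prompting is across CT intensity ranges.
- Provide guidance for using SAM interactively in clinical segmentation workflows.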
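Proposed Method
- Use axial 2D slices from the AMOS22 abdominal CT organ segmentation dataset as evaluation data.
- Create evaluation prompts from the ground-truth masks: (i) random point prompts (1, 3, or 10 points per segmentation mask) and (ii) bounding boxes around the mask with jitter (0.01 to 0.5).
- Measure segmentation accuracy with the Dice similarity coefficient (DSC) against the ground-truth masks.
- Compare SAM's prompted results with nnU-Net 2D and 3D baselines on the same slices.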
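The prompt construction and scoring steps listed above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the jitter definition (each box edge shifted by a uniform random offset of up to `jitter` times the box side length) is an assumption, since the summary gives only the jitter values.

```python
import numpy as np

def sample_point_prompts(mask, n_points, rng):
    """Pick up to n_points random foreground pixels as (x, y) positive prompts."""
    ys, xs = np.nonzero(mask)
    idx = rng.choice(len(ys), size=min(n_points, len(ys)), replace=False)
    return np.stack([xs[idx], ys[idx]], axis=1)

def jittered_box(mask, jitter, rng):
    """Tight (x0, y0, x1, y1) box around the mask with randomly shifted edges.

    ASSUMPTION: jitter is relative to the box side length; the paper's exact
    definition may differ.
    """
    ys, xs = np.nonzero(mask)
    x0, x1, y0, y1 = xs.min(), xs.max(), ys.min(), ys.max()
    w, h = x1 - x0 + 1, y1 - y0 + 1
    dx0, dx1 = rng.uniform(-jitter, jitter, size=2) * w
    dy0, dy1 = rng.uniform(-jitter, jitter, size=2) * h
    box = np.array([x0 + dx0, y0 + dy0, x1 + dx1, y1 + dy1], dtype=float)
    height, width = mask.shape
    # Keep the box inside the image and make sure min <= max on each axis.
    box[[0, 2]] = np.sort(np.clip(box[[0, 2]], 0, width - 1))
    box[[1, 3]] = np.sort(np.clip(box[[1, 3]], 0, height - 1))
    return box

def dice(pred, target):
    """Dice similarity coefficient between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, target).sum() / denom
```

With a ground-truth mask `gt` and a seeded generator such as `np.random.default_rng(0)`, `sample_point_prompts(gt, 3, rng)` gives three positive points and `jittered_box(gt, 0.05, rng)` gives a perturbed box; `dice(pred, gt)` then scores whatever mask the model returns.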
Experimental Results
Research Questions
- RQ1: Can SAM perform zero-shot segmentation of unseen abdominal organs in CT images using simple prompts?
- RQ2: How do point-based prompts compare to bounding box prompts in achieving accurate segmentations?
- RQ3: Is bounding box prompting robust to intensity variations in CT data?
- RQ4: How does SAM zero-shot performance compare to nnU-Net baselines?
- RQ5: Can SAM accelerate interactive semi-automatic segmentation workflows in practice?
Key Findings
| Method | Spleen | R.Kid. | L.Kid. | GallBl. | Esoph. | Liver | Stomach | Aorta | Postcava | Pancreas | R.AG. | L.AG. | Duodenum | Bladder | AVG | AVG* |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 point | 0.632 | 0.759 | 0.770 | 0.616 | 0.382 | 0.577 | 0.508 | 0.720 | 0.453 | 0.317 | 0.085 | 0.196 | 0.339 | 0.542 | 0.493 | 0.347 |
| 3 points | 0.733 | 0.784 | 0.786 | 0.683 | 0.448 | 0.658 | 0.577 | 0.758 | 0.493 | 0.343 | 0.129 | 0.240 | 0.325 | 0.631 | 0.542 | 0.397 |
| 10 points | 0.857 | 0.855 | 0.857 | 0.800 | 0.643 | 0.811 | 0.759 | 0.842 | 0.637 | 0.538 | 0.405 | 0.516 | 0.480 | 0.789 | 0.699 | 0.560 |
| Box, 0.01 | 0.926 | 0.884 | 0.889 | 0.883 | 0.820 | 0.902 | 0.823 | 0.924 | 0.867 | 0.727 | 0.618 | 0.754 | 0.811 | 0.909 | 0.838 | 0.826 |
| Box, 0.05 | 0.920 | 0.883 | 0.894 | 0.879 | 0.814 | 0.883 | 0.818 | 0.923 | 0.862 | 0.727 | 0.609 | 0.746 | 0.805 | 0.907 | 0.834 | 0.819 |
| Box, 0.1 | 0.890 | 0.870 | 0.874 | 0.859 | 0.806 | 0.813 | 0.796 | 0.919 | 0.845 | 0.702 | 0.594 | 0.733 | 0.785 | 0.862 | 0.810 | 0.795 |
| Box, 0.25 | 0.553 | 0.601 | 0.618 | 0.667 | 0.656 | 0.490 | 0.561 | 0.747 | 0.687 | 0.481 | 0.478 | 0.558 | 0.655 | 0.561 | 0.594 | 0.612 |
| Box, 0.5 | 0.202 | 0.275 | 0.257 | 0.347 | 0.356 | 0.164 | 0.252 | 0.381 | 0.335 | 0.239 | 0.234 | 0.308 | 0.343 | 0.205 | 0.278 | 0.289 |
- Box prompting yields high DSCs at jitter up to 0.1 and is competitive with baselines; accuracy drops sharply at jitter 0.25 and 0.5.
- A single bounding box prompt outperforms multiple point prompts, even 10 points per mask.
- Box-prompted performance remains robust on raw CT value ranges (AVG* stays close to AVG for box prompts).
- SAM demonstrates strong zero-shot segmentation potential when paired with expert prompts in interactive workflows.
- While not reaching state-of-the-art fully automatic methods, SAM can speed up semi-automatic clinician workflows on most structures.
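To make the interactive box-prompt workflow concrete, the sketch below shows how a single CT slice might be passed to SAM with one box prompt. It is an illustration under stated assumptions: `window_ct_to_rgb` and `segment_with_box` are hypothetical helpers, the intensity window (`hu_min`, `hu_max`) is a caller-supplied choice rather than a value taken from the paper, and `predictor` is expected to behave like `SamPredictor` from the `segment_anything` package (`set_image` with an HxWx3 uint8 image, `predict` with an XYXY `box`).

```python
import numpy as np

def window_ct_to_rgb(ct_slice, hu_min, hu_max):
    """Clip a CT slice to [hu_min, hu_max], rescale to 0..255, repeat to 3 channels.

    ASSUMPTION: the window is chosen by the caller; the paper's preprocessing
    may differ.
    """
    clipped = np.clip(ct_slice.astype(np.float32), hu_min, hu_max)
    scaled = (clipped - hu_min) / (hu_max - hu_min) * 255.0
    gray = scaled.round().astype(np.uint8)
    return np.stack([gray] * 3, axis=-1)

def segment_with_box(predictor, ct_slice, box, hu_min, hu_max):
    """Run one box-prompted prediction and return (mask, predicted quality score).

    `predictor` is assumed to expose the SamPredictor interface:
    set_image(image) and predict(box=..., multimask_output=...).
    """
    predictor.set_image(window_ct_to_rgb(ct_slice, hu_min, hu_max))
    masks, scores, _ = predictor.predict(
        box=np.asarray(box, dtype=np.float32),  # (x0, y0, x1, y1) in pixels
        multimask_output=False,                 # one mask per prompt
    )
    return masks[0], float(scores[0])
```

In a real setup the predictor would be built from a downloaded SAM checkpoint via `sam_model_registry` and `SamPredictor`; in a semi-automatic tool, the clinician's drawn box would take the place of the `box` argument.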
This explanation was generated by AI and reviewed by a human editor.