QUICK REVIEW

[Paper Review] SAM.MD: Zero-shot medical image segmentation capabilities of the Segment Anything Model

Saikat Roy, Tassilo Wald|arXiv (Cornell University)|Apr 10, 2023

Artificial Intelligence in Healthcare and Education|Cited by 64

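One-sentence summary

This paper evaluates SAM's zero-shot medical image segmentation performance on abdominal CT data using point prompts and bounding box prompts, compares it against nnU-Net baselines, and discusses its potential for interactive semi-automatic segmentation.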

ABSTRACT

Foundation models have taken over natural language processing and image generation domains due to the flexibility of prompting. With the recent introduction of the Segment Anything Model (SAM), this prompt-driven paradigm has entered image segmentation with a hitherto unexplored abundance of capabilities. The purpose of this paper is to conduct an initial evaluation of the out-of-the-box zero-shot capabilities of SAM for medical image segmentation, by evaluating its performance on an abdominal CT organ segmentation task, via point or bounding box based prompting. We show that SAM generalizes well to CT data, making it a potential catalyst for the advancement of semi-automatic segmentation tools for clinicians. We believe that this foundation model, while not reaching state-of-the-art segmentation performance in our investigations, can serve as a highly potent starting point for further adaptations of such models to the intricacies of the medical domain.

Keywords: medical image segmentation, SAM, foundation models, zero-shot learning

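Motivation and objectives

Evaluate the out-of-the-box zero-shot segmentation capabilities of the Segment Anything Model (SAM) on medical CT data.
Investigate how different visual prompts (points and bounding boxes) affect SAM's segmentation accuracy.
Compare SAM's zero-shot results against a strong automatic baseline (nnU-Net) on the same slices.
Assess the robustness of bounding box prompting across CT intensity ranges.
Provide guidance on interactive applications of SAM in clinical segmentation workflows.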

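Proposed method

Axial 2D slices from the AMOS22 abdominal CT organ segmentation dataset serve as the evaluation data.
Evaluation prompts are created as (i) random point prompts (1, 3, or 10 points per segmentation mask) and (ii) jittered bounding boxes around each mask (jitter of 0.01 to 0.5).
Segmentation accuracy is computed against the ground-truth masks using the Dice Similarity Coefficient (DSC).
SAM's prompted results are compared with 2D and 3D nnU-Net baselines on the same slices.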
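
The prompt construction and DSC scoring described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' code: the helper names (`dice`, `sample_points`, `jittered_box`) are hypothetical, masks are represented as nested lists of 0/1 rather than arrays, and "jitter" is assumed to mean a random shift of each box edge by up to that fraction of the box side length, which the review does not spell out.

```python
import random

def dice(pred, truth):
    """Dice Similarity Coefficient between two binary masks (rows of 0/1)."""
    overlap = sum(p * t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total

def sample_points(mask, n, rng):
    """Pick up to n random foreground pixels as (x, y) point prompts."""
    foreground = [(x, y) for y, row in enumerate(mask)
                  for x, v in enumerate(row) if v]
    return rng.sample(foreground, min(n, len(foreground)))

def jittered_box(mask, jitter, rng):
    """Tight (x0, y0, x1, y1) box around a non-empty mask, with each edge
    shifted by up to jitter * side length (assumed jitter definition)."""
    height, width = len(mask), len(mask[0])
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    box_w, box_h = x1 - x0 + 1, y1 - y0 + 1

    def shift(value, side, upper):
        moved = value + rng.uniform(-jitter, jitter) * side
        return min(max(moved, 0), upper)  # clamp to the image

    # Sorting guards against edges crossing at large jitter values.
    xa, xb = sorted((shift(x0, box_w, width - 1), shift(x1, box_w, width - 1)))
    ya, yb = sorted((shift(y0, box_h, height - 1), shift(y1, box_h, height - 1)))
    return (xa, ya, xb, yb)

# Toy 8x8 slice with a 4x4 "organ", and a copy shifted one pixel to the right.
mask = [[1 if 2 <= x <= 5 and 3 <= y <= 6 else 0 for x in range(8)]
        for y in range(8)]
shifted = [[1 if 3 <= x <= 6 and 3 <= y <= 6 else 0 for x in range(8)]
           for y in range(8)]

rng = random.Random(0)
points = sample_points(mask, 3, rng)   # three random foreground points
box = jittered_box(mask, 0.1, rng)     # box with 10% edge jitter
print(dice(mask, shifted))             # 0.75 (12 of 16 pixels overlap)
```

With the official `segment_anything` package, prompts of this kind would typically be passed to `SamPredictor.predict` through its `point_coords`, `point_labels`, and `box` arguments; whether the authors used exactly that interface is not stated in this review.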

Experimental results

Research questions

RQ1: Can SAM perform zero-shot segmentation of unseen abdominal organs in CT images using simple prompts?
RQ2: How do point-based prompts compare to bounding box prompts in achieving accurate segmentations?
RQ3: Is bounding box prompting robust to intensity variations in CT data?
RQ4: How does SAM zero-shot performance compare to nnU-Net baselines?
RQ5: Can SAM accelerate interactive semi-automatic segmentation workflows in practice?

Key findings

Method	Spl.	R.Kid.	L.Kid.	GallBl.	Esoph.	Liver	Stom.	Aorta	Postc.	Pancr.	R.AG.	L.AG.	Duod.	Blad.	AVG	AVG*
1 Point	0.632	0.759	0.770	0.616	0.382	0.577	0.508	0.720	0.453	0.317	0.085	0.196	0.339	0.542	0.493	0.347
3 Points	0.733	0.784	0.786	0.683	0.448	0.658	0.577	0.758	0.493	0.343	0.129	0.240	0.325	0.631	0.542	0.397
10 Points	0.857	0.855	0.857	0.800	0.643	0.811	0.759	0.842	0.637	0.538	0.405	0.516	0.480	0.789	0.699	0.560
Boxes, 0.01	0.926	0.884	0.889	0.883	0.820	0.902	0.823	0.924	0.867	0.727	0.618	0.754	0.811	0.909	0.838	0.826
Boxes, 0.05	0.920	0.883	0.894	0.879	0.814	0.883	0.818	0.923	0.862	0.727	0.609	0.746	0.805	0.907	0.834	0.819
Boxes, 0.1	0.890	0.870	0.874	0.859	0.806	0.813	0.796	0.919	0.845	0.702	0.594	0.733	0.785	0.862	0.810	0.795
Boxes, 0.25	0.553	0.601	0.618	0.667	0.656	0.490	0.561	0.747	0.687	0.481	0.478	0.558	0.655	0.561	0.594	0.612
Boxes, 0.5	0.202	0.275	0.257	0.347	0.356	0.164	0.252	0.381	0.335	0.239	0.234	0.308	0.343	0.205	0.278	0.289
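
A quick arithmetic check helps when reading the rows above: in every row, the second-to-last value equals the mean of the fourteen values before it (to three decimals), which is what one would expect if those fourteen values are the per-organ DSCs and the last two values are AVG and AVG*. The sketch below reproduces that check; the `rows` dictionary simply copies the table values, and `organ_mean` is a hypothetical helper name.

```python
# Each row: 14 per-organ DSCs followed by the two trailing summary values.
rows = {
    "1 Point":     [0.632, 0.759, 0.770, 0.616, 0.382, 0.577, 0.508, 0.720,
                    0.453, 0.317, 0.085, 0.196, 0.339, 0.542, 0.493, 0.347],
    "3 Points":    [0.733, 0.784, 0.786, 0.683, 0.448, 0.658, 0.577, 0.758,
                    0.493, 0.343, 0.129, 0.240, 0.325, 0.631, 0.542, 0.397],
    "10 Points":   [0.857, 0.855, 0.857, 0.800, 0.643, 0.811, 0.759, 0.842,
                    0.637, 0.538, 0.405, 0.516, 0.480, 0.789, 0.699, 0.560],
    "Boxes, 0.01": [0.926, 0.884, 0.889, 0.883, 0.820, 0.902, 0.823, 0.924,
                    0.867, 0.727, 0.618, 0.754, 0.811, 0.909, 0.838, 0.826],
    "Boxes, 0.05": [0.920, 0.883, 0.894, 0.879, 0.814, 0.883, 0.818, 0.923,
                    0.862, 0.727, 0.609, 0.746, 0.805, 0.907, 0.834, 0.819],
    "Boxes, 0.1":  [0.890, 0.870, 0.874, 0.859, 0.806, 0.813, 0.796, 0.919,
                    0.845, 0.702, 0.594, 0.733, 0.785, 0.862, 0.810, 0.795],
    "Boxes, 0.25": [0.553, 0.601, 0.618, 0.667, 0.656, 0.490, 0.561, 0.747,
                    0.687, 0.481, 0.478, 0.558, 0.655, 0.561, 0.594, 0.612],
    "Boxes, 0.5":  [0.202, 0.275, 0.257, 0.347, 0.356, 0.164, 0.252, 0.381,
                    0.335, 0.239, 0.234, 0.308, 0.343, 0.205, 0.278, 0.289],
}

def organ_mean(values):
    """Mean of the first 14 values in a row (the per-organ DSCs)."""
    organs = values[:14]
    return sum(organs) / len(organs)

for method, values in rows.items():
    reported = values[14]  # second-to-last value in the row
    print(f"{method:12s} mean of 14 organs = {organ_mean(values):.3f}, "
          f"second-to-last value = {reported:.3f}")
```

Read this way, a single tight box (jitter 0.01) averages about 0.84 DSC across the fourteen organs, against about 0.70 for 10 points and about 0.49 for a single point.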

Box prompting yields high DSCs at low to moderate jitter (0.01 to 0.1) and is competitive with baselines; accuracy drops sharply at jitter 0.25 and 0.5.
A single bounding box prompt outperforms multiple point prompts, even at 10 points per mask.
Box prompting remains robust on raw CT value ranges (AVG* stays close to AVG).
SAM demonstrates strong zero-shot segmentation potential when paired with expert prompts in interactive workflows.
While not reaching state-of-the-art fully automatic methods, SAM can speed up semi-automatic clinician workflows on most structures.

This review was generated by AI and checked by a human editor.