QUICK REVIEW

[Paper Review] Cheap Lunch for Medical Image Segmentation by Fine-tuning SAM on Few Exemplars

Weijia Feng, Lingting Zhu | arXiv (Cornell University) | Aug 27, 2023

Advanced Neural Network Applications | Citations: 8

ONE-SENTENCE SUMMARY

The paper fine-tunes the Segment Anything Model (SAM) on a few exemplar medical images, combining exemplar-guided data synthesis with Low-Rank Adaptation (LoRA), and achieves competitive segmentation with limited annotations on the BraTS and Synapse datasets.

ABSTRACT

The Segment Anything Model (SAM) has demonstrated remarkable capabilities of scaled-up segmentation models, enabling zero-shot generalization across a variety of domains. By leveraging large-scale foundational models as pre-trained models, it is a natural progression to fine-tune SAM for specific domains to further enhance performances. However, the adoption of foundational models in the medical domain presents a challenge due to the difficulty and expense of labeling sufficient data for adaptation within hospital systems. In this paper, we introduce an efficient and practical approach for fine-tuning SAM using a limited number of exemplars, making it suitable for such scenarios. Our approach combines two established techniques from the literature: an exemplar-guided synthesis module and the widely recognized Low-Rank Adaptation (LoRA) fine-tuning strategy, serving as data-level and model-level attempts respectively. Interestingly, our empirical findings suggest that SAM can be effectively aligned within the medical domain even with few labeled data. We validate our approach through experiments on brain tumor segmentation (BraTS) and multi-organ CT segmentation (Synapse). The comprehensive results underscore the feasibility and effectiveness of such an approach, paving the way for the practical application of SAM in the medical domain.

MOTIVATION AND OBJECTIVES

Reduce the labeling burden of medical image segmentation by adapting SAM with only a few exemplars.
Propose an exemplar-guided data synthesis module to generate synthetic training data.
Apply LoRA-based fine-tuning to keep trainable parameters small.
Evaluate the approach on BraTS 2018 brain tumor segmentation and Synapse multi-organ CT segmentation.
Demonstrate feasibility of cost-effective SAM adaptation in medical domains.

PROPOSED METHOD

Create synthetic training data by exemplar-guided synthesis with geometric and intensity transformations and background pasting.
Fine-tune SAM using Low-Rank Adaptation (LoRA) on both the image encoder and the mask decoder, updating 6.32M parameters (rank r=4).
Use point-based prompts during training to incorporate class prompts for all target organs/classes.
Optimize with a combined loss L = L_CE + 0.8 * L_Dice (cross-entropy plus weighted Dice) using AdamW with learning rate warm-up and decay.
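
The combined objective in the last step can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names, the soft Dice formulation averaged over classes, and the smoothing constants are assumptions; only the 0.8 Dice weight comes from the summary above.

```python
import math


def cross_entropy(probs, labels, eps=1e-7):
    """Mean cross-entropy over pixels.

    probs:  list of per-pixel class-probability lists (each sums to 1).
    labels: list of integer class indices, one per pixel.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to eps so a zero probability does not produce log(0).
        total += -math.log(max(p[y], eps))
    return total / len(labels)


def dice_loss(probs, labels, num_classes, eps=1e-5):
    """Soft Dice loss averaged over classes (one common formulation)."""
    loss = 0.0
    for c in range(num_classes):
        # Overlap between predicted probability mass and ground truth for class c.
        inter = sum(p[c] for p, y in zip(probs, labels) if y == c)
        pred_sum = sum(p[c] for p in probs)
        true_sum = sum(1 for y in labels if y == c)
        dice = (2.0 * inter + eps) / (pred_sum + true_sum + eps)
        loss += 1.0 - dice
    return loss / num_classes


def combined_loss(probs, labels, num_classes, dice_weight=0.8):
    """L = L_CE + dice_weight * L_Dice, with dice_weight = 0.8 as in the summary."""
    ce = cross_entropy(probs, labels)
    dl = dice_loss(probs, labels, num_classes)
    return ce + dice_weight * dl
```

Calling `combined_loss([[1.0, 0.0], [0.0, 1.0]], [0, 1], num_classes=2)` on a perfect two-pixel prediction gives a loss of essentially zero, while uniform probabilities are penalized by both terms. A real training loop would compute the same quantity on tensors in a deep-learning framework so that gradients can flow to the LoRA parameters.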

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: Can SAM be effectively aligned to medical segmentation tasks with very few labeled exemplars?
RQ2: Does exemplar-based data synthesis plus LoRA-based fine-tuning improve performance over zero-shot SAM and other baselines when labels are scarce?
RQ3: What is the trade-off between annotation effort and segmentation accuracy on the BraTS and Synapse datasets?
RQ4: How does the approach perform with different exemplar counts (0.5%, 1%, 3%) and with or without data synthesis?

KEY FINDINGS

Dataset	Method	Exemplars	DSC ↑	HD ↓
BraTS 2018	SAM (Zero-Shot)	-	45.29	54.74
BraTS 2018	SAMed (w/ Data Synthesis)	75 (0.5%)	82.80	28.03
BraTS 2018	SAMed (w/ Data Synthesis)	150 (1%)	82.50	43.99
BraTS 2018	SAMed (w/ Data Synthesis)	450 (3%)	85.53	17.56
BraTS 2018	Ours	75 (0.5%)	82.78	14.92
BraTS 2018	Ours	150 (1%)	83.40	10.03
BraTS 2018	Ours	450 (3%)	83.07	16.94
BraTS 2018	Full Set (Pseudo Upper Bound)	All	85.28	7.91
Synapse	SAM (Zero-Shot)	-	74.54	40.90
Synapse	SAMed	9 (one per two volumes)	43.82	96.21
Synapse	SAMed	18 (one per volume)	55.26	75.02
Synapse	SAMed	36 (two per volume)	66.96	44.69
Synapse	Ours	1 (one exemplar)	75.91	21.75
Synapse	Ours	9 (one per two volumes)	79.08	21.62
Synapse	Ours	18 (one per volume)	83.04	16.84
Synapse	Ours	36 (two per volume)	84.23	11.86
Synapse	Full Set	All	85.95	8.97
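
For readers unfamiliar with the two metrics in the table, the following is a rough sketch of how a Dice score and a 95th-percentile Hausdorff distance can be computed for 2D binary masks. It assumes that the HD column corresponds to the HD95 metric named in the findings, uses all foreground pixels rather than extracted surfaces, and uses a nearest-rank percentile; these are simplifications for illustration, and the function names are hypothetical rather than taken from the paper's evaluation code.

```python
import math


def dice_score(pred, target):
    """Dice similarity coefficient between two masks.

    pred, target: sets of (row, col) foreground pixel coordinates.
    """
    if not pred and not target:
        return 1.0  # Two empty masks are treated as a perfect match.
    return 2.0 * len(pred & target) / (len(pred) + len(target))


def _directed_distances(src, dst):
    """Distance from every point in src to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def _percentile(values, pct):
    """Nearest-rank percentile (an assumption; libraries may interpolate)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def hd95(pred, target):
    """95th-percentile Hausdorff distance, symmetric via the max of both directions."""
    if not pred or not target:
        return float("inf")  # Undefined when one mask is empty.
    forward = _percentile(_directed_distances(pred, target), 95)
    backward = _percentile(_directed_distances(target, pred), 95)
    return max(forward, backward)
```

With `pred = {(0, 0), (0, 1)}` and `target = {(0, 1), (0, 2)}`, the masks share one of two pixels each, so `dice_score` is 0.5 and `hd95` is 1.0 pixel. The brute-force nearest-neighbor search is quadratic in the number of pixels, so practical evaluation code typically relies on distance transforms instead.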

On BraTS 2018, the proposed method with few exemplars approaches full-set performance and outperforms zero-shot SAM and SAMed in HD95.
On Synapse, training on synthesized data from a few exemplars yields strong Dice and HD95 scores across multiple organs, surpassing SAMed trained without synthesis and approaching full-set performance.
Data synthesis consistently improves results over using exemplars alone across both datasets.
Using 1 exemplar with our method already outperforms SAM (ViT-H) zero-shot in some configurations; more exemplars improve performance further.
Training is feasible on modest hardware (RTX 3090 GPUs) with 6.32M trainable parameters (LoRA rank 4).
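
To illustrate why LoRA keeps the trainable parameter count small, here is a minimal sketch of a LoRA-augmented linear layer in plain Python. It is an assumption-laden illustration rather than the paper's code: the class name, the scaling factor `alpha`, and the initialization are hypothetical choices, and only the rank of 4 is taken from the summary above. The frozen weight W stays fixed, and only the low-rank factors A and B would be trained.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Computes y = W x + (alpha / r) * B (A x) with W frozen."""

    def __init__(self, weight, rank=4, alpha=4.0, seed=0):
        rng = random.Random(seed)
        self.weight = weight  # Frozen pretrained weight, shape d_out x d_in.
        d_out, d_in = len(weight), len(weight[0])
        self.rank = rank
        self.scale = alpha / rank
        # A starts with small random values and B with zeros, so the layer
        # initially reproduces the pretrained output exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def num_trainable(self):
        """Only A (r x d_in) and B (d_out x r) are trainable."""
        d_out, d_in = len(self.weight), len(self.weight[0])
        return self.rank * (d_in + d_out)
```

For a 16 x 16 weight matrix, the full layer has 256 parameters while the rank-4 adapter adds 128 trainable ones; the saving grows quickly with layer width, because the adapter scales with r * (d_in + d_out) rather than d_in * d_out.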

This review was generated by AI and checked by a human editor.