QUICK REVIEW

[Paper Review] SAMAug: Point Prompt Augmentation for Segment Anything Model

Haixing Dai, Chong Ma | arXiv (Cornell University) | Jul 3, 2023

Visual Attention and Saliency Detection | Cited by 25

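ONE-LINE SUMMARY

SAMAug automatically augments point prompts by sampling additional prompts from the initial SAM output, improving segmentation on multiple datasets without any additional manual input.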

ABSTRACT

This paper introduces SAMAug, a novel visual point augmentation method for the Segment Anything Model (SAM) that enhances interactive image segmentation performance. SAMAug generates augmented point prompts to provide more information about the user's intention to SAM. Starting with an initial point prompt, SAM produces an initial mask, which is then fed into our proposed SAMAug to generate augmented point prompts. By incorporating these extra points, SAM can generate augmented segmentation masks based on both the augmented point prompts and the initial prompt, resulting in improved segmentation performance. We conducted evaluations using four different point augmentation strategies: random sampling, sampling based on maximum difference entropy, maximum distance, and saliency. Experiment results on the COCO, Fundus, COVID QUEx, and ISIC2018 datasets show that SAMAug can boost SAM's segmentation results, especially using the maximum distance and saliency. SAMAug demonstrates the potential of visual prompt augmentation for computer vision. Codes of SAMAug are available at github.com/yhydhx/SAMAug

MOTIVATION AND OBJECTIVES

Address the ambiguity of a single point prompt, which limits SAM segmentation quality, by supplying augmented prompts.
Propose a visual point augmentation framework that generates additional prompts from initial SAM results.
Evaluate four augmentation strategies and box-prompt variants across diverse datasets (general and medical).
Demonstrate that prompt augmentation can improve SAM performance without retraining or data changes.

PROPOSED METHOD

Define SAMAug as a pipeline that takes an initial SAM segmentation and samples augmented point prompts from that result.
Implement four sampling strategies: random, maximum difference entropy, maximum distance, and saliency-based.
Optionally explore inner/outer box prompts derived from GT or initial results to assess box-based augmentation.
Evaluate augmentation effects on SAM across COCO, Fundus, COVID QU-Ex, and ISIC2018 datasets.
Provide implementation details and report Dice score gains.
Discuss invariance in prompt selection and potential for active-learning integration.
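
The point-sampling step described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the mask is assumed to be a boolean NumPy array, and "maximum distance" is assumed to mean the foreground pixel farthest (Euclidean) from the initial point. The entropy and saliency strategies are omitted because they depend on details not given in this review.

```python
import numpy as np

def sample_random_point(mask, rng):
    """Pick one foreground pixel of the initial mask uniformly at random."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        raise ValueError("initial mask is empty")
    i = rng.integers(len(xs))
    return int(xs[i]), int(ys[i])

def sample_max_distance_point(mask, initial_xy):
    """Pick the foreground pixel farthest from the initial point (assumed reading)."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        raise ValueError("initial mask is empty")
    x0, y0 = initial_xy
    squared_dist = (xs - x0) ** 2 + (ys - y0) ** 2
    i = int(np.argmax(squared_dist))
    return int(xs[i]), int(ys[i])

def build_prompts(initial_xy, augmented_xy):
    """Stack both points as (x, y) rows, each labeled 1 (foreground)."""
    coords = np.array([initial_xy, augmented_xy], dtype=np.float32)
    labels = np.ones(len(coords), dtype=np.int64)
    return coords, labels

# Toy initial mask: a 3x5 foreground rectangle inside a 6x8 image.
mask = np.zeros((6, 8), dtype=bool)
mask[1:4, 2:7] = True
initial_xy = (2, 1)  # (x, y) of the original point prompt

far_xy = sample_max_distance_point(mask, initial_xy)
coords, labels = build_prompts(initial_xy, far_xy)
```

The (x, y) coordinate rows with one label per point are an assumed layout chosen for illustration; adapt them to whatever prompt format the segmentation model expects.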

Figure 1: The framework of our SAMAug model.
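
The two-pass flow of the framework (initial prompt, initial mask, augmented prompt, augmented mask) can be sketched as below. This is an illustration under stated assumptions: `toy_segment` is a hypothetical stand-in for SAM that simply draws a disk around each positive point, and `pick_farthest` is one possible augmentation rule. Only the control flow is meant to carry over.

```python
import numpy as np

def samaug_two_pass(segment, initial_xy, pick_point):
    """Run a segmenter twice: once with the initial point, once with an added point.

    segment(coords, labels) -> boolean mask  (hypothetical interface)
    pick_point(mask, initial_xy) -> (x, y)   (any augmentation strategy)
    """
    coords = np.array([initial_xy], dtype=np.float32)
    labels = np.ones(1, dtype=np.int64)
    initial_mask = segment(coords, labels)

    extra_xy = pick_point(initial_mask, initial_xy)
    coords = np.vstack([coords, np.array([extra_xy], dtype=np.float32)])
    labels = np.ones(len(coords), dtype=np.int64)
    augmented_mask = segment(coords, labels)
    return initial_mask, augmented_mask

def toy_segment(coords, labels, shape=(9, 9), radius=2):
    """Stand-in for SAM (NOT the real model): union of disks around positive points."""
    yy, xx = np.mgrid[: shape[0], : shape[1]]
    mask = np.zeros(shape, dtype=bool)
    for (x, y), label in zip(coords, labels):
        if label == 1:
            mask |= (xx - x) ** 2 + (yy - y) ** 2 <= radius ** 2
    return mask

def pick_farthest(mask, initial_xy):
    """Choose the mask pixel farthest from the initial point."""
    ys, xs = np.nonzero(mask)
    squared_dist = (xs - initial_xy[0]) ** 2 + (ys - initial_xy[1]) ** 2
    i = int(np.argmax(squared_dist))
    return int(xs[i]), int(ys[i])

initial_mask, augmented_mask = samaug_two_pass(toy_segment, (4, 4), pick_farthest)
```

Because the segmenter is passed in as a callable, the same loop could wrap a real model; no retraining is involved, which matches the review's description of SAMAug as a prompt-only method.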

EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: Can automatic visual point augmentation improve SAM segmentation without additional user input or model retraining?
RQ2: Which point-prompt augmentation strategies (random, max entropy, max distance, saliency) most effectively improve SAM performance across diverse domains?
RQ3: How do box-based prompts compare to point prompts in augmented SAM performance across datasets?
RQ4: What are the limitations and dataset-specific effects of SAMAug, and how might active learning further enhance prompt augmentation?

KEY FINDINGS

Dataset	Initial	Random	Max Entropy	Max Distance	Saliency	GT Random	GT Max Entropy	GT Max Distance
COCO	0.601 ± 0.002	0.614 ± 0.002	0.621 ± 0.004	0.651 ± 0.002	0.631 ± 0.001	0.794 ± 0.005	0.781 ± 0.004	0.797 ± 0.007
Fundus	0.766 ± 0.008	0.794 ± 0.007	0.791 ± 0.006	0.802 ± 0.007	0.792 ± 0.002	0.840 ± 0.008	0.796 ± 0.010	0.849 ± 0.010
COVID QU-Ex	0.488 ± 0.003	0.503 ± 0.003	0.490 ± 0.003	0.497 ± 0.002	0.495 ± 0.002	0.556 ± 0.001	0.526 ± 0.002	0.454 ± 0.004
ISIC2018	0.662 ± 0.009	0.688 ± 0.014	0.687 ± 0.007	0.668 ± 0.011	0.739 ± 0.018	0.797 ± 0.001	0.773 ± 0.007	0.701 ± 0.003
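
As a quick arithmetic check on the table above, the gain of each non-GT strategy over the initial single-point result can be computed directly from the reported mean Dice scores. The snippet below copies only the means (standard deviations are ignored), and the helper names are illustrative.

```python
# Mean Dice scores copied from the table (non-GT columns only).
dice = {
    "COCO": {"Initial": 0.601, "Random": 0.614, "Max Entropy": 0.621,
             "Max Distance": 0.651, "Saliency": 0.631},
    "Fundus": {"Initial": 0.766, "Random": 0.794, "Max Entropy": 0.791,
               "Max Distance": 0.802, "Saliency": 0.792},
    "COVID QU-Ex": {"Initial": 0.488, "Random": 0.503, "Max Entropy": 0.490,
                    "Max Distance": 0.497, "Saliency": 0.495},
    "ISIC2018": {"Initial": 0.662, "Random": 0.688, "Max Entropy": 0.687,
                 "Max Distance": 0.668, "Saliency": 0.739},
}

def gains(row):
    """Dice gain of each strategy over the initial result, rounded to 3 decimals."""
    base = row["Initial"]
    return {name: round(score - base, 3)
            for name, score in row.items() if name != "Initial"}

def best_strategy(row):
    """Name of the strategy with the largest gain for one dataset."""
    g = gains(row)
    return max(g, key=g.get)

for dataset, row in dice.items():
    print(dataset, gains(row), "best:", best_strategy(row))
```

Differences this small should be read alongside the standard deviations in the table, which the snippet does not take into account.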

SAMAug improves SAM performance across datasets, with gains of about 0.01–0.05 Dice on COCO using augmented prompts.
On Fundus, SAMAug yields a Dice improvement of about 0.025–0.036 over the initial result (0.766), with Max Distance best (0.802).
On COVID QU-Ex, gains are small: at most about 0.015 Dice (Random, 0.503 vs. 0.488), and Max Entropy is nearly unchanged (0.490).
On ISIC2018, gains range from about 0.006 (Max Distance, 0.668) to about 0.077 (Saliency, 0.739), with Saliency providing the largest gain.
Box prompts generally outperform point prompts when the outer GT box is available, achieving higher Dice scores (e.g., COCO 0.89, Fundus 0.904) compared to point-based augmentation.
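Ground-truth-based augmentation (GT Random/Max Entropy/Max Distance) indicates a potential upper limit for two-prompt setups, though not a strict one: on COVID QU-Ex, GT Max Distance (0.454) falls below the initial result (0.488).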
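
All scores above are Dice coefficients. For reference, a minimal sketch of the standard Dice score between two binary masks is shown below; how the paper thresholds masks or averages over images is not stated in this review, so those details are assumptions, as is the convention of returning 1.0 when both masks are empty.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks empty: treated as a perfect match (assumed convention).
        return 1.0
    return float(2.0 * intersection / total)

# Two 1x4 masks that overlap in one of their two foreground pixels.
example = dice_score([[1, 1, 0, 0]], [[0, 1, 1, 0]])
```

A gain of 0.05 Dice, as reported for Max Distance on COCO, therefore corresponds to a noticeably larger overlap between prediction and ground truth relative to the combined mask area.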

Figure 2: Sample segmentation results by SAM with different point prompt augmentation strategies. Column "GT" shows the ground truth segmentation mask. Column "SAM" is the segmentation result using a single point prompt.

This review was generated by AI and checked by a human editor.