QUICK REVIEW

[Paper Review] SAM Meets Robotic Surgery: An Empirical Study in Robustness Perspective

An Wang, Mobarakol Islam|arXiv (Cornell University)|Apr 28, 2023

Artificial Intelligence in Healthcare and Education|Cited by 12

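One-line summary

SAM generalizes well zero-shot to robotic surgical instrument segmentation when given bounding box prompts, but it performs worse with point prompts and in the unprompted setting, and it lacks robustness to a variety of noise types.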

ABSTRACT

Segment Anything Model (SAM) is a foundation model for semantic segmentation and shows excellent generalization capability with the prompts. In this empirical study, we investigate the robustness and zero-shot generalizability of the SAM in the domain of robotic surgery in various settings of (i) prompted vs. unprompted; (ii) bounding box vs. points-based prompt; (iii) generalization under corruptions and perturbations with five severity levels; and (iv) state-of-the-art supervised model vs. SAM. We conduct all the observations with two well-known robotic instrument segmentation datasets of MICCAI EndoVis 2017 and 2018 challenges. Our extensive evaluation results reveal that although SAM shows remarkable zero-shot generalization ability with bounding box prompts, it struggles to segment the whole instrument with point-based prompts and unprompted settings. Furthermore, our qualitative figures demonstrate that the model either failed to predict the parts of the instrument mask (e.g., jaws, wrist) or predicted parts of the instrument as different classes in the scenario of overlapping instruments within the same bounding box or with the point-based prompt. In fact, it is unable to identify instruments in some complex surgical scenarios of blood, reflection, blur, and shade. Additionally, SAM is insufficiently robust to maintain high performance when subjected to various forms of data corruption. Therefore, we can argue that SAM is not ready for downstream surgical tasks without further domain-specific fine-tuning.

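Motivation and objectives

Evaluate zero-shot segmentation performance with bounding box and point prompts.
Assess SAM's robustness to data corruptions and perturbations across the EndoVis 2017 and 2018 datasets.
Investigate SAM's automatic mask generation on unprompted surgical scenes.
Compare SAM with conventional supervised segmentation methods on binary and instrument-wise segmentation tasks.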

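Method

Generate binary and instrument-wise segmentations on the EndoVis 2017 and 2018 datasets using bounding box and single-point prompts.
To limit misclassification, label each bounding box and derive the instrument-wise labels from the prompts.
Evaluate 18 corruption types at five severity levels, following a robustness evaluation protocol.
Evaluate SAM's unprompted automatic mask generation on surgical scenes with default settings.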
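
The two prompt types can be illustrated with a minimal sketch. It assumes, for illustration only, that prompts are derived from a ground-truth binary mask: the box is the tight bounding box of the foreground and the single point is the foreground pixel nearest the mask centroid. The function names `box_prompt_from_mask` and `point_prompt_from_mask` are hypothetical, and the summary above does not state how the authors constructed their prompts.

```python
def foreground_coords(mask):
    """List (x, y) coordinates of non-zero pixels in a 2D mask (list of rows)."""
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]


def box_prompt_from_mask(mask):
    """Tight bounding box (x_min, y_min, x_max, y_max), or None for an empty mask."""
    coords = foreground_coords(mask)
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def point_prompt_from_mask(mask):
    """Foreground pixel closest to the mask centroid, or None for an empty mask.

    Picking a foreground pixel (rather than the raw centroid) keeps the point
    on the instrument even when the mask is concave. This is an assumption of
    the sketch, not a documented choice of the paper.
    """
    coords = foreground_coords(mask)
    if not coords:
        return None
    cx = sum(x for x, _ in coords) / len(coords)
    cy = sum(y for _, y in coords) / len(coords)
    return min(coords, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)


if __name__ == "__main__":
    toy_mask = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    print(box_prompt_from_mask(toy_mask))
    print(point_prompt_from_mask(toy_mask))
```

A box constrains the extent of the object, while a single point only marks one location on it, which is one plausible reason the point-prompted results are weaker.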

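Experimental results

Research questions

RQ1: Can SAM achieve accurate instrument segmentation in robotic surgery when prompted with bounding boxes or points?
RQ2: How does SAM's performance degrade under common data corruptions and perturbations, compared with other methods?
RQ3: Can SAM's automatic, unprompted segmentation of surgical scenes be relied upon?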
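
Answering these questions requires a score for how well a predicted mask matches the ground truth. The summary here does not name the metric, so the sketch below assumes intersection over union (IoU), a common choice for segmentation. The function name `iou` and the convention of returning 1.0 when both masks are empty are assumptions of this example.

```python
def iou(pred, target):
    """Intersection over union of two binary masks given as lists of rows.

    Returns 1.0 when both masks are empty (an assumed convention: nothing to
    segment and nothing predicted counts as a perfect match).
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 1.0


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [1, 0]]
    print(iou(predicted, ground_truth))
```

Comparing this score on clean and corrupted versions of the same image gives a direct view of the degradation asked about in RQ2.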

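Key findings

With bounding box prompts, SAM outperforms several conventional supervised methods on binary and instrument-wise segmentation on EndoVis 2017 and 2018.
Single-point prompts degrade performance markedly, indicating a strong dependence on informative prompts.
Segmenting the whole instrument is difficult in complex scenes and when instruments overlap within the same bounding box.
Under data corruption, performance drops markedly across many corruption types and severity levels, including JPEG compression and Gaussian noise.
Unprompted automatic mask generation yields fragmented masks with limited instrument-level semantics on surgical scenes, which restricts practical downstream use.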
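
As an illustration of what a corruption with five severity levels looks like, the following minimal sketch applies additive Gaussian noise to a grayscale image with intensities in [0, 1]. The sigma values in `SIGMA_BY_SEVERITY` are hypothetical and are not taken from the paper, and the names `gaussian_noise` and `mean_abs_diff` are invented for this example.

```python
import random

# Hypothetical noise standard deviations for severity levels 1..5.
SIGMA_BY_SEVERITY = (0.04, 0.08, 0.12, 0.18, 0.26)


def gaussian_noise(image, severity, seed=0):
    """Add zero-mean Gaussian noise to an image (list of rows, values in [0, 1]).

    The result is clipped back to [0, 1]. A fixed seed keeps the corruption
    reproducible across runs.
    """
    if not 1 <= severity <= len(SIGMA_BY_SEVERITY):
        raise ValueError("severity must be between 1 and 5")
    rng = random.Random(seed)
    sigma = SIGMA_BY_SEVERITY[severity - 1]
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference between two same-sized images."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    count = sum(len(row) for row in a)
    return total / count


if __name__ == "__main__":
    clean = [[0.5] * 16 for _ in range(16)]
    for level in range(1, 6):
        noisy = gaussian_noise(clean, level)
        print(level, round(mean_abs_diff(clean, noisy), 4))
```

In a robustness study, each corrupted image would be segmented and scored against the unchanged ground truth, so that the drop relative to the clean score can be reported per corruption type and severity.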

This review was generated by AI and checked by a human editor.