QUICK REVIEW

[Paper Review] SAM Meets Robotic Surgery: An Empirical Study in Robustness Perspective

An Wang, Mobarakol Islam|arXiv (Cornell University)|Apr 28, 2023

Artificial Intelligence in Healthcare and Education | Cited by 12

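One-Sentence Summary

With bounding-box prompts, SAM shows strong zero-shot generalization on robotic instrument segmentation, but it performs poorly with single-point prompts and in the unprompted setting, and it is not sufficiently robust under a range of corruptions and perturbations.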

ABSTRACT

Segment Anything Model (SAM) is a foundation model for semantic segmentation and shows excellent generalization capability with the prompts. In this empirical study, we investigate the robustness and zero-shot generalizability of the SAM in the domain of robotic surgery in various settings of (i) prompted vs. unprompted; (ii) bounding box vs. points-based prompt; (iii) generalization under corruptions and perturbations with five severity levels; and (iv) state-of-the-art supervised model vs. SAM. We conduct all the observations with two well-known robotic instrument segmentation datasets of MICCAI EndoVis 2017 and 2018 challenges. Our extensive evaluation results reveal that although SAM shows remarkable zero-shot generalization ability with bounding box prompts, it struggles to segment the whole instrument with point-based prompts and unprompted settings. Furthermore, our qualitative figures demonstrate that the model either failed to predict the parts of the instrument mask (e.g., jaws, wrist) or predicted parts of the instrument as different classes in the scenario of overlapping instruments within the same bounding box or with the point-based prompt. In fact, it is unable to identify instruments in some complex surgical scenarios of blood, reflection, blur, and shade. Additionally, SAM is insufficiently robust to maintain high performance when subjected to various forms of data corruption. Therefore, we can argue that SAM is not ready for downstream surgical tasks without further domain-specific fine-tuning.

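Research Motivation and Objectives

Evaluate SAM's zero-shot segmentation performance in robotic surgery with bounding-box and point prompts.
Evaluate SAM's robustness to data corruptions and perturbations on the EndoVis 2017 and 2018 datasets.
Examine SAM's automatic mask generation in unprompted surgical scenes.
Compare SAM with conventional supervised segmentation methods on binary and instrument-level segmentation tasks.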

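Proposed Method

Run SAM with bounding-box and single-point prompts to produce binary and instrument-level segmentations on the EndoVis 2017 and 2018 datasets.
Annotate the bounding boxes and derive instrument-level labels from the prompts to reduce misclassification.
Following a robustness benchmark, evaluate SAM under 18 corruption types at 5 severity levels each.
Evaluate SAM's unprompted automatic mask generation on surgical scenes using the default settings.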
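The difference between the two prompt types, and how a predicted mask can be scored against a reference, can be made concrete with a small sketch. It is illustrative only: the review says the bounding boxes were annotated, whereas this sketch derives a box and a point from a reference mask for convenience. The helper names (`box_prompt`, `point_prompt`, `iou`) are hypothetical and are not part of SAM or of the paper's code, and the SAM call itself is left out.

```python
from typing import List, Tuple

Mask = List[List[int]]  # rows of 0/1 values; 1 marks instrument pixels


def _foreground(mask: Mask) -> List[Tuple[int, int]]:
    """Collect (x, y) coordinates of all foreground pixels."""
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]


def box_prompt(mask: Mask) -> Tuple[int, int, int, int]:
    """Tight (x_min, y_min, x_max, y_max) box around the foreground (assumed prompt format)."""
    coords = _foreground(mask)
    if not coords:
        raise ValueError("mask has no foreground pixels")
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def point_prompt(mask: Mask) -> Tuple[int, int]:
    """Single (x, y) foreground pixel closest to the foreground centroid."""
    coords = _foreground(mask)
    if not coords:
        raise ValueError("mask has no foreground pixels")
    cx = sum(x for x, _ in coords) / len(coords)
    cy = sum(y for _, y in coords) / len(coords)
    return min(coords, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)


def iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 1.0


# Toy reference mask and a prediction that covers only half of the instrument.
reference = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
partial = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

print(box_prompt(reference), point_prompt(reference), iou(partial, reference))
```

A box constrains the full extent of the object, while a single point only says "something here belongs to the object", which is one plausible reading of why the point-prompted results reported in this review are weaker.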

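Experimental Results

Research Questions

RQ1: With bounding-box or point prompts, can SAM achieve accurate instrument segmentation in robotic surgery?
RQ2: Compared with other methods, how much does SAM's performance degrade under common data corruptions and perturbations?
RQ3: Without any prompt, is SAM's automatic segmentation of surgical scenes reliable?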

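Key Findings

With bounding-box prompts, SAM outperforms several earlier supervised methods on binary and instrument-level segmentation for EndoVis 2017 and 2018.
Single-point prompts reduce performance substantially, indicating a strong dependence on informative prompts.
SAM struggles to segment an entire instrument in complex scenes or when overlapping instruments fall within the same bounding box.
Under data corruption, SAM's performance drops markedly across most corruption types and severity levels, with JPEG compression and Gaussian noise having especially large effects.
Unprompted automatic mask generation yields fragmented masks with limited instrument semantics in surgical scenes, which limits practical downstream use.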
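A severity sweep of the kind described in the corruption finding can be sketched as follows. This is a minimal illustration of one corruption type (Gaussian noise) at five severity levels, not the benchmark used in the study. The standard deviations in `SIGMAS` are assumed values chosen for illustration, the image is a toy grayscale grid, and `gaussian_noise` and `mean_abs_change` are hypothetical helper names. In a real evaluation each corrupted image would be passed to the segmentation model and scored, a step that is omitted here.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale pixel intensities in [0, 1]

# Assumed noise standard deviations for severity levels 1..5 (illustrative only).
SIGMAS = [0.08, 0.12, 0.18, 0.26, 0.38]


def gaussian_noise(image: Image, severity: int, seed: int = 0) -> Image:
    """Add zero-mean Gaussian noise at the given severity and clip to [0, 1]."""
    if not 1 <= severity <= len(SIGMAS):
        raise ValueError("severity must be between 1 and 5")
    rng = random.Random(seed)  # fixed seed so the sweep is reproducible
    sigma = SIGMAS[severity - 1]
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


def mean_abs_change(original: Image, corrupted: Image) -> float:
    """Average absolute per-pixel difference between two images."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(original, corrupted):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


# Toy 8x8 mid-gray image standing in for a surgical frame.
clean = [[0.5 for _ in range(8)] for _ in range(8)]

for level in range(1, 6):
    noisy = gaussian_noise(clean, level)
    print(f"severity {level}: mean absolute change = {mean_abs_change(clean, noisy):.3f}")
```

Reporting a metric per corruption type and per severity level, rather than one aggregate number, is what makes it possible to single out the corruptions that hurt most.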

This review was generated by AI and reviewed by human editors.