QUICK REVIEW

[Paper Review] The Segment Anything Model (SAM) for Remote Sensing Applications: From Zero to One Shot

Lucas Prado Osco, Qiusheng Wu|arXiv (Cornell University)|Jun 29, 2023
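Remote-Sensing Image Classification|Cited by 20
One-Sentence Summary

This study evaluates SAM for remote sensing applications on multi-scale UAV, airborne, and satellite imagery, proposes a one-shot, text-prompt-driven enhancement using GroundingDINO, and shares open-source code for adapting SAM to geospatial work.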

ABSTRACT

Segmentation is an essential step for remote sensing image processing. This study aims to advance the application of the Segment Anything Model (SAM), an innovative image segmentation model by Meta AI, in the field of remote sensing image analysis. SAM is known for its exceptional generalization capabilities and zero-shot learning, making it a promising approach to processing aerial and orbital images from diverse geographical contexts. Our exploration involved testing SAM across multi-scale datasets using various input prompts, such as bounding boxes, individual points, and text descriptors. To enhance the model's performance, we implemented a novel automated technique that combines a text-prompt-derived general example with one-shot training. This adjustment resulted in an improvement in accuracy, underscoring SAM's potential for deployment in remote sensing imagery and reducing the need for manual annotation. Despite the limitations encountered with lower spatial resolution images, SAM exhibits promising adaptability to remote sensing data analysis. We recommend future research to enhance the model's proficiency through integration with supplementary fine-tuning techniques and other networks. Furthermore, we provide the open-source code of our modifications on online repositories, encouraging further and broader adaptations of SAM to the remote sensing domain.

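Research Motivation and Objectives

  • Evaluate SAM's zero-shot segmentation performance on diverse remote sensing datasets (UAV, airborne, satellite).
  • Develop and evaluate a one-shot, text-prompt-based fine-tuning method to improve SAM's segmentation of remote sensing objects.
  • Compare how the prompt modalities (bounding boxes, points, text) differ in segmentation quality on remote sensing imagery.
  • Provide open-source tools to support SAM-based geospatial segmentation workflows.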

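Proposed Method

  • Adapt SAM (ViT-H backbone) to remote sensing data and prompts (zero-shot and one-shot).
  • Evaluate prompts including bounding boxes, points, and text descriptors, as well as a text-based one-shot approach guided by GroundingDINO.
  • Implement one-shot training with a PerSAM-F-style approach, using two learnable weights for the multi-scale masks and Dice/Sigmoid Focal losses.
  • Use datasets at three levels (UAV, airborne, satellite) to test generalization across resolutions and targets.
  • Develop the SamGeo toolkit to generate masks, merge the outputs into mosaicked rasters, and convert them to vectors.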
Figure 1: Schematic representation of the step-by-step process undertaken in this study to evaluate the efficacy of SAM’s approach in remote sensing image processing tasks.
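The PerSAM-F-style step in the method list can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes that SAM returns three mask score maps at different scales, that the two learnable weights `w1` and `w2` mix them with the third weight fixed to `1 - w1 - w2`, and that the training objective is the sum of a Dice loss and a sigmoid focal loss. The function names, the toy 2x2 maps, and the focal-loss defaults (`alpha=0.25`, `gamma=2.0`) are all assumptions chosen for the example.

```python
import math


def sigmoid(x):
    """Logistic function mapping a mask score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_masks(m1, m2, m3, w1, w2):
    """Weighted sum of three mask score maps; the third weight is 1 - w1 - w2."""
    w3 = 1.0 - w1 - w2
    return [
        [w1 * a + w2 * b + w3 * c for a, b, c in zip(r1, r2, r3)]
        for r1, r2, r3 in zip(m1, m2, m3)
    ]


def dice_loss(scores, target, eps=1.0):
    """Soft Dice loss between sigmoid(scores) and a binary target mask."""
    probs = [sigmoid(x) for row in scores for x in row]
    truth = [t for row in target for t in row]
    inter = sum(p * t for p, t in zip(probs, truth))
    return 1.0 - (2.0 * inter + eps) / (sum(probs) + sum(truth) + eps)


def sigmoid_focal_loss(scores, target, alpha=0.25, gamma=2.0):
    """Mean sigmoid focal loss; alpha and gamma are assumed common defaults."""
    total, count = 0.0, 0
    for srow, trow in zip(scores, target):
        for x, t in zip(srow, trow):
            p = sigmoid(x)
            p_t = p if t == 1 else 1.0 - p
            alpha_t = alpha if t == 1 else 1.0 - alpha
            total += -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, 1e-12))
            count += 1
    return total / count


def total_loss(scores, target):
    """Combined objective: Dice loss plus sigmoid focal loss."""
    return dice_loss(scores, target) + sigmoid_focal_loss(scores, target)


# Toy data (hypothetical): the object covers the top row of a 2x2 image.
target = [[1, 1], [0, 0]]
m1 = [[4.0, -4.0], [-4.0, -4.0]]  # smallest scale: only part of the object
m2 = [[4.0, 4.0], [-4.0, -4.0]]   # middle scale: matches the object
m3 = [[4.0, 4.0], [4.0, 4.0]]     # largest scale: over-segments the scene

fused = fuse_masks(m1, m2, m3, w1=0.2, w2=0.6)
loss_fused = total_loss(fused, target)
loss_whole = total_loss(m3, target)
print(loss_fused < loss_whole)
```

In an actual one-shot setup only `w1` and `w2` would be updated by gradient descent on the single reference mask while SAM itself stays frozen; the sketch evaluates the loss for one fixed pair of weights to show what is being optimized.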

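Experimental Results

Research Questions

  • RQ1: How well does SAM perform zero-shot segmentation on multi-scale remote sensing imagery spanning UAV, airborne, and satellite data?
  • RQ2: Does a one-shot, text-prompt-driven enhancement that combines a text prompt with a single example improve SAM's segmentation of remote sensing objects?
  • RQ3: Which prompt modalities (boxes, points, text) guide SAM effectively in remote sensing scenes, and how do they compare with one another?
  • RQ4: Which open-source tools and workflows can support SAM-based geospatial segmentation in practice?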
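Answering RQ1 to RQ3 requires scoring each predicted mask against a reference mask. This review does not name the metrics used, so the sketch below assumes two common choices, intersection over union (IoU) and the Dice coefficient. The function name `iou_and_dice` and the toy masks are hypothetical and do not reproduce any result from the paper.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            if p and t:
                tp += 1          # pixel correctly marked as object
            elif p and not t:
                fp += 1          # pixel wrongly marked as object
            elif t and not p:
                fn += 1          # object pixel that was missed
    union = tp + fp + fn
    if union == 0:
        # Both masks are empty: treat as a perfect match.
        return 1.0, 1.0
    return tp / union, 2 * tp / (2 * tp + fp + fn)


# Toy reference mask and three candidate masks (made up for illustration).
truth = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
candidates = {
    "candidate_a": [[1, 1, 0], [1, 1, 0], [0, 0, 0]],  # exact match
    "candidate_b": [[1, 1, 0], [1, 0, 0], [0, 0, 0]],  # misses one pixel
    "candidate_c": [[1, 1, 1], [1, 1, 1], [0, 0, 0]],  # two extra pixels
}

scores = {name: iou_and_dice(mask, truth) for name, mask in candidates.items()}
for name, (iou, dice) in scores.items():
    print(f"{name}: IoU={iou:.3f} Dice={dice:.3f}")
```

Running the same function over the masks produced by each prompt type on the same image gives a like-for-like comparison, which is the kind of comparison RQ3 asks for.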

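Key Findings

  • SAM shows potential for remote sensing segmentation across UAV, airborne, and satellite imagery, and is flexible with respect to prompt type.
  • A one-shot, text-based approach (combining GroundingDINO with SAM) improves segmentation by deriving a targeted object representation from the text prompt.
  • PerSAM-F-style fine-tuning, with two learnable weights for the multi-scale masks, addresses the hierarchical object structures common in remote sensing and yields better segmentation fidelity.
  • The authors provide open-source code and a geospatial segmentation package to help adapt SAM to remote sensing workflows.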
Figure 2: Collection of image samples utilized in our research. The top row features UAV-based imagery with bounding boxes and point labels, serving as prompts for SAM. The middle row displays airborne-captured data representing larger regions, with both points and a rectangular box provided as mode
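The GroundingDINO-to-SAM hand-off mentioned in the findings needs a small coordinate conversion, sketched below. It assumes the text-grounded detector reports each box as normalized `(cx, cy, w, h)` and that the segmenter expects a pixel-space `(x0, y0, x1, y1)` box or an `(x, y)` point prompt; both conventions should be checked against the library versions in use. The helper names are hypothetical and are not part of either library.

```python
def cxcywh_norm_to_xyxy_pixels(box, width, height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1).

    The result is clamped to the image bounds so it stays a valid box prompt.
    """
    cx, cy, w, h = box
    x0 = max(0.0, (cx - w / 2.0) * width)
    y0 = max(0.0, (cy - h / 2.0) * height)
    x1 = min(float(width), (cx + w / 2.0) * width)
    y1 = min(float(height), (cy + h / 2.0) * height)
    return (x0, y0, x1, y1)


def box_center(xyxy):
    """Center of a pixel box, usable as a single foreground point prompt."""
    x0, y0, x1, y1 = xyxy
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


# Hypothetical detection for a text prompt such as "tree" on a 1000x500 tile.
detected = (0.5, 0.5, 0.2, 0.4)
box_prompt = cxcywh_norm_to_xyxy_pixels(detected, width=1000, height=500)
point_prompt = box_center(box_prompt)
print(box_prompt, point_prompt)
```

The resulting box or point can then be passed to the segmenter as a prompt, so the only manual input is the text description.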

This review was generated by AI and reviewed by human editors.