QUICK REVIEW

[Paper Review] MSA$^2$Net: Multi-scale Adaptive Attention-guided Network for Medical Image Segmentation

Sina Ghorbani Kolahi, S. Kamal Chaharsooghi|arXiv (Cornell University)|Jul 31, 2024
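Brain Tumor Detection and Classification | Cited by 5
One-Sentence Summary

MSA$^2$Net introduces MASAG, a multi-scale adaptive spatial attention gate within a hybrid CNN-Transformer framework, which fuses encoder and decoder features for accurate medical image segmentation and achieves state-of-the-art results on the Synapse and ISIC2018 datasets.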

ABSTRACT

Medical image segmentation involves identifying and separating object instances in a medical image to delineate various tissues and structures, a task complicated by the significant variations in size, shape, and density of these features. Convolutional neural networks (CNNs) have traditionally been used for this task but have limitations in capturing long-range dependencies. Transformers, equipped with self-attention mechanisms, aim to address this problem. However, in medical image segmentation it is beneficial to merge both local and global features to effectively integrate feature maps across various scales, capturing both detailed features and broader semantic elements for dealing with variations in structures. In this paper, we introduce MSA$^2$Net, a new deep segmentation framework featuring an expedient design of skip-connections. These connections facilitate feature fusion by dynamically weighting and combining coarse-grained encoder features with fine-grained decoder feature maps. Specifically, we propose a Multi-Scale Adaptive Spatial Attention Gate (MASAG), which dynamically adjusts the receptive field (Local and Global contextual information) to ensure that spatially relevant features are selectively highlighted while minimizing background distractions. Extensive evaluations involving dermatological and radiological datasets demonstrate that our MSA$^2$Net outperforms state-of-the-art (SOTA) works or matches their performance. The source code is publicly available at https://github.com/xmindflow/MSA-2Net.

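Research Motivation and Objectives

  • Address the variability in organ size, shape, and density in medical images.
  • Fuse local and global features between the encoder and decoder through adaptive skip connections.
  • Develop a module (MASAG) that dynamically recalibrates the receptive field and highlights spatially relevant features.
  • Validate the method on the multi-organ Synapse dataset and the ISIC2018 dataset, using a boundary-aware loss.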

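Proposed Method

  • Propose MASAG (Multi-Scale Adaptive Spatial Attention Gate) to fuse encoder and decoder features with a dynamically adjusted receptive field.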
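  • Use a hybrid decoder with Dual Attention Enhanced Transformer (DAE-Former) blocks in the shallow layers and Large Kernel Attention (LKA) blocks in the deeper layers, as described in the Figure 1 caption.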
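  • Perform multi-scale feature fusion that combines local context extraction with global context extraction.
  • Combine spatial selection, spatial interaction, cross-modulation, and recalibration to refine the feature maps for accurate segmentation.
  • Use a MaxViT-based encoder with pretrained weights and a boundary-aware BDoU loss for boundary delineation.
  • Evaluate on the Synapse (multi-organ CT) and ISIC2018 (skin lesion) datasets.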
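
The general idea of an attention-gated skip connection, in which a spatial gate decides per pixel how much of the encoder feature and how much of the decoder feature to keep, can be sketched as follows. This is a deliberately simplified, dependency-free illustration and not the MASAG module itself: the function names (`local_context`, `global_context`, `gated_skip_fusion`) are hypothetical, the maps are single-channel nested lists, and a 3x3 mean and a global mean stand in for the learned local and global context branches.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def local_context(fmap):
    """3x3 mean over the valid neighbours of each pixel.

    Assumption: a fixed mean filter stands in for a learned small-kernel conv.
    """
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            rows = range(max(0, i - 1), min(h, i + 2))
            cols = range(max(0, j - 1), min(w, j + 2))
            vals = [fmap[r][c] for r in rows for c in cols]
            out[i][j] = sum(vals) / len(vals)
    return out


def global_context(fmap):
    """Mean over the whole map.

    Assumption: global average pooling stands in for a large receptive field.
    """
    total = sum(sum(row) for row in fmap)
    return total / (len(fmap) * len(fmap[0]))


def gated_skip_fusion(enc, dec):
    """Blend encoder and decoder maps with a per-pixel gate in (0, 1).

    The gate is driven by local encoder context plus global decoder context
    (an illustrative choice, not the paper's formulation).
    """
    enc_local = local_context(enc)
    dec_global = global_context(dec)
    h, w = len(enc), len(enc[0])
    gate = [[0.0] * w for _ in range(h)]
    fused = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            g = sigmoid(enc_local[i][j] + dec_global)
            gate[i][j] = g
            # Convex combination: g weights the encoder, (1 - g) the decoder.
            fused[i][j] = g * enc[i][j] + (1.0 - g) * dec[i][j]
    return fused, gate


# Toy usage with 2x2 single-channel maps.
enc = [[1.0, 2.0], [3.0, 4.0]]
dec = [[0.0, 0.0], [0.0, 0.0]]
fused, gate = gated_skip_fusion(enc, dec)
```

Because the gate is a convex weight, every fused value lies between the corresponding encoder and decoder values. In a real network the gate would come from learned convolutions over multi-channel tensors.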
Figure 1: Our proposed segmentation network, called MSA$^2$Net, is composed of an encoder (using pretrained MaxViT block) and a decoder (comprising DAE-Former blocks in shallow layers and LKA blocks in deeper ones). The encoding-decoding feature fusion is performed via our novel MASAG m

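Experimental Results

Research Questions

  • RQ1: Can MASAG dynamically recalibrate the receptive field to improve segmentation of targets at different scales?
  • RQ2: Does the hybrid encoder-decoder with MASAG outperform state-of-the-art CNN-Transformer models on medical image segmentation?
  • RQ3: How do local-global context fusion and adaptive skip connections affect boundary accuracy and the overall DSC/HD95 metrics?
  • RQ4: Is the method robust across radiological and dermatological imaging modalities?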
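
For reference, the two metrics named in RQ3 are commonly defined as below for a predicted mask $P$ and a ground-truth mask $G$ with boundaries $\partial P$ and $\partial G$. These are the usual textbook forms, given here as an assumption; this summary does not state the exact evaluation protocol used in the paper.

```latex
\[
\mathrm{DSC}(P, G) = \frac{2\,\lvert P \cap G \rvert}{\lvert P \rvert + \lvert G \rvert}
\]
\[
d_{95}(A, B) = \operatorname{perc}_{95}
  \Bigl\{ \min_{b \in \partial B} \lVert a - b \rVert_2 \;:\; a \in \partial A \Bigr\},
\qquad
\mathrm{HD95}(P, G) = \max\bigl( d_{95}(P, G),\; d_{95}(G, P) \bigr)
\]
```

A higher DSC and a lower HD95 are better. Some toolkits pool both directed distance sets before taking the 95th percentile, so HD95 values can differ slightly between implementations.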

Key Findings

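| Method | Params (M) | FLOPs (G) | Spl. | RKid. | LKid. | Gal. | Liv. | Sto. | Aor. | Pan. | Avg. DSC ↑ | Avg. HD95 ↓ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TransUNet [Chen et al. (2021)] | 96.07 | 88.91 | 85.08 | 77.02 | 81.87 | 63.16 | 94.08 | 75.62 | 87.23 | 55.86 | 77.49 | 31.69 |
| Swin-UNet [Cao et al. (2022)] | 27.17 | 6.16 | 90.66 | 79.61 | 83.28 | 66.53 | 94.29 | 76.60 | 85.47 | 56.58 | 79.13 | 21.55 |
| MISSFormer [Huang et al. (2021)] | 42.46 | 9.89 | 91.92 | 82.00 | 85.21 | 68.65 | 94.41 | 80.81 | 86.99 | 65.67 | 81.96 | 18.20 |
| ScaleFormer [Huang et al. (2022)] | 111.6 | 48.93 | 89.40 | 83.31 | 86.36 | 74.97 | 95.12 | 80.14 | 88.73 | 64.85 | 82.86 | 16.81 |
| HiFormer-B [Heidari et al. (2023)] | 25.51 | 8.045 | 90.99 | 79.77 | 85.23 | 65.23 | 94.61 | 81.08 | 86.21 | 59.52 | 80.39 | 14.70 |
| DAEFormer [Azad et al. (2023a)] | 48.07 | 27.89 | 91.82 | 82.39 | 87.66 | 71.65 | 95.08 | 80.77 | 87.84 | 63.93 | 82.63 | 16.39 |
| 2D D-LKA Net [Azad et al. (2023b)] | 101.64 | 19.92 | 91.22 | 84.92 | 88.38 | 73.79 | 94.88 | 84.94 | 88.34 | 67.71 | 84.27 | 20.04 |
| MSA$^2$Net (Ours) | 112.77 | 15.56 | 92.69 | 84.24 | 88.30 | 74.35 | 95.59 | 84.03 | 89.47 | 69.30 | 84.75 | 13.29 |

Organ columns: Spl. = spleen, RKid. = right kidney, LKid. = left kidney, Gal. = gallbladder, Liv. = liver, Sto. = stomach, Aor. = aorta, Pan. = pancreas.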
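  • On the Synapse dataset, MSA$^2$Net reaches an average DSC of 84.75 and an HD95 of 13.29, the best values among the compared SOTA baselines on both metrics.
  • On ISIC2018, MSA$^2$Net reaches DSC 0.9129, SE 0.8840, SP 0.9557, and ACC 0.9640, outperforming a number of prior methods.
  • Ablation studies show that adding MASAG, LKA, and DAE-Former in different configurations yields progressive gains, with the best Dice/HD95 trade-off obtained when all three are present.
  • MSA$^2$Net shows notable improvements on pancreas and aorta segmentation on Synapse (DSC 69.30 and 89.47, the highest in the table), illustrating the benefit of dynamic receptive-field recalibration for both small and large organs.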
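
The pixel-level scores quoted for ISIC2018 (DSC, SE, SP, ACC) all derive from the four confusion counts of a binary mask. The sketch below shows one common way to compute them for a single image. The function names are hypothetical, masks are assumed to be same-shaped nested lists of 0/1, and whether the paper averages per image or pools pixels over the dataset is not stated in this summary.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN over two same-shaped binary masks."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, target):
    """Return Dice (DSC), sensitivity (SE), specificity (SP), accuracy (ACC)."""
    tp, fp, fn, tn = confusion_counts(pred, target)

    def ratio(num, den):
        # Undefined ratios (e.g. no foreground at all) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "DSC": ratio(2 * tp, 2 * tp + fp + fn),
        "SE": ratio(tp, tp + fn),
        "SP": ratio(tn, tn + fp),
        "ACC": ratio(tp + tn, tp + fp + fn + tn),
    }


# Toy usage: 2x3 masks with two true positives, one false positive,
# one false negative, and two true negatives.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
scores = segmentation_metrics(pred, target)
```

DSC ignores true negatives, which is why it is usually preferred over ACC when the lesion or organ occupies only a small part of the image.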
Figure 2: A comparative visual examination of the proposed approach in contrast to different methods employing the Synapse multi-organ segmentation dataset.

This review was generated by AI and reviewed by a human editor.