QUICK REVIEW

[Paper Walkthrough] SCSA: Exploring the Synergistic Effects Between Spatial and Channel Attention

Yunzhong Si, Huiying Xu|arXiv (Cornell University)|Jul 6, 2024

Human-Automation Interaction and Safety | Cited by 8

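One-Sentence Summary

SCSA introduces a plug-and-play Spatial and Channel Synergistic Attention module that combines Shareable Multi-Semantic Spatial Attention (SMSA) with Progressive Channel-wise Self-Attention (PCSA). It uses multi-semantic spatial priors to guide channel learning, improving performance on classification, detection, and segmentation tasks.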

ABSTRACT

Channel and spatial attentions have respectively brought significant improvements in extracting feature dependencies and spatial structure relations for various downstream vision tasks. While their combination is more beneficial for leveraging their individual strengths, the synergy between channel and spatial attentions has not been fully explored, lacking in fully harness the synergistic potential of multi-semantic information for feature guidance and mitigation of semantic disparities. Our study attempts to reveal the synergistic relationship between spatial and channel attention at multiple semantic levels, proposing a novel Spatial and Channel Synergistic Attention module (SCSA). Our SCSA consists of two parts: the Shareable Multi-Semantic Spatial Attention (SMSA) and the Progressive Channel-wise Self-Attention (PCSA). SMSA integrates multi-semantic information and utilizes a progressive compression strategy to inject discriminative spatial priors into PCSA's channel self-attention, effectively guiding channel recalibration. Additionally, the robust feature interactions based on the self-attention mechanism in PCSA further mitigate the disparities in multi-semantic information among different sub-features within SMSA. We conduct extensive experiments on seven benchmark datasets, including classification on ImageNet-1K, object detection on MSCOCO 2017, segmentation on ADE20K, and four other complex scene detection datasets. Our results demonstrate that our proposed SCSA not only surpasses the current state-of-the-art attention but also exhibits enhanced generalization capabilities across various task scenarios. The code and models are available at: https://github.com/HZAI-ZJNU/SCSA.

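Research Motivation and Goals

Investigate how spatial information can guide and strengthen channel attention so as to reduce semantic disparities across multi-semantic feature representations.
Develop a lightweight, plug-and-play attention module that decouples the spatial and channel computations to reduce parameter count and compute cost.
Demonstrate the generalization and effectiveness of the proposed SCSA on classification, detection, and segmentation benchmarks.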

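Proposed Method

Decompose the input features into multi-semantic spatial sub-features using multi-scale, depth-shared 1D convolutions and group normalization (GN), so that the sub-features stay semantically distinct.
Introduce SMSA, which aggregates the multi-semantic spatial information of the sub-features and applies GN followed by a Sigmoid activation to produce spatial priors.
Propose PCSA, which applies progressive compression and then self-attention along the channel dimension to model inter-channel relationships, guided by the SMSA priors.
Integrate SMSA and PCSA in series to form SCSA: SCSA(X) = PCSA(SMSA(X)).
Evaluate on ImageNet-1K for classification, MS COCO 2017 for object detection and instance segmentation, and ADE20K for semantic segmentation, comparing against state-of-the-art attention modules.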
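The serial composition SCSA(X) = PCSA(SMSA(X)) can be sketched in a few lines of NumPy. This is a rough illustration under stated assumptions, not the authors' implementation: the kernel sizes (3, 5, 7, 9), the fixed averaging kernels standing in for learned weights, the identity query/key/value projections, the pooling factor of 2, and all function names are hypothetical choices made only to keep the example self-contained. The official code is at the repository linked in the abstract.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def group_norm(x, groups, eps=1e-5):
    """Normalize a (C, L) array per contiguous channel group (no learned affine)."""
    c, length = x.shape
    g = x.reshape(groups, -1)
    g = (g - g.mean(axis=1, keepdims=True)) / np.sqrt(g.var(axis=1, keepdims=True) + eps)
    return g.reshape(c, length)

def shared_conv1d(x, kernel):
    """Slide one odd-length kernel over every row of x (C, L), 'same' zero padding."""
    pad = len(kernel) // 2
    padded = np.pad(x, ((0, 0), (pad, pad)))
    return np.stack([np.convolve(row, kernel[::-1], mode="valid") for row in padded])

def smsa(x, kernels):
    """Spatial step: x is (C, H, W); each channel sub-feature gets its own kernel size."""
    n = len(kernels)
    gates = []
    # Pool over width to get (C, H), then over height to get (C, W).
    for pooled in (x.mean(axis=2), x.mean(axis=1)):
        subs = np.split(pooled, n, axis=0)              # n sub-features along channels
        mixed = np.concatenate([shared_conv1d(s, k) for s, k in zip(subs, kernels)])
        gates.append(sigmoid(group_norm(mixed, n)))     # GN per sub-feature, then Sigmoid
    gate_h, gate_w = gates
    return x * gate_h[:, :, None] * gate_w[:, None, :]  # spatially reweighted features

def pcsa(x, pool=2):
    """Channel step: compress spatially, then self-attention across channels."""
    c, h, w = x.shape
    hp, wp = h // pool, w // pool
    # Assumed compression: non-overlapping average pooling by `pool`.
    y = x[:, :hp * pool, :wp * pool].reshape(c, hp, pool, wp, pool).mean(axis=(2, 4))
    y = group_norm(y.reshape(c, -1), 1)                 # (C, N) channel tokens
    q = k = v = y                                       # assumption: identity projections
    attn = softmax(q @ k.T / np.sqrt(y.shape[1]))       # (C, C) channel-to-channel weights
    gate = sigmoid((attn @ v).mean(axis=1))             # one gate per channel
    return x * gate[:, None, None]

def scsa(x, kernels):
    """Serial composition: SCSA(X) = PCSA(SMSA(X))."""
    return pcsa(smsa(x, kernels))

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 6, 6))                       # (C, H, W)
kernels = [np.full(k, 1.0 / k) for k in (3, 5, 7, 9)]    # illustrative kernel sizes
y = scsa(x, kernels)
print(y.shape)  # (8, 6, 6)
```

Because both steps only multiply the input by gates in (0, 1), the output keeps the input shape and the module can be dropped between existing layers, which is what makes a design like this plug-and-play.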

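Experimental Results

Research Questions

RQ1: Can spatial attention guided by multi-semantic information improve the learning of channel attention and reduce semantic disparities among sub-features?
RQ2: Can a lightweight, progressively compressed channel self-attention (PCSA) make effective use of spatial priors to improve feature recalibration?
RQ3: How does SCSA compare with existing attention mechanisms across tasks (classification, detection, segmentation) and datasets?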

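Key Findings

SCSA consistently improves Top-1 accuracy on ImageNet-1K with ResNet-50/101 and MobileNetV2-1.0 backbones, outperforming other attention modules.
SCSA achieves higher mIoU on ADE20K segmentation and higher AP on MS COCO object detection and instance segmentation, outperforming competing methods across multiple settings.
Ablation studies indicate that SMSA yields a significant accuracy gain, that PCSA's progressive compression preserves the spatial priors at low cost, and that the serial arrangement (SMSA first, then PCSA) is advantageous.
Applying GN rather than BN across the sub-features reduces semantic interference and improves the use of the spatial priors.
SCSA generalizes well across a range of backbones and tasks and offers a favorable accuracy/efficiency trade-off.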
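One property that may help explain the GN versus BN finding is that group normalization computes its statistics per sample and per channel group, while batch normalization pools statistics over the whole batch. The sketch below illustrates only that general property; it is not taken from the paper, it omits the learned affine parameters, and the tensor sizes and group count of 4 are arbitrary assumptions.

```python
import numpy as np

def group_norm(x, groups, eps=1e-5):
    """x: (B, C, L). Normalize each sample's channel groups independently."""
    b, c, length = x.shape
    g = x.reshape(b, groups, -1)
    g = (g - g.mean(axis=2, keepdims=True)) / np.sqrt(g.var(axis=2, keepdims=True) + eps)
    return g.reshape(b, c, length)

def batch_norm(x, eps=1e-5):
    """x: (B, C, L). Normalize each channel with statistics pooled over the batch."""
    mean = x.mean(axis=(0, 2), keepdims=True)
    var = x.var(axis=(0, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
batch_a = rng.standard_normal((4, 8, 16))
batch_b = batch_a.copy()
# Replace every sample except the first with differently scaled data.
batch_b[1:] = 10.0 * rng.standard_normal((3, 8, 16)) + 5.0

# Does the first sample's normalized output depend on the rest of the batch?
gn_same = np.allclose(group_norm(batch_a, 4)[0], group_norm(batch_b, 4)[0])
bn_same = np.allclose(batch_norm(batch_a)[0], batch_norm(batch_b)[0])
print(gn_same, bn_same)  # True False
```

With four groups, each sub-feature of a sample is standardized using only its own values, so neither other sub-features nor other samples leak into its statistics.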

This walkthrough was generated by AI and reviewed by a human editor.