[Paper Review] ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
ECA-Net introduces an Efficient Channel Attention module that forgoes dimensionality reduction and uses a 1D convolution with adaptive kernel size to capture local cross-channel interactions, yielding performance gains with minimal extra parameters. It achieves competitive or superior results across ImageNet classification and COCO detection/segmentation with lower complexity than existing attention modules.
Recently, channel attention mechanisms have been shown to offer great potential for improving the performance of deep convolutional neural networks (CNNs). However, most existing methods are dedicated to developing more sophisticated attention modules for better performance, which inevitably increases model complexity. To overcome this trade-off between performance and complexity, this paper proposes an Efficient Channel Attention (ECA) module, which involves only a handful of parameters while bringing a clear performance gain. By dissecting the channel attention module in SENet, we empirically show that avoiding dimensionality reduction is important for learning channel attention, and that appropriate cross-channel interaction can preserve performance while significantly decreasing model complexity. Therefore, we propose a local cross-channel interaction strategy without dimensionality reduction, which can be efficiently implemented via 1D convolution. Furthermore, we develop a method to adaptively select the kernel size of the 1D convolution, which determines the coverage of local cross-channel interaction. The proposed ECA module is efficient yet effective: for a ResNet50 backbone, the parameters and computations of our modules versus the backbone are 80 vs. 24.37M and 4.7e-4 GFLOPs vs. 3.86 GFLOPs, respectively, and the performance boost is more than 2% in terms of Top-1 accuracy. We extensively evaluate our ECA module on image classification, object detection, and instance segmentation with ResNet and MobileNetV2 backbones. The experimental results show that our module is more efficient while performing favorably against its counterparts.
Motivation & Objective
- Analyze channel attention mechanisms in terms of the trade-off between model complexity and performance.
- Propose a lightweight attention module that avoids dimensionality reduction while capturing cross-channel interactions.
- Demonstrate that adaptive kernel size for 1D convolution yields effective channel attention.
- Evaluate ECA-Net across image classification, object detection, and instance segmentation tasks.
Proposed method
- Revisit the SE block to analyze effects of dimensionality reduction and cross-channel interaction.
- Propose ECA: replace fully connected excitation with a 1D convolution (C1D) over channel-wise pooled features without dimensionality reduction.
- Use an adaptive kernel size k for the 1D convolution, determined by a non-linear mapping of channel dimension C.
- Implement ECA as a plug-in module replacing SE blocks in existing backbones (ECA-Net).
- Provide a PyTorch implementation and report parameter counts, FLOPs, and accuracy improvements.
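The steps listed above can be sketched in a few lines of dependency-free Python. This is an illustrative sketch, not the authors' PyTorch code: the function names (`eca_kernel_size`, `eca_weights`, `eca_forward`) are hypothetical, the feature map is a plain nested list of shape `[C][H][W]`, and the defaults `gamma=2` and `b=1` for the kernel-size mapping are assumed from the paper rather than stated in this review. The kernel size is taken as the nearest odd number to `(log2(C) + b) / gamma`, and the same `k` weights are shared across all channels.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Map the channel count C to an odd 1D kernel size k.

    gamma and b are assumed defaults, not values given in this review.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def eca_weights(pooled, kernel):
    """Turn C pooled channel descriptors into C attention weights in (0, 1).

    pooled: list of C global-average-pooled values, one per channel.
    kernel: list of k shared 1D convolution weights (k odd).
    No dimensionality reduction: the output has the same length as the input.
    """
    k = len(kernel)
    pad = k // 2
    # Zero-pad both ends so every channel sees k neighbours.
    padded = [0.0] * pad + list(pooled) + [0.0] * pad
    weights = []
    for c in range(len(pooled)):
        s = sum(kernel[j] * padded[c + j] for j in range(k))
        weights.append(sigmoid(s))
    return weights


def eca_forward(feature_map, kernel):
    """Apply channel attention to a feature map given as a nested list [C][H][W]."""
    # Global average pooling: one scalar per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    weights = eca_weights(pooled, kernel)
    # Rescale every value in a channel by that channel's attention weight.
    return [
        [[value * w for value in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]
```

Because the kernel is shared across channels, the module's parameter count is just `k` per insertion point, regardless of `C`. Under the assumed defaults, `eca_kernel_size` grows slowly with channel width (for example, 3 at 64 channels and 7 at 2048 channels).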
Experimental results
Research questions
- RQ1: Does avoiding dimensionality reduction improve channel attention learning compared to SE blocks?
- RQ2: Can local cross-channel interaction captured by a lightweight 1D convolution achieve competitive gains with minimal parameters?
- RQ3: Is an adaptive kernel size for 1D convolution beneficial across different CNN backbones and tasks?
- RQ4: How does ECA-Net perform on ImageNet classification and COCO object detection/instance segmentation compared to peers?
Key findings
- ECA without dimensionality reduction consistently outperforms variants with reduction and achieves gains with far fewer parameters.
- A 1D convolution with kernel size k (adaptive via channel dimension) effectively models local cross-channel interaction.
- For ResNet-50 with 24.37M parameters, ECA-Net adds only 80 parameters and 4.7e-4 GFLOPs, while improving Top-1 accuracy by 2.28%.
- ECA-Net shows competitive or superior performance to SENet/CBAM/GCNet/A2-Net on ImageNet, with lower complexity.
- On MobileNetV2, ECA-Net yields accuracy gains with minimal parameter and FLOP increase compared to SE blocks.
- ECA also provides improvements for MS COCO detectors (Faster R-CNN, Mask R-CNN, RetinaNet) over baseline ResNet and SE blocks, including better performance on small objects.
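To put the complexity figures above in perspective, the following sketch computes the relative overhead from the numbers quoted in this review and compares a single attention module's parameter count under SE and ECA. The helper names are hypothetical, and the SE formula assumes two bias-free fully connected layers with a reduction ratio of 16, which is a common SE default rather than a value stated here.

```python
def se_params(channels, reduction=16):
    """Parameters of one SE block: C -> C/r -> C, biases ignored (assumption)."""
    return 2 * channels * (channels // reduction)


def eca_params(kernel_size):
    """Parameters of one ECA module: a single shared 1D kernel of size k."""
    return kernel_size


# Figures quoted in the review for a ResNet-50 backbone.
backbone_params = 24.37e6
eca_extra_params = 80
backbone_gflops = 3.86
eca_extra_gflops = 4.7e-4

param_overhead = eca_extra_params / backbone_params   # about 3.3e-6 (0.0003%)
flop_overhead = eca_extra_gflops / backbone_gflops    # about 1.2e-4 (0.012%)

# One module at 256 channels: SE (r=16) versus ECA with an illustrative k=5.
se_count = se_params(256)    # 8192
eca_count = eca_params(5)    # 5
```

The per-module gap comes from SE scaling quadratically with the channel count, while the ECA kernel size stays a small constant.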
This review was created by AI and reviewed by human editors.