QUICK REVIEW

[Paper Review] Selective Kernel Networks

Xiang Li, Wenhai Wang|arXiv (Cornell University)|Mar 15, 2019

Advanced Neural Network Applications|References: 54|Cited by: 95

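One-Sentence Summary

Selective Kernel Networks (SKNets) introduce a dynamic, attention-guided mechanism that adaptively selects convolution kernel sizes in CNNs, achieving better object recognition on ImageNet and CIFAR at a complexity similar to that of prior architectures.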

ABSTRACT

In standard Convolutional Neural Networks (CNNs), the receptive fields of artificial neurons in each layer are designed to share the same size. It is well-known in the neuroscience community that the receptive field size of visual cortical neurons are modulated by the stimulus, which has been rarely considered in constructing CNNs. We propose a dynamic selection mechanism in CNNs that allows each neuron to adaptively adjust its receptive field size based on multiple scales of input information. A building block called Selective Kernel (SK) unit is designed, in which multiple branches with different kernel sizes are fused using softmax attention that is guided by the information in these branches. Different attentions on these branches yield different sizes of the effective receptive fields of neurons in the fusion layer. Multiple SK units are stacked to a deep network termed Selective Kernel Networks (SKNets). On the ImageNet and CIFAR benchmarks, we empirically show that SKNet outperforms the existing state-of-the-art architectures with lower model complexity. Detailed analyses show that the neurons in SKNet can capture target objects with different scales, which verifies the capability of neurons for adaptively adjusting their receptive field sizes according to the input. The code and models are available at https://github.com/implus/SKNet.

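Motivation and Objectives

Improve multi-scale feature extraction in CNNs by letting neurons dynamically adjust their receptive field size in response to the input stimulus.
Propose a lightweight Selective Kernel (SK) convolution that combines multiple convolution kernels through attention-guided fusion.
Show that SKNet achieves higher accuracy on ImageNet and CIFAR at a complexity comparable to or lower than that of previous state-of-the-art models.
Use ablation studies to show how different kernel configurations and attention mechanisms affect performance.
Analyze how the selection mechanism adapts kernel sizes across layers and across categories.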

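Proposed Method

Introduce the Selective Kernel (SK) convolution, built from three operations: Split (generate multiple convolution paths with different kernel sizes), Fuse (aggregate information from all branches into channel-wise statistics), and Select (weight the branches with soft attention).
Replace the conventional large-kernel convolutions in ResNeXt-style backbone blocks with SK convolutions to obtain SK units.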
Use a reduction ratio r to control the bottleneck in the Fuse stage, with a lower bound L on the reduced dimension: d = max(C/r, L).
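Apply grouped, depthwise, and dilated convolutions within the SK branches to control cost while still aggregating multi-scale information.
Stack SK units into SKNet architectures (such as SKNet-50 and SKNet-101) with a configurable number of paths M, number of groups G, and reduction ratio r.
Evaluate on ImageNet, CIFAR-10/100, and lightweight models to demonstrate gains in accuracy and parameter efficiency.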
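The Fuse and Select steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`bottleneck_dim`, `sk_fuse_select`), the default values r=16 and L=32, the toy weights, and the nested-list tensor layout are all chosen here for illustration. The normalization that would normally follow the reduction layer is omitted, and the Split step (the convolutions themselves) is assumed to have already produced the branch outputs.

```python
import math


def bottleneck_dim(C, r=16, L=32):
    """Reduced dimension d = max(C / r, L); L acts as a floor on d."""
    return max(C // r, L)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sk_fuse_select(branches, w_reduce, w_select):
    """Fuse and Select over M branch outputs, each shaped [C][H][W].

    w_reduce: d x C matrix producing the compact feature z.
    w_select: M matrices, each C x d, producing per-branch logits.
    Returns the fused map V ([C][H][W]) and attention weights ([M][C]).
    """
    M, C = len(branches), len(branches[0])
    H, W = len(branches[0][0]), len(branches[0][0][0])

    # Fuse: sum the branches element-wise, then average over space per channel.
    s = []
    for c in range(C):
        total = sum(branches[m][c][i][j]
                    for m in range(M) for i in range(H) for j in range(W))
        s.append(total / (H * W))
    # Compact feature z = ReLU(w_reduce @ s); normalization is omitted here.
    z = [max(0.0, v) for v in matvec(w_reduce, s)]

    # Select: softmax across branches, computed independently per channel.
    logits = [matvec(w_select[m], z) for m in range(M)]  # [M][C]
    per_channel = [softmax([logits[m][c] for m in range(M)]) for c in range(C)]
    attn = [[per_channel[c][m] for c in range(C)] for m in range(M)]

    # Output: attention-weighted sum of the branch feature maps.
    V = [[[sum(attn[m][c] * branches[m][c][i][j] for m in range(M))
           for j in range(W)] for i in range(H)] for c in range(C)]
    return V, attn


# Toy input (hypothetical values): M=2 branches, C=2 channels, 2x2 maps, d=3.
branch_3x3 = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
branch_5x5 = [[[2.0, 2.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]]]
w_reduce = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
w_select = [
    [[0.3, -0.1, 0.2], [0.0, 0.5, -0.3]],
    [[-0.2, 0.4, 0.1], [0.6, -0.5, 0.2]],
]
V, attn = sk_fuse_select([branch_3x3, branch_5x5], w_reduce, w_select)
```

Because the softmax runs across branches for each channel, the weights for a given channel sum to 1, so every output value is a convex combination of the corresponding branch values. The toy example sets d=3 by hand; with the assumed defaults, `bottleneck_dim(64)` is held at the floor L while `bottleneck_dim(1024)` follows C/r.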

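Experimental Results

Research Questions

RQ1: Does adaptively selecting the kernel size within a single layer improve recognition accuracy over fixed multi-branch or single-branch convolutions?
RQ2: How does the SK attention mechanism distribute attention across kernel scales as the target object scale and the network depth change?
RQ3: Compared with ResNeXt/SENet backbones, do SK convolutions improve accuracy at a similar or lower parameter count and FLOPs?
RQ4: What are the best SK hyperparameters (M, G, r) for different architectures and datasets?
RQ5: Can SKNets maintain or improve performance on compact models and on small datasets such as CIFAR?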

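Key Findings

SKNet-50 improves top-1 accuracy over ResNeXt-50 at comparable complexity, showing the benefit of adaptive kernel selection.
On ImageNet, SKNet architectures achieve state-of-the-art performance relative to other attention-based CNNs under similar budgets.
Ablation studies show that multiple kernels combined with SK attention give lower error than simply summing the branches; adding more paths generally helps, but returns diminish beyond M=2 or M=3.
Soft attention across branches makes the receptive field size adaptive and responsive to the input scale, especially in the lower and middle layers.
SK convolutions improve the performance of small models (such as ShuffleNetV2 variants) and are effective for compact architectures.
On CIFAR-10/100, SKNet-29 achieves competitive or even higher accuracy than the ResNeXt/SENet baselines with fewer parameters.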
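Several of these findings are stated at matched parameter budgets, which depends on the grouped and dilated convolutions used inside the SK branches. The arithmetic below is a rough sketch of why a second, larger-receptive-field branch can stay cheap. The helper names and the sizes C=128 and G=32 are illustrative assumptions, not figures from the paper's tables.

```python
def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


def conv_weight_count(k, c_in, c_out, groups=1):
    """Weights in a k x k convolution (bias excluded) with grouped channels."""
    assert c_in % groups == 0 and c_out % groups == 0
    return k * k * (c_in // groups) * c_out


# Illustrative sizes only: C=128 input/output channels, G=32 groups.
C, G = 128, 32

# A 3x3 kernel with dilation 2 spans the same 5x5 window as a dense 5x5 kernel.
span = effective_kernel_size(3, 2)

# Weight counts: dense 5x5 versus a grouped 3x3 (dilation adds no weights).
dense_5x5 = conv_weight_count(5, C, C)
dilated_grouped_3x3 = conv_weight_count(3, C, C, groups=G)
```

Dilation widens the window without adding weights, and grouping divides the per-filter input channels by G, so under these assumed sizes the dilated grouped branch needs far fewer weights than a dense 5x5 convolution covering the same extent.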

This review was generated by AI and reviewed by human editors.