QUICK REVIEW

[Paper Review] Dynamic Channel Pruning: Feature Boosting and Suppression

Xitong Gao, Yiren Zhao | arXiv (Cornell University) | Oct 12, 2018

Speech and Audio Processing | References: 21 | Cited by: 212

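ONE-SENTENCE SUMMARY

This paper proposes Feature Boosting and Suppression (FBS), a dynamic channel pruning method that predicts at run time which convolutional channels are salient, amplifies them, and suppresses the unimportant ones without permanently removing any channel.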

ABSTRACT

Making deep convolutional neural networks more accurate typically comes at the cost of increased computational and memory resources. In this paper, we reduce this cost by exploiting the fact that the importance of features computed by convolutional layers is highly input-dependent, and propose feature boosting and suppression (FBS), a new method to predictively amplify salient convolutional channels and skip unimportant ones at run-time. FBS introduces small auxiliary connections to existing convolutional layers. In contrast to channel pruning methods which permanently remove channels, it preserves the full network structures and accelerates convolution by dynamically skipping unimportant input and output channels. FBS-augmented networks are trained with conventional stochastic gradient descent, making it readily available for many state-of-the-art CNNs. We compare FBS to a range of existing channel pruning and dynamic execution schemes and demonstrate large improvements on ImageNet classification. Experiments show that FBS can respectively provide $5\times$ and $2\times$ savings in compute on VGG-16 and ResNet-18, both with less than $0.6\%$ top-5 accuracy loss.

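MOTIVATION AND OBJECTIVES

Exploit input-dependent feature importance to reduce CNN computation without permanently removing channels.
Propose FBS, which dynamically amplifies important channels and suppresses unimportant ones at run time.
Retain the full network capacity while achieving substantial speedups through input-aware channel gating.
Make the dynamic mechanism trainable end to end with standard SGD.
Demonstrate a better accuracy/compute trade-off than pruning and dynamic-execution baselines on ImageNet (ResNet-18, VGG-16) and CIFAR-10.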

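PROPOSED METHOD

Introduce a small auxiliary predictor that estimates per-channel saliency from the output of the previous layer.
Replace each convolutional layer with a dynamic version that scales its output channels, and suppresses them where appropriate, through a low-overhead policy pi(x).
Use a k-winners-take-all function to select the top dC_l channels to compute, where C_l is the number of channels in layer l and the density d controls the accuracy/compute trade-off.
Predict the channel saliency g_l(x_{l-1}) with a lightweight fully connected layer applied to the subsampled activations of the previous layer.
Train the whole system with standard SGD, adding a sparsity-inducing L1 regularizer on g_l(x_{l-1}).
Use Batch Normalization so that the layer is insensitive to the scale of its convolution kernels, and keep the whole pipeline differentiable for end-to-end training.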
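
The gating steps listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the subsampling is assumed to be the mean absolute value of each channel, the predictor is assumed to end in a ReLU, and the number of winners is assumed to be the ceiling of d times the channel count.

```python
import math

def subsample(x):
    """Reduce each channel (a 2-D list) to one scalar.
    Assumption: the scalar is the channel's mean absolute activation."""
    return [sum(abs(v) for row in ch for v in row) / (len(ch) * len(ch[0]))
            for ch in x]

def predict_saliency(x_prev, weights, bias):
    """Sketch of g_l(x_{l-1}): a fully connected layer on the subsampled
    input, followed by a ReLU (assumed) so saliencies are non-negative."""
    s = subsample(x_prev)
    return [max(sum(w * v for w, v in zip(row, s)) + b, 0.0)
            for row, b in zip(weights, bias)]   # one row per output channel

def k_winners_take_all(g, k):
    """Keep the k largest saliencies and zero out all the others."""
    keep = set(sorted(range(len(g)), key=lambda i: g[i], reverse=True)[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(g)]

def channel_policy(x_prev, weights, bias, density):
    """Return the per-channel scaling pi(x) and the raw saliencies g."""
    g = predict_saliency(x_prev, weights, bias)
    k = math.ceil(density * len(g))             # assumed rounding rule
    return k_winners_take_all(g, k), g

def l1_penalty(g, lam):
    """Sparsity-inducing L1 term added to the training loss."""
    return lam * sum(abs(v) for v in g)

# Toy input: 2 channels of size 2x2, layer with 4 output channels.
x_prev = [[[1.0, -1.0], [1.0, -1.0]],
          [[0.0, 0.0], [0.0, 4.0]]]
weights = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5], [-1.0, 0.0]]
bias = [0.0, 0.0, 0.5, 0.0]

pi, g = channel_policy(x_prev, weights, bias, density=0.5)
active = [i for i, v in enumerate(pi) if v > 0.0]
print(g)       # [1.0, 2.0, 1.5, 0.0]
print(pi)      # [0.0, 2.0, 1.5, 0.0]
print(active)  # [1, 2]
```

Only the channels listed in `active` would need their convolution computed; each computed channel is then multiplied by its entry in `pi`, which is how salient channels get boosted while the rest are skipped.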

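EXPERIMENTAL RESULTS

Research questions

RQ1: Can dynamic channel gating with FBS reduce computation on large datasets without sacrificing accuracy?
RQ2: How does FBS compare with static channel pruning and other dynamic-execution methods in speedup and accuracy on ImageNet and CIFAR-10?
RQ3: How does varying the density parameter d affect accuracy and MACs across architectures?
RQ4: Does the input-dependent saliency predictor generalize across layers and models well enough to boost and suppress channels effectively?
RQ5: How does FBS affect memory and bandwidth at inference time?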

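Key findings

FBS delivers substantial compute savings (about 2x on ResNet-18 and about 5x on VGG-16) with less than 0.6% top-5 accuracy loss on ImageNet.
Under comparable speedup constraints, FBS consistently outperforms channel pruning baselines such as Network Slimming (NS).
By exploiting both input-side and output-side sparsity, FBS reduces memory accesses and peak memory usage, which improves cache efficiency.
Combining FBS with pruning (NS) recovers much of the accuracy lost by pruning alone, giving a more favorable accuracy/compute trade-off.
The method trains with standard SGD, needs no reinforcement learning or custom training loop, and is open source.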
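
To see why skipping both input and output channels pays off, the back-of-the-envelope estimate below counts multiply-accumulates (MACs) for one convolutional layer. It is an idealized sketch under stated assumptions, not a reproduction of the paper's measurements: the layer shape is made up, the same density d is assumed on the input and output side, the predictor is costed as a single fully connected layer, and the cost of subsampling is ignored.

```python
import math

def conv_macs(c_in, c_out, k, h_out, w_out):
    """MACs of a dense k x k convolution producing an h_out x w_out map."""
    return k * k * c_in * c_out * h_out * w_out

def fbs_conv_macs(c_in, c_out, k, h_out, w_out, density):
    """Estimated MACs when only a fraction `density` of the input and
    output channels is active, plus the saliency predictor overhead."""
    active_in = math.ceil(density * c_in)
    active_out = math.ceil(density * c_out)
    predictor = c_in * c_out        # assumed: one fully connected layer
    return conv_macs(active_in, active_out, k, h_out, w_out) + predictor

# Hypothetical layer: 256 -> 256 channels, 3x3 kernel, 14x14 output.
dense = conv_macs(256, 256, 3, 14, 14)
sparse = fbs_conv_macs(256, 256, 3, 14, 14, density=0.5)
speedup = dense / sparse

print(dense)    # 115605504
print(sparse)   # 28966912
print(round(speedup, 2))
```

With d = 0.5 the convolution cost falls by roughly a factor of d squared, so the estimated speedup for this layer is just under 4x once the small predictor overhead is included. Whole-network savings depend on the per-layer shapes and on the density actually used.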


This review was generated by AI and reviewed by a human editor.