QUICK REVIEW

[Paper Review] Channel Pruning for Accelerating Very Deep Neural Networks

Yihui He, Xiangyu Zhang|arXiv (Cornell University)|Jul 19, 2017

Advanced Neural Network Applications | References: 45 | Cited by: 351

One-Sentence Summary

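This paper proposes an inference-time channel pruning method that uses LASSO-based channel selection and least-squares reconstruction to accelerate very deep CNNs, achieving significant speed-ups on the VGG-16, ResNet, and Xception architectures with minimal accuracy loss.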

ABSTRACT

In this paper, we introduce a new channel pruning method to accelerate very deep convolutional neural networks. Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer, by a LASSO regression based channel selection and least square reconstruction. We further generalize this algorithm to multi-layer and multi-branch cases. Our method reduces the accumulated error and enhances the compatibility with various architectures. Our pruned VGG-16 achieves the state-of-the-art results by 5x speed-up along with only 0.3% increase of error. More importantly, our method is able to accelerate modern networks like ResNet, Xception and suffers only 1.4%, 1.0% accuracy loss under 2x speed-up respectively, which is significant. Code has been made publicly available.

Research Motivation and Objectives

Motivate and develop a practical channel pruning method to accelerate very deep CNNs at inference time.
Reduce feature map channels layer-by-layer while controlling reconstruction error.
Ensure compatibility with multi-branch architectures (e.g., ResNet, Xception).
Demonstrate effectiveness on modern networks (VGG-16, ResNet-50, Xception-50) across ImageNet and other datasets.
Show that pruning can be combined with other efficiency techniques for greater speed-ups.

Proposed Method

Formulate channel pruning as minimizing reconstruction error of the output feature maps after pruning input channels (Eq. 1).
Relax the NP-hard l0 sparsity constraint to an l1 penalty, then alternate between selecting channels via LASSO regression (Eq. 3) and re-fitting the weights of the remaining channels via least squares so that the outputs are reconstructed (Eq. 4).
Apply the layer-wise pruning sequentially, fitting each pruned layer to the output feature maps of the original model so that accumulated error is accounted for (Eq. 5).
Extend the approach to multi-branch networks: sample input channels at the first layer of a residual branch, whose input is shared with the shortcut path, and refit the last layer of the branch so that the block output Y1+Y2 is recovered (Sec. 3.3).
Introduce variants such as multi-branch enhancement to better handle residual connections (Sec. 3.3, "Last layer" and "First layer" paragraphs).
Fine-tune pruned models briefly to recover accuracy (10-20 epochs as reported) and compare against training-from-scratch baselines.
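The two-step procedure listed above can be sketched for a single layer in NumPy. This is a minimal illustration under stated assumptions, not the authors' released code: the function names (`soft_threshold`, `lasso_channel_weights`, `prune_layer`), the coordinate-descent LASSO solver, the penalty growth schedule, and the fallback when the penalty overshoots are all choices made for this example. It runs channel selection with an increasing l1 penalty until at most `n_keep` channels remain, then refits the kept filters once by least squares.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Proximal operator of the l1 penalty (shrinks value toward zero)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def lasso_channel_weights(Z, y, lam, n_sweeps=200):
    """Coordinate descent for min_beta 1/(2M) ||y - Z beta||^2 + lam ||beta||_1.

    Z has shape (M, c); column i is the flattened output contribution of channel i.
    """
    M, c = Z.shape
    beta = np.zeros(c)
    col_sq = (Z ** 2).sum(axis=0)
    residual = y.astype(float)  # equals y - Z @ beta while beta is all zeros
    for _ in range(n_sweeps):
        for i in range(c):
            if col_sq[i] == 0.0:
                continue  # channel contributes nothing; leave beta[i] at zero
            residual = residual + Z[:, i] * beta[i]  # take channel i out of the fit
            rho = Z[:, i] @ residual
            beta[i] = soft_threshold(rho, lam * M) / col_sq[i]
            residual = residual - Z[:, i] * beta[i]  # put it back with the new weight
    return beta


def prune_layer(X, W, Y, n_keep, lam=1e-4, lam_growth=1.5, max_rounds=100):
    """X: (N, c, k) sampled input patches, W: (n, c, k) filters, Y: (N, n) outputs.

    Returns the kept channel indices and the refitted filters.
    """
    N, c, k = X.shape
    n = W.shape[0]
    # Column i of Z is vec(X_i W_i^T), the part of the output owed to channel i.
    Z = np.stack([(X[:, i, :] @ W[:, i, :].T).ravel() for i in range(c)], axis=1)
    y = Y.ravel()
    beta = np.ones(c)
    # Step (i): raise the penalty until the channel budget is met (assumed schedule).
    for _ in range(max_rounds):
        candidate = lasso_channel_weights(Z, y, lam)
        if np.count_nonzero(candidate) == 0:
            break  # penalty overshot; fall back to the previous solution
        beta = candidate
        if np.count_nonzero(beta) <= n_keep:
            break
        lam *= lam_growth
    order = np.argsort(-np.abs(beta))[:n_keep]  # enforce the budget if still over
    keep = np.sort(order[beta[order] != 0])
    # Step (ii): least-squares refit of the filters on the kept channels only.
    X_keep = X[:, keep, :].reshape(N, -1)
    W_flat, *_ = np.linalg.lstsq(X_keep, Y, rcond=None)
    return keep, W_flat.T.reshape(n, len(keep), k)


# Toy usage: 6 input channels, 1x1 filters, only channels 0, 2 and 4 carry weight.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6, 1))
W = np.zeros((3, 6, 1))
W[:, 0, 0] = [1.0, -2.0, 0.5]
W[:, 2, 0] = [0.5, 1.5, -1.0]
W[:, 4, 0] = [-1.0, 0.25, 2.0]
Y = np.einsum("nck,ock->no", X, W)  # output of the unpruned layer
keep, W_new = prune_layer(X, W, Y, n_keep=3)
Y_hat = X[:, keep, :].reshape(len(X), -1) @ W_new.reshape(len(W), -1).T
```

In the toy data the unused channels have all-zero filters, so selection is trivial. With real feature maps the channels are correlated, and the refit in step (ii) is what lets the kept channels absorb the contribution of the removed ones.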

Experimental Results

Research Questions

RQ1: Can channels be pruned at inference time without retraining from scratch while preserving accuracy?
RQ2: How can inter-channel redundancy be exploited to select representative channels and accurately reconstruct outputs?
RQ3: How does the method perform across single-branch networks (VGG-16) and multi-branch architectures (ResNet, Xception)?

Key Findings

Achieves a 5x speed-up on VGG-16 with only a 0.3% increase in top-5 error when combined with tensor factorization; speed-ups of 4x and above incur only a small accuracy loss.
For ResNet-50 and Xception-50, the method achieves around 2x speed-up with 1.4% and 1.0% accuracy loss, respectively, after fine-tuning.
In the single-layer pruning experiments, the proposed channel selection consistently outperforms the naive baselines (first k channels, max response) in reconstruction error.
Sequential, layer-wise pruning that accounts for accumulated error yields competitive absolute GPU speed-ups without specialized libraries (Table 3).
Multi-branch enhancement improves pruning effectiveness in residual blocks by better handling shortcut connections (a 4.0% improvement in top-5 accuracy on ResNet-50 with the enhancement).
Combining channel pruning with spatial and channel factorization (3C) yields the largest reported speed-ups (e.g., 4x or 5x on VGG-16) with relatively small accuracy losses.
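To make the speed-up ratios above concrete, the sketch below counts multiply-accumulate operations (MACs) for two consecutive convolutions before and after the feature map between them is pruned. It assumes dense convolutions, equal spatial size for both layers, and that a speed-up ratio means a ratio of MAC counts; the helper names `conv_macs` and `pair_speedup` are hypothetical, and measured wall-clock gains may differ. Removing a channel from the intermediate feature map removes one filter from the layer that produces it and one input slice from every filter in the layer that consumes it, so both layers shrink.

```python
def conv_macs(h_out, w_out, c_in, c_out, k_h=3, k_w=3):
    """Multiply-accumulate count of one dense convolution layer."""
    return h_out * w_out * c_in * c_out * k_h * k_w


def pair_speedup(h, w, c_prev, c_mid, c_next, c_mid_kept):
    """Theoretical speed-up of two stacked convolutions when the feature map
    between them is pruned from c_mid to c_mid_kept channels."""
    before = conv_macs(h, w, c_prev, c_mid) + conv_macs(h, w, c_mid, c_next)
    after = conv_macs(h, w, c_prev, c_mid_kept) + conv_macs(h, w, c_mid_kept, c_next)
    return before / after


# Halving a 256-channel intermediate feature map halves the cost of both layers.
ratio = pair_speedup(56, 56, 256, 256, 256, 128)
```

Under these assumptions the ratio equals c_mid / c_mid_kept, so keeping half of the channels corresponds to a theoretical 2x speed-up for the pair of layers.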

This review was generated by AI and reviewed by a human editor.