QUICK REVIEW

[Paper Review] Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures

Hengyuan Hu, Rui Tao Peng|arXiv (Cornell University)|Jul 12, 2016

Advanced Neural Network Applications|References: 15|Cited by: 741

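One-Sentence Summary

This paper proposes Network Trimming, an iterative method that prunes neurons with a high APoZ (Average Percentage of Zeros in their activations) to produce smaller, more efficient networks, then retrains each trimmed network from the pre-pruning weights so that accuracy is maintained or improved.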

ABSTRACT

State-of-the-art neural networks are getting deeper and wider. While their performance increases with the increasing number of layers and neurons, it is crucial to design an efficient deep architecture in order to reduce computational and memory costs. Designing an efficient neural network, however, is labor intensive requiring many experiments, and fine-tunings. In this paper, we introduce network trimming which iteratively optimizes the network by pruning unimportant neurons based on analysis of their outputs on a large dataset. Our algorithm is inspired by an observation that the outputs of a significant portion of neurons in a large network are mostly zero, regardless of what inputs the network received. These zero activation neurons are redundant, and can be removed without affecting the overall accuracy of the network. After pruning the zero activation neurons, we retrain the network using the weights before pruning as initialization. We alternate the pruning and retraining to further reduce zero activations in a network. Our experiments on the LeNet and VGG-16 show that we can achieve high compression ratio of parameters without losing or even achieving higher accuracy than the original network.

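Motivation and Objectives

Motivate the design of efficient deep architectures as networks grow ever deeper and wider.
Identify redundant neurons by analyzing activation sparsity on a large validation set.
Develop an iterative prune-and-retrain loop that reduces parameters while preserving performance.
Offer practical guidance on choosing which layers to trim and how to set the pruning threshold.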

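Proposed Method

Measure each neuron's Average Percentage of Zeros (APoZ) on a large validation set.
Prune neurons whose APoZ exceeds a threshold, set at roughly one standard deviation above the mean APoZ of the target layer.
Initialize the trimmed network with the weights of its ancestor model and retrain (or fine-tune) it to recover performance.
Alternate pruning and retraining across layers to reduce redundancy step by step.
Evaluate the trade-off between compression and accuracy on established baseline networks (LeNet and VGG-16).
Compare against weight-pruning methods, emphasizing that neuron-level pruning is better suited to efficient GPU execution.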
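The measuring and pruning steps above can be sketched in a few lines. APoZ for a neuron is the fraction of its post-ReLU outputs that are exactly zero, averaged over all validation examples and, for a convolutional channel, over all spatial positions. The sketch below is a minimal illustration and not the authors' code: it uses NumPy rather than a deep learning framework, and the function names (`apoz`, `select_neurons_to_keep`, `trim_dense_layer`), the toy layer sizes, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def apoz(activations: np.ndarray) -> np.ndarray:
    """Average Percentage of Zeros, one value per neuron or channel.

    activations: post-ReLU outputs shaped (examples, neurons) for a dense
    layer, or (examples, channels, height, width) for a conv layer.
    """
    if activations.ndim == 2:
        reduce_axes = (0,)
    else:
        # Average over examples and every spatial position of the channel.
        reduce_axes = (0,) + tuple(range(2, activations.ndim))
    return (activations == 0).mean(axis=reduce_axes)


def select_neurons_to_keep(apoz_values: np.ndarray, num_std: float = 1.0) -> np.ndarray:
    """Indices of neurons whose APoZ is at most mean + num_std * std."""
    threshold = apoz_values.mean() + num_std * apoz_values.std()
    return np.flatnonzero(apoz_values <= threshold)


def trim_dense_layer(w_in, b_in, w_out, keep):
    """Drop pruned neurons from a dense layer and from the next layer's inputs.

    w_in: (in_features, hidden), b_in: (hidden,), w_out: (hidden, out_features).
    The surviving weights are copied so they can initialize retraining.
    """
    return w_in[:, keep].copy(), b_in[keep].copy(), w_out[keep, :].copy()


# Toy demonstration on synthetic data (hypothetical sizes, not from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8))
w1 = rng.normal(size=(8, 16))
b1 = rng.normal(size=16)
b1[:3] = -100.0  # force three hidden neurons to be inactive on every input
w2 = rng.normal(size=(16, 4))

hidden = np.maximum(x @ w1 + b1, 0.0)  # ReLU activations of the hidden layer
scores = apoz(hidden)
keep = select_neurons_to_keep(scores)
w1_t, b1_t, w2_t = trim_dense_layer(w1, b1, w2, keep)
print("kept", keep.size, "of", scores.size, "hidden neurons")
```

In a full pipeline the trimmed weights would then be loaded into a smaller network and retrained, and the measure, prune, retrain cycle would be repeated. Removing whole neurons keeps the weight matrices dense, which is why the result runs efficiently on a GPU without sparse kernels.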

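Experimental Results

Research Questions

RQ1: Can pruning high-APoZ neurons reduce model size without harming accuracy?
RQ2: Is the iterative prune-and-retrain loop effective for large architectures such as VGG-16?
RQ3: How does weight initialization affect retraining after pruning?
RQ4: Which layers yield the greatest benefit when pruned in networks such as LeNet and VGG-16?
RQ5: How does APoZ-based pruning compare with connection-pruning methods in computational and memory efficiency?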

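Key Findings

Networks contain significant redundancy: many neurons have a high APoZ (for example, across the layers of VGG-16).
Iteratively pruning high-APoZ neurons achieves 2–3x parameter compression with no loss of accuracy after retraining.
Initializing from the ancestor model's weights is critical to recovering performance after pruning.
In VGG-16, pruning CONV5-3 and FC6 yields roughly 2.59x compression, and Top-1/Top-5 accuracy improves by 2–3% after retraining.
Pruning multiple layers reduces the parameter count effectively but requires retraining; pruning the last convolutional layer and the fully connected layer together achieves a substantial parameter reduction while maintaining or improving accuracy.
The pruned VGG-16 can outperform the original model with fewer parameters and less overfitting.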
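A rough arithmetic sketch helps show why trimming CONV5-3 and FC6 pays off. In the standard VGG-16 layout (1000-class ImageNet classifier), FC6 connects 7×7×512 inputs to 4096 outputs and so holds roughly three quarters of the model's parameters. The helper names below are illustrative, and the kept-neuron counts in the usage lines are hypothetical values chosen for this example, not the numbers reported in the paper.

```python
# (name, in_channels, out_channels) for the thirteen 3x3 conv layers of VGG-16.
VGG16_CONV = [
    ("conv1_1", 3, 64), ("conv1_2", 64, 64),
    ("conv2_1", 64, 128), ("conv2_2", 128, 128),
    ("conv3_1", 128, 256), ("conv3_2", 256, 256), ("conv3_3", 256, 256),
    ("conv4_1", 256, 512), ("conv4_2", 512, 512), ("conv4_3", 512, 512),
    ("conv5_1", 512, 512), ("conv5_2", 512, 512), ("conv5_3", 512, 512),
]


def vgg16_param_count(conv5_3_out: int = 512, fc6_out: int = 4096) -> int:
    """Weights plus biases of VGG-16 after keeping the given neuron counts."""
    total = 0
    for name, c_in, c_out in VGG16_CONV:
        if name == "conv5_3":
            c_out = conv5_3_out  # trimming CONV5-3 removes output channels
        total += 3 * 3 * c_in * c_out + c_out
    # FC6 sees a 7x7 feature map per CONV5-3 channel, so it shrinks on both
    # its input side (fewer channels) and its output side (fewer neurons).
    total += 7 * 7 * conv5_3_out * fc6_out + fc6_out
    total += fc6_out * 4096 + 4096  # FC7 loses the inputs fed by pruned FC6 neurons
    total += 4096 * 1000 + 1000     # FC8 is unchanged
    return total


def compression_ratio(conv5_3_out: int, fc6_out: int) -> float:
    """Original parameter count divided by the trimmed parameter count."""
    return vgg16_param_count() / vgg16_param_count(conv5_3_out, fc6_out)


# Hypothetical kept counts, for illustration only.
print(vgg16_param_count())
print(round(compression_ratio(conv5_3_out=400, fc6_out=1500), 2))
```

Because the FC6 weight matrix scales with the product of both kept counts, modest trimming of the two layers compounds into a large overall reduction.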

This review was generated by AI and reviewed by a human editor.