QUICK REVIEW

[Paper Review] Exploring the Regularity of Sparse Structure in Convolutional Neural Networks

Huizi Mao, Song Han|arXiv (Cornell University)|May 24, 2017

Advanced Neural Network Applications|References: 21|Cited by: 230

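One-Sentence Summary

This paper systematically studies how pruning granularity affects accuracy, storage, and hardware efficiency. Its results show that coarse-grained sparsity reaches comparable accuracy at the same sparsity level while substantially reducing memory references, which helps hardware acceleration.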

ABSTRACT

Sparsity helps reduce the computational complexity of deep neural networks by skipping zeros. Taking advantage of sparsity is listed as a high priority in next generation DNN accelerators such as TPU. The structure of sparsity, i.e., the granularity of pruning, affects the efficiency of hardware accelerator design as well as the prediction accuracy. Coarse-grained pruning creates regular sparsity patterns, making it more amenable for hardware acceleration but more challenging to maintain the same accuracy. In this paper we quantitatively measure the trade-off between sparsity regularity and prediction accuracy, providing insights in how to maintain accuracy while having a more structured sparsity pattern. Our experimental results show that coarse-grained pruning can achieve a sparsity ratio similar to unstructured pruning without loss of accuracy. Moreover, due to the index saving effect, coarse-grained pruning is able to obtain a better compression ratio than fine-grained sparsity at the same accuracy threshold. Based on the recent sparse convolutional neural network accelerator (SCNN), our experiments further demonstrate that coarse-grained sparsity saves about 2x the memory references compared to fine-grained sparsity. Since memory reference is more than two orders of magnitude more expensive than arithmetic operations, the regularity of sparse structure leads to more efficient hardware design.

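Research Motivation and Goals

Evaluate how pruning granularity (0-D to 3-D) affects CNN accuracy at a fixed sparsity.
Evaluate how different sparse structures affect storage and memory access.
Quantify the hardware efficiency gains of coarse-grained sparsity using a CNN accelerator model.
Provide guidance on choosing a pruning granularity that balances accuracy against hardware cost.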

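Proposed Method

Define four pruning granularities: 0-D (individual weights), 1-D (sub-kernel vectors), 2-D (kernels), and 3-D (filters).
Apply iterative magnitude-based pruning to remove the grains with the smallest L1 saliency across layers.
Train and evaluate AlexNet on ImageNet, and compare against VGG-16, GoogLeNet, ResNet-50, and DenseNet-121 under the same sparsity and training schedule.
Use 8-bit weight storage and 4-bit indices to study the storage impact and compatibility with quantization.
Use an SCNN-like accelerator model to analyze the hardware impact both qualitatively and quantitatively.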
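
The granularity definitions and the L1-saliency criterion above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `prune_by_granularity`, the (filters, channels, height, width) layout, and the choice of a kernel row as the 1-D sub-kernel vector are assumptions made for illustration. It performs a single pruning step within one layer and omits the retraining between iterations that iterative pruning relies on.

```python
import numpy as np

# Axes of a (filters, channels, height, width) weight tensor that form one grain.
# Treating a kernel row as the 1-D sub-kernel vector is an assumption of this sketch.
GRAIN_AXES = {
    "0-D": (),         # individual weights
    "1-D": (3,),       # one row of a 2-D kernel
    "2-D": (2, 3),     # one 2-D kernel
    "3-D": (1, 2, 3),  # one 3-D filter
}


def prune_by_granularity(weights, granularity, sparsity):
    """Zero out the `sparsity` fraction of grains with the smallest L1 saliency."""
    axes = GRAIN_AXES[granularity]
    saliency = np.abs(weights)
    if axes:
        # L1 saliency of a grain: sum of absolute weights inside it.
        saliency = saliency.sum(axis=axes, keepdims=True)
    n_prune = int(round(sparsity * saliency.size))
    # Flat indices of grains, sorted from least to most salient.
    order = np.argsort(saliency, axis=None, kind="stable")
    keep = np.ones(saliency.size, dtype=bool)
    keep[order[:n_prune]] = False
    # Expand the per-grain decision to every weight in that grain.
    mask = np.broadcast_to(keep.reshape(saliency.shape), weights.shape).copy()
    return weights * mask, mask


# Usage on a hypothetical layer with 4 filters, 3 channels, and 3x3 kernels.
layer = np.random.default_rng(0).normal(size=(4, 3, 3, 3))
for name in GRAIN_AXES:
    pruned, mask = prune_by_granularity(layer, name, sparsity=0.5)
    print(name, "kept weights:", int(mask.sum()))
```

Every granularity removes the same number of weights at a given sparsity. What changes is how the zeros are grouped, which is the regularity the paper studies.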

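Experimental Results

Research Questions

RQ1: What is the trade-off between sparsity regularity (granularity) and prediction accuracy in CNN pruning?
RQ2: At the same accuracy level, can coarse-grained pruning achieve compression similar to or better than fine-grained pruning?
RQ3: How does sparsity granularity affect actual storage requirements and memory references?
RQ4: How do different sparsity granularities affect hardware implementation and accelerator design?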

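Key Findings

At low sparsity, coarse-grained pruning matches or slightly exceeds the accuracy of fine-grained pruning at the same sparsity.
Pruning at the largest granularity (filters) causes a significant accuracy loss, whereas smaller granularities (kernels, vectors) keep accuracy close to that of fine-grained pruning.
Coarse-grained sparsity achieves higher compression through index sharing, giving equal or better storage efficiency at the same accuracy.
At the same density, coarse-grained pruning reduces memory references by about 30–35%, which helps energy efficiency.
The index savings from coarse-grained sparsity lead to better hardware efficiency and simpler accelerator design.
Across models (AlexNet, VGG-16, GoogLeNet, ResNet-50, DenseNet-121), coarse-grained pruning significantly reduces storage and memory references while maintaining high Top-5 accuracy on ImageNet.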
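
The index-sharing effect behind the storage finding can be illustrated with a rough estimate. The sketch below assumes the 8-bit weights and 4-bit indices mentioned in the method, and further assumes that each kept grain stores exactly one index shared by all of its weights. The function `sparse_storage_bits` and the layer size are hypothetical. The model ignores any padding needed when the gap between grains exceeds the index range, and it compares granularities at equal density rather than equal accuracy.

```python
def sparse_storage_bits(num_weights, density, grain_size,
                        weight_bits=8, index_bits=4):
    """Estimate storage when each kept grain stores its weights plus one shared index."""
    total_grains = num_weights // grain_size
    kept_grains = round(total_grains * density)
    kept_weights = kept_grains * grain_size
    return kept_weights * weight_bits + kept_grains * index_bits


# Hypothetical layer: 1000 kernels of 3x3 weights, half of the weights kept.
fine = sparse_storage_bits(9000, density=0.5, grain_size=1)
kernel = sparse_storage_bits(9000, density=0.5, grain_size=9)
print(fine, kernel)
```

Under these assumptions the index overhead per kept weight falls from `index_bits` to `index_bits / grain_size`, so the coarse-grained layout needs fewer bits for the same number of kept weights.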


This review was generated by AI and checked by human editors.