QUICK REVIEW

[Paper Review] Training CNNs with Low-Rank Filters for Efficient Image Classification

Yani Ioannou, Duncan Robertson|arXiv (Cornell University)|Nov 20, 2015

Advanced Neural Network Applications|References: 17|Cited by: 40

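One-Sentence Summary

This paper proposes training CNNs from scratch with low-rank, composite convolutional filters: horizontal (1×k) and vertical (k×1) filters serve as a basis that the network learns to combine, which yields substantial savings in computation and parameters. With a new weight initialization scheme designed for groups of differently shaped filters, experiments on CIFAR-10, ILSVRC, and MIT Places show up to 55% fewer model parameters and 46% less compute than standard CNNs at comparable or higher accuracy.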

ABSTRACT

We propose a new method for creating computationally efficient convolutional neural networks (CNNs) by using low-rank representations of convolutional filters. Rather than approximating filters in previously-trained networks with more efficient versions, we learn a set of small basis filters from scratch; during training, the network learns to combine these basis filters into more complex filters that are discriminative for image classification. To train such networks, a novel weight initialization scheme is used. This allows effective initialization of connection weights in convolutional layers composed of groups of differently-shaped filters. We validate our approach by applying it to several existing CNN architectures and training these networks from scratch using the CIFAR, ILSVRC and MIT Places datasets. Our results show similar or higher accuracy than conventional CNNs with much less compute. Applying our method to an improved version of VGG-11 network using global max-pooling, we achieve comparable validation accuracy using 41% less compute and only 24% of the original VGG-11 model parameters; another variant of our method gives a 1 percentage point increase in accuracy over our improved VGG-11 model, giving a top-5 center-crop validation accuracy of 89.7% while reducing computation by 16% relative to the original VGG-11 model. Applying our method to the GoogLeNet architecture for ILSVRC, we achieved comparable accuracy with 26% less compute and 41% fewer model parameters. Applying our method to a near state-of-the-art network for CIFAR, we achieved comparable accuracy with 46% less compute and 55% fewer parameters.

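Research Motivation and Objectives

Address the growing computational cost and model size of state-of-the-art CNNs, which hinder deployment on low-power devices.
Reduce the computational complexity of convolutional layers without sacrificing classification accuracy.
Investigate whether learning low-rank filters from scratch can outperform approximating a pretrained model, with further gains in efficiency and generalization.
Develop a new weight initialization method for composite convolutional layers that contain filters of different shapes (e.g., 1×k, k×1, k×k).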

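Proposed Method

Represent convolutional filters as linear combinations of small low-rank basis filters (e.g., 1×k and k×1) rather than as full k×k kernels.
Train the network from scratch with a new weight initialization scheme that accounts for the structural differences between groups of differently shaped filters.
Use a basis of rectangular and square filters to obtain an efficient, learnable representation of complex spatial patterns.
Apply the method to existing architectures (VGG-11, GoogLeNet, Network-in-Network) by replacing standard filters in key layers with low-rank equivalents.
Constrain filter complexity through the basis decomposition, targeting both inference efficiency and generalization.
Further reduce model size and compute through global max-pooling and architectural modifications.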
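
To make the savings from 1×k and k×1 basis filters concrete, the following minimal sketch counts weights and multiply-accumulates for a single stride-1 layer. The layer sizes (64 input channels, 128 filters, a 32×32 feature map, k = 3) and the helper `conv_cost` are illustrative assumptions, not configurations from the paper; biases and the cost of the subsequent layer that combines the basis responses are ignored.

```python
def conv_cost(c_in, groups, height, width):
    """Count weights and multiply-accumulates (MACs) for one conv layer.

    groups is a list of (filter_h, filter_w, n_filters) tuples, so a layer
    may mix several filter shapes. Assumes stride 1 and 'same' padding,
    i.e. every filter is evaluated at all height * width positions.
    Biases are ignored.
    """
    params = sum(fh * fw * c_in * n for fh, fw, n in groups)
    macs = params * height * width
    return params, macs


# Hypothetical layer sizes chosen only for illustration.
C_IN, C_OUT, H, W, K = 64, 128, 32, 32, 3

# Conventional layer: C_OUT full k x k filters.
full_params, full_macs = conv_cost(C_IN, [(K, K, C_OUT)], H, W)

# Composite layer: half the filters are 1 x k, half are k x 1.
lr_params, lr_macs = conv_cost(
    C_IN, [(1, K, C_OUT // 2), (K, 1, C_OUT // 2)], H, W
)

print(f"full k x k : {full_params} weights, {full_macs} MACs")
print(f"low-rank   : {lr_params} weights, {lr_macs} MACs")
print(f"ratio      : {lr_params / full_params:.3f}")
```

In this toy setting the composite layer needs 1/k of the weights and MACs of the full layer. Whole-network figures such as those in the abstract depend on which layers are replaced and on the cost of combining the basis responses, so they need not match this single-layer ratio.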

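Experimental Results

Research Questions

RQ1: Can CNNs trained from scratch with low-rank, composite filters match or exceed the accuracy of standard CNNs at substantially lower computational cost?
RQ2: Does learning basis filters from scratch give better generalization and efficiency than approximating a pretrained model?
RQ3: How effective is the proposed weight initialization scheme for training networks whose layers contain heterogeneous filter shapes?
RQ4: To what extent can low-rank filter decompositions (e.g., 1×k and k×1) capture the discriminative patterns of full k×k filters in image classification?
RQ5: Does the method deliver consistent efficiency gains across diverse datasets (CIFAR-10, ILSVRC, MIT Places) and architectures (VGG, GoogLeNet, NiN)?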
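
The question of how much a 1×k and k×1 basis can express can be illustrated with a single-channel toy example. Because convolution is linear, a weighted sum of the responses of one horizontal and one vertical filter equals the response of a single cross-shaped k×k filter. The sketch below checks this with plain Python lists; the filters `h` and `v`, the weights `a` and `b`, and the helper names are hypothetical and chosen only for illustration.

```python
def correlate2d_valid(image, kernel):
    """'Valid' 2D cross-correlation on nested lists (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def cross_kernel(h, v, a, b):
    """Embed a*h (as the middle row) and b*v (as the middle column)
    into one k x k kernel; k is assumed to be odd."""
    k = len(h)
    mid = k // 2
    kernel = [[0] * k for _ in range(k)]
    for j in range(k):
        kernel[mid][j] += a * h[j]
    for i in range(k):
        kernel[i][mid] += b * v[i]
    return kernel


# Hypothetical basis filters and combination weights (integers, so the
# comparison below is exact).
h = [1, 0, -1]
v = [1, 2, 1]
a, b = 2, -3
k, mid = len(h), len(h) // 2

image = [[(3 * r + 5 * c + r * c) % 7 for c in range(6)] for r in range(5)]

horiz = correlate2d_valid(image, [h])              # 1 x k filter
vert = correlate2d_valid(image, [[x] for x in v])  # k x 1 filter

# Align both responses on the region where a k x k filter is valid.
out_h, out_w = len(image) - k + 1, len(image[0]) - k + 1
combined = [
    [a * horiz[r + mid][c] + b * vert[r][c + mid] for c in range(out_w)]
    for r in range(out_h)
]

full = correlate2d_valid(image, cross_kernel(h, v, a, b))
print(combined == full)
```

The equivalent kernel is nonzero only in its middle row and middle column, so it has rank at most 2. One such pair therefore cannot reproduce an arbitrary k×k filter; richer patterns have to come from combining many basis filters across channels and successive layers.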

Key Findings

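Applied to an improved VGG-11 with global max-pooling, the method achieves comparable validation accuracy with 41% less compute and only 24% of the original VGG-11 model parameters (a 76% reduction).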
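Another variant improves accuracy by 1 percentage point over the improved VGG-11, reaching 89.7% top-5 center-crop validation accuracy while using 16% less compute than the original VGG-11.
On GoogLeNet, the low-rank version matches the original model's ILSVRC accuracy (88.0% top-5) with 26% less compute and 41% fewer parameters.
For a near state-of-the-art CIFAR-10 model (NiN), the low-rank variant reaches 91.8% accuracy with 46% less compute and 55% fewer parameters.
The method is more efficient than prior approaches: no other network within an order of magnitude of its compute achieves comparable accuracy.
The proposed weight initialization scheme is critical to successful training; it enables stable convergence in networks with groups of mixed-shape filters.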
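
This review does not reproduce the initialization formula. As a rough sketch of why mixed filter shapes need special handling, the code below assumes a He-style rule in which the single k·k·d fan-out term of He et al.'s backward-pass initialization is replaced by the fan-out summed over all filter groups in the layer. That form, and the helper name `composite_he_std`, are assumptions made for illustration and should not be read as the paper's exact scheme.

```python
import math


def composite_he_std(groups):
    """He-style standard deviation for a layer with mixed filter shapes.

    groups is a list of (filter_h, filter_w, n_filters) tuples.
    ASSUMPTION: the fan-out summed over all groups replaces the single
    k * k * n_filters term of He et al.'s backward-pass rule.
    """
    fan_out = sum(fh * fw * n for fh, fw, n in groups)
    return math.sqrt(2.0 / fan_out)


# With one group of square filters the rule reduces to the usual He value.
square_only = composite_he_std([(3, 3, 128)])

# Hypothetical composite layer: 64 filters of 1 x 3 and 64 of 3 x 1.
mixed = composite_he_std([(1, 3, 64), (3, 1, 64)])

print(f"3x3 only    : std = {square_only:.4f}")
print(f"1x3 and 3x1 : std = {mixed:.4f}")
```

The composite layer has a smaller total fan-out than the square layer with the same number of filters, so under this assumption its weights are drawn with a larger standard deviation. Reusing the square-filter value would leave them under-scaled.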

This review was generated by AI and reviewed by a human editor.