QUICK REVIEW

[Paper Review] Visual Transformer Pruning

Mingjian Zhu, Kai Han | arXiv (Cornell University) | Apr 17, 2021

Topic: Image Enhancement Techniques | References: 27 | Cited by: 23

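Summary

This paper proposes a channel pruning method for visual Transformers that uses sparsity regularization to identify and remove unimportant channels, reaching high compression ratios with minimal accuracy loss. The method (train with sparsity regularization, prune low-impact channels, then fine-tune) significantly reduces parameter count and FLOPs on ImageNet while maintaining strong accuracy.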

ABSTRACT

Visual transformers have achieved competitive performance on a variety of computer vision applications. However, their storage, run-time memory, and computational demands are hindering deployment on mobile devices. Here we present a visual transformer pruning approach, which identifies the impacts of channels in each layer and then executes pruning accordingly. By encouraging channel-wise sparsity in the Transformer, important channels automatically emerge. A great number of channels with small coefficients can be discarded to achieve a high pruning ratio without significantly compromising accuracy. The pipeline for visual transformer pruning is as follows: 1) training with sparsity regularization; 2) pruning channels; 3) finetuning. The reduced parameters and FLOPs ratios of the proposed algorithm are well evaluated and analyzed on the ImageNet dataset to demonstrate its effectiveness.

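Motivation and Objectives

Address the compute and memory demands that make visual Transformers hard to deploy on mobile devices.
Identify and remove redundant channels in visual Transformers without compromising performance.
Develop a pruning pipeline that achieves high compression ratios while preserving accuracy.
Evaluate the effectiveness of the method on the ImageNet dataset.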

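Proposed Method

Apply sparsity regularization during training to encourage channel-wise sparsity in the visual Transformer.
After training, score each channel by importance, then identify and remove the channels with small coefficients.
Perform structured pruning by removing entire channels rather than individual weights.
Fine-tune the pruned model to recover the accuracy lost during pruning.
Follow a three-stage pipeline: (1) sparse training, (2) channel pruning, (3) fine-tuning.
Evaluate the final model on ImageNet in terms of parameter reduction and FLOPs savings.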
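
The first two stages above can be illustrated with a minimal sketch in plain Python. It assumes an L1 penalty on one learned coefficient per channel and a fixed prune ratio; this summary does not state the exact regularizer or selection rule, and the names `l1_sparsity_penalty`, `select_channels`, `prune_linear`, `lam`, and `prune_ratio` are hypothetical, as are the toy values.

```python
from typing import List, Sequence


def l1_sparsity_penalty(coeffs: Sequence[float], lam: float) -> float:
    """Penalty added to the training loss; pushes channel coefficients toward zero.

    Assumption: an L1 penalty with weight `lam` (not specified in this summary).
    """
    return lam * sum(abs(a) for a in coeffs)


def select_channels(coeffs: Sequence[float], prune_ratio: float) -> List[int]:
    """Return sorted indices of the channels kept after dropping the
    `prune_ratio` fraction with the smallest absolute coefficients."""
    n_prune = int(len(coeffs) * prune_ratio)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    return sorted(order[n_prune:])


def prune_linear(weight: List[List[float]], keep: Sequence[int]) -> List[List[float]]:
    """Structured pruning: keep whole input channels (columns), not single weights."""
    return [[row[j] for j in keep] for row in weight]


# Toy example: six channels, half of them pruned (illustrative values only).
coeffs = [0.9, 0.01, -0.7, 0.002, 0.5, -0.03]
penalty = l1_sparsity_penalty(coeffs, lam=0.1)
keep = select_channels(coeffs, prune_ratio=0.5)
weight = [[float(r * 10 + c) for c in range(6)] for r in range(2)]
pruned = prune_linear(weight, keep)
print(keep, pruned)
```

In a real model the same `keep` indices would also have to be applied to whichever layer produces those channels, and the pruned network would then be fine-tuned (stage 3), which this sketch leaves out.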

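Experimental Results

Research Questions

RQ1: Can sparsity regularization effectively identify unimportant channels in visual Transformers that are candidates for pruning?
RQ2: What is the maximum pruning ratio achievable while maintaining competitive accuracy on ImageNet?
RQ3: Compared with end-to-end training, how does the three-stage pruning pipeline (train, prune, fine-tune) perform in terms of efficiency and accuracy?
RQ4: How far can parameter count and FLOPs be reduced without significant performance degradation?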

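Key Findings

The proposed pruning method significantly reduces model parameters and FLOPs on the ImageNet dataset.
Because sparsity regularization identifies the important channels, high pruning ratios are achievable with minimal accuracy loss.
The three-stage pipeline (sparse training, pruning, and fine-tuning) effectively preserves model performance after compression.
By lowering compute and memory requirements, the method makes visual Transformers more practical to deploy on resource-constrained mobile devices.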
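
To make the link between channel pruning and parameter/FLOPs reduction concrete, here is a rough back-of-the-envelope sketch for one two-layer MLP block. The dimensions (768 and 3072) and the 50% keep ratio are illustrative assumptions, not figures reported in the paper, and `mlp_params` and `mlp_macs_per_token` are hypothetical helper names.

```python
def mlp_params(d_model: int, d_hidden: int) -> int:
    """Weights plus biases of a d_model -> d_hidden -> d_model MLP block."""
    return d_model * d_hidden + d_hidden + d_hidden * d_model + d_model


def mlp_macs_per_token(d_model: int, d_hidden: int) -> int:
    """Multiply-accumulate operations per token for the two linear layers."""
    return 2 * d_model * d_hidden


# Illustrative sizes only (assumed, not taken from the paper).
d_model, d_hidden = 768, 3072
kept_hidden = int(d_hidden * 0.5)  # assume half of the hidden channels survive

params_before = mlp_params(d_model, d_hidden)
params_after = mlp_params(d_model, kept_hidden)
macs_before = mlp_macs_per_token(d_model, d_hidden)
macs_after = mlp_macs_per_token(d_model, kept_hidden)

print(f"params: {params_before} -> {params_after}")
print(f"MACs per token: {macs_before} -> {macs_after}")
```

Because whole channels are removed, both counts shrink roughly in proportion to the fraction of channels pruned, which is why structured pruning yields savings on ordinary dense hardware.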


This review was generated by AI and reviewed by a human editor.