QUICK REVIEW

[Paper Review] Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

Torsten Hoefler, Dan Alistarh|arXiv (Cornell University)|Jan 31, 2021

Machine Learning and ELM | Citations: 341

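One-Line Summary

This survey comprehensively examines sparsification techniques for pruning and growth in deep networks, detailing the methods, theory, and hardware considerations needed for efficient inference and training.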

ABSTRACT

The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial of sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation, the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned parameter efficiency that could serve as a baseline for comparison of different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.

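Motivation and Objectives

Survey and classify sparsification approaches for neural networks.
Explain the mathematical foundations of sparsity and practical training strategies.
Provide guidance for practitioners who want to apply sparsity today.
Discuss hardware implications and the open problems that should guide future research.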

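Proposed Approach

Classify sparsification by what is pruned, when pruning happens, and how sparsity is achieved.
Describe deterministic and probabilistic formulations of training with sparsity, including SGD, Fisher/Hessian perspectives, and Bayesian variational methods.
Describe pruning and growth during training, including mechanisms that re-add connections to preserve model capacity.
Outline practical methods for inducing sparsity in full convolutional and transformer architectures.
Discuss software and hardware considerations for accelerating sparse models, and propose evaluation benchmarks.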
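The prune-and-grow idea in the list above can be sketched in a few lines of NumPy. This is a minimal illustration, not the survey's own algorithm: it assumes magnitude as the pruning criterion and random regrowth as the growth rule, and the names `magnitude_mask` and `prune_and_grow` are hypothetical helpers introduced here.

```python
import numpy as np


def magnitude_mask(weights, sparsity):
    """Boolean mask that removes the `sparsity` fraction of smallest-magnitude weights."""
    n_drop = int(round(sparsity * weights.size))
    mask = np.ones(weights.size, dtype=bool)
    if n_drop > 0:
        # argpartition places the n_drop smallest magnitudes in the first n_drop slots.
        drop = np.argpartition(np.abs(weights).ravel(), n_drop - 1)[:n_drop]
        mask[drop] = False
    return mask.reshape(weights.shape)


def prune_and_grow(weights, mask, fraction, rng):
    """One prune-and-regrow step that keeps the number of active weights constant."""
    w = weights.ravel().copy()
    m = mask.ravel().copy()
    active = np.flatnonzero(m)
    # Collected before pruning so that freshly dropped slots are not regrown at once.
    inactive = np.flatnonzero(~m)
    n_swap = min(int(fraction * active.size), inactive.size)
    if n_swap == 0:
        return w.reshape(weights.shape), m.reshape(mask.shape)
    # Prune: deactivate the n_swap active weights with the smallest magnitude.
    dropped = active[np.argsort(np.abs(w[active]))[:n_swap]]
    m[dropped] = False
    w[dropped] = 0.0
    # Grow: activate the same number of previously inactive positions at random
    # (assumed growth rule; gradient-based criteria are another option).
    grown = rng.choice(inactive, size=n_swap, replace=False)
    m[grown] = True
    w[grown] = 0.0  # new connections start at zero and would be learned by later training
    return w.reshape(weights.shape), m.reshape(mask.shape)


rng = np.random.default_rng(0)
dense = rng.normal(size=(8, 8))
mask = magnitude_mask(dense, sparsity=0.75)
sparse = dense * mask
new_weights, new_mask = prune_and_grow(sparse, mask, fraction=0.25, rng=rng)
print(mask.sum(), new_mask.sum())  # both counts are equal: the sparsity level is preserved
```

In a real training loop, a step like this would run every few hundred iterations between ordinary gradient updates, with the mask applied to the weights after each update.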

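Results

Research Questions

RQ1: What are the main sparsification methods for pruning and growth in deep neural networks?
RQ2: How does sparsity affect training dynamics and generalization?
RQ3: What are effective ways to exploit sparsity for inference and training on real hardware?
RQ4: Which metrics and benchmarks should be used to compare sparse networks?
RQ5: What open problems remain in advancing sparse deep learning?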

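Key Findings

Sparsification methods can reduce model size by 10x to 100x with almost no loss in accuracy.
Sparsification offers potential memory, compute, and energy savings for both mobile deployments and large-scale models.
Pruning techniques are linked to training dynamics and can have regularization and robustness effects.
Variational and Bayesian perspectives provide principled approaches for inducing and measuring sparsity.
New methods are emerging rapidly, which creates a need for common benchmarks that enable fair comparison.
Sparse training and growth strategies can maintain, or even improve, performance while reducing resource use.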
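To make the 10x to 100x figure concrete, the sketch below relates a sparsity level to the reduction in parameter count and to the bytes actually stored. It assumes float32 values, int32 indices, and a CSR (compressed sparse row) layout; these storage choices and the helper names are illustrative assumptions, not details taken from the survey.

```python
def param_reduction(sparsity):
    """Factor by which the number of nonzero parameters shrinks at a given sparsity."""
    return 1.0 / (1.0 - sparsity)


def dense_bytes(rows, cols, value_bytes=4):
    """Storage for a dense matrix of float32 values."""
    return rows * cols * value_bytes


def csr_bytes(rows, cols, sparsity, value_bytes=4, index_bytes=4):
    """Storage for the same matrix in CSR: values, column indices, and row pointers."""
    nnz = round(rows * cols * (1.0 - sparsity))
    return nnz * (value_bytes + index_bytes) + (rows + 1) * index_bytes


for s in (0.90, 0.99):
    ratio = dense_bytes(1000, 1000) / csr_bytes(1000, 1000, s)
    print(f"sparsity={s:.2f}  params: {param_reduction(s):.0f}x  CSR bytes: {ratio:.1f}x")
```

Under these assumptions, 90% sparsity corresponds to a 10x reduction in parameters and 99% to 100x, while the stored size shrinks by roughly half as much because each remaining value also carries an index. This is one reason storage format and hardware support matter when sparsity is exploited in practice.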

This review was generated by AI and checked by a human editor.