QUICK REVIEW

[Paper Review] Compacting, Picking and Growing for Unforgetting Continual Learning

Steven C. Y. Hung, Cheng-Hao Tu|arXiv (Cornell University)|Oct 15, 2019

Domain Adaptation and Few-Shot Learning | References: 45 | Citations: 134

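One-Line Summary

This paper presents CPG (Compacting, Picking and Growing), a continual learning framework that compresses the model by pruning, picks the critical old weights with a differentiable mask, and grows the network only when necessary. It prevents forgetting across many tasks while keeping growth compact, outperforms several baselines, and maintains a compact knowledge base for future tasks.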

ABSTRACT

Continual lifelong learning is essential to many applications. In this paper, we propose a simple but effective approach to continual deep learning. Our approach leverages the principles of deep model compression, critical weights selection, and progressive networks expansion. By enforcing their integration in an iterative manner, we introduce an incremental learning method that is scalable to the number of sequential tasks in a continual learning process. Our approach is easy to implement and owns several favorable characteristics. First, it can avoid forgetting (i.e., learn new tasks while remembering all previous tasks). Second, it allows model expansion but can maintain the model compactness when handling sequential tasks. Besides, through our compaction and selection/expansion mechanism, we show that the knowledge accumulated through learning previous tasks is helpful to build a better model for the new tasks compared to training the models independently with tasks. Experimental results show that our approach can incrementally learn a deep model tackling multiple tasks without forgetting, while the model compactness is maintained with the performance more satisfiable than individual task training.

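Motivation and Objectives

Motivate continual lifelong learning: avoid catastrophic forgetting across sequential tasks while remaining scalable.
Propose a simple and effective framework that combines model compression, critical weight selection, and progressive network expansion.
Show that reusing knowledge from past tasks improves learning of new tasks compared with training each task independently.
Show that the approach can support an unbounded sequence of tasks while keeping the model size compact.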

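Proposed Method

Compress the model for the current task by gradual pruning while preserving its performance.
Introduce a learnable binary mask that picks which old-task weights to reuse for the new task.
Reuse the released (redundant) weights for the new task, and expand the architecture if the accuracy goal is not met.
While learning the new task, keep the old-task weights fixed to prevent forgetting, and train the new-task weights together with the selection mask and the other released weights.
After training on the new task, further prune the weights newly added for that task to obtain a compact representation.
Repeat compaction, picking, and growing iteratively for subsequent tasks.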
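The steps above can be sketched as bookkeeping over a flat weight vector. This is a minimal toy illustration, not the authors' implementation: the names `learn_task`, `magnitude_prune`, `owner`, and `grow_by` are hypothetical, random draws stand in for gradient training and for the learned mask, magnitude-based pruning and the 0.5 mask threshold are assumptions, and growth is requested by the caller rather than triggered by a measured accuracy goal.

```python
import numpy as np

FREE = 0  # owner id for weights that no task has claimed yet


def magnitude_prune(weights, candidates, keep_ratio):
    """Return a boolean mask keeping the largest-|w| share of the candidates."""
    idx = np.flatnonzero(candidates)
    n_keep = int(np.ceil(keep_ratio * idx.size))
    order = np.argsort(-np.abs(weights[idx]))  # descending magnitude
    kept = np.zeros_like(candidates)
    kept[idx[order[:n_keep]]] = True
    return kept


def learn_task(task_id, weights, owner, rng, keep_ratio=0.5, grow_by=0):
    """One compact-pick-grow round (toy version, assumptions noted above)."""
    if grow_by:
        # Grow: append fresh free weights (here the caller decides when).
        weights = np.concatenate([weights, np.zeros(grow_by)])
        owner = np.concatenate([owner, np.full(grow_by, FREE)])
    frozen = owner != FREE
    free = ~frozen

    # Pick: binarize a real-valued mask over the frozen old-task weights.
    real_mask = rng.uniform(0.0, 1.0, size=weights.shape)  # stand-in for a learned mask
    picked = frozen & (real_mask > 0.5)

    # Train: only free weights change; frozen weights are never written.
    weights = weights.copy()
    weights[free] = rng.normal(size=int(free.sum()))  # stand-in for training

    # Compact: keep the strongest new weights, release the rest as free.
    kept = magnitude_prune(weights, free, keep_ratio)
    weights[free & ~kept] = 0.0
    owner = owner.copy()
    owner[kept] = task_id
    return weights, owner, picked | kept  # weights this task uses at inference


rng = np.random.default_rng(0)
weights, owner = np.zeros(8), np.full(8, FREE)
weights, owner, used1 = learn_task(1, weights, owner, rng)
task1_snapshot = weights[owner == 1].copy()
weights, owner, used2 = learn_task(2, weights, owner, rng, grow_by=4)
# Task 1 weights are bit-for-bit unchanged after learning task 2.
assert np.array_equal(weights[owner == 1], task1_snapshot)
```

The `owner` array is the design choice that makes forgetting impossible in this sketch: a weight is written only while it is free, so every earlier task reads exactly the values it was trained with.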

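Experimental Results

Research Questions

RQ1: Does the compact-pick-grow cycle prevent forgetting while enabling scalable growth over an unbounded sequence of tasks?
RQ2: Does reusing a compact set of critical old-task weights via a learnable mask improve new-task performance compared with training from scratch or sharing all weights?
RQ3: How does the proposed method compare with related continual learning approaches (e.g., ProgressiveNet, PackNet, DEN) in accuracy and model size?
RQ4: How much architecture expansion is needed to reach the target accuracy without excessive growth?
RQ5: Does the learned knowledge base benefit performance on future tasks compared with training each task independently?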

Key Findings

Method	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	Avg.	Exp.	Red.
PackNet	66.4	80.0	76.2	78.4	80.0	79.8	67.8	61.4	68.8	77.2	79.0	59.4	66.4	57.2	36.0	54.2	51.6	58.8	67.8	83.2	67.5	1	0
PAE	67.2	77.0	78.6	76.0	84.4	81.2	77.6	80.0	80.4	87.8	85.4	77.8	79.4	79.6	51.2	68.4	68.6	68.6	83.2	88.8	77.1	2	0
CPG	65.2	76.6	79.8	81.4	86.6	84.8	83.4	85.0	87.2	89.2	90.8	82.4	85.6	85.2	53.2	74.4	70.0	73.4	88.8	94.8	80.9	1.5	0.41
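As a quick consistency check on the table, the Avg. column can be recomputed from the twenty per-task accuracies. The snippet below is only an arithmetic sketch over the numbers printed above; the names `rows`, `mean_accuracy`, and `cpg_wins` are illustrative.

```python
# Per-task accuracies and the reported Avg. value, copied from the table above.
rows = {
    "PackNet": ([66.4, 80.0, 76.2, 78.4, 80.0, 79.8, 67.8, 61.4, 68.8, 77.2,
                 79.0, 59.4, 66.4, 57.2, 36.0, 54.2, 51.6, 58.8, 67.8, 83.2], 67.5),
    "PAE": ([67.2, 77.0, 78.6, 76.0, 84.4, 81.2, 77.6, 80.0, 80.4, 87.8,
             85.4, 77.8, 79.4, 79.6, 51.2, 68.4, 68.6, 68.6, 83.2, 88.8], 77.1),
    "CPG": ([65.2, 76.6, 79.8, 81.4, 86.6, 84.8, 83.4, 85.0, 87.2, 89.2,
             90.8, 82.4, 85.6, 85.2, 53.2, 74.4, 70.0, 73.4, 88.8, 94.8], 80.9),
}


def mean_accuracy(scores):
    """Average accuracy rounded to one decimal, matching the table's precision."""
    return round(sum(scores) / len(scores), 1)


for name, (scores, reported) in rows.items():
    print(f"{name}: recomputed {mean_accuracy(scores)}, reported {reported}")

# Number of tasks on which CPG has the highest accuracy of the three methods.
cpg_wins = sum(
    1 for i in range(20)
    if max(rows, key=lambda m: rows[m][0][i]) == "CPG"
)
print("CPG is best on", cpg_wins, "of 20 tasks")
```

The recomputed averages agree with the reported column for all three methods, and CPG has the highest accuracy on 18 of the 20 tasks (all except tasks 1 and 2).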

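CPG learns new tasks sequentially while preserving old-task performance exactly.
Compared with the baselines, CPG achieves better or comparable accuracy with only modest expansion, keeping the model compact.
Using the critical-weight mask filters out unnecessary old-task weights, which improves performance on the next task.
CPG expands less than several baselines (e.g., DEN, ProgressiveNet) while maintaining or improving accuracy across multiple tasks.
The approach builds a reusable knowledge base, so future tasks are learned better than when each task is trained independently.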

This review was generated by AI and checked by a human editor.