QUICK REVIEW

[Paper Review] SIESTA: Efficient Online Continual Learning with Sleep

Md Yousuf Harun, Jhair Gallardo|arXiv (Cornell University)|Mar 19, 2023

Domain Adaptation and Few-Shot Learning | Cited by 7

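One-sentence summary

SIESTA uses a wake/sleep framework for online continual learning with offline memory consolidation: it updates only the output layer during the online (wake) phase and consolidates memory during sleep, greatly reducing compute and memory while achieving near-offline performance on ImageNet-1K.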

ABSTRACT

In supervised continual learning, a deep neural network (DNN) is updated with an ever-growing data stream. Unlike the offline setting where data is shuffled, we cannot make any distributional assumptions about the data stream. Ideally, only one pass through the dataset is needed for computational efficiency. However, existing methods are inadequate and make many assumptions that cannot be made for real-world applications, while simultaneously failing to improve computational efficiency. In this paper, we propose a novel continual learning method, SIESTA based on wake/sleep framework for training, which is well aligned to the needs of on-device learning. The major goal of SIESTA is to advance compute efficient continual learning so that DNNs can be updated efficiently using far less time and energy. The principal innovations of SIESTA are: 1) rapid online updates using a rehearsal-free, backpropagation-free, and data-driven network update rule during its wake phase, and 2) expedited memory consolidation using a compute-restricted rehearsal policy during its sleep phase. For memory efficiency, SIESTA adapts latent rehearsal using memory indexing from REMIND. Compared to REMIND and prior arts, SIESTA is far more computationally efficient, enabling continual learning on ImageNet-1K in under 2 hours on a single GPU; moreover, in the augmentation-free setting it matches the performance of the offline learner, a milestone critical to driving adoption of continual learning in real-world applications.

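Motivation and objectives

Formalize online updates with offline memory consolidation for supervised continual learning.
Develop a wake/sleep algorithm (SIESTA) that enables rapid online updates and memory consolidation under compute and memory constraints.
Improve memory efficiency during rehearsal by using latent rehearsal and memory indexing.
Demonstrate SIESTA's efficiency and performance on ImageNet-1K and other datasets without augmentation.
Show robustness to arbitrary data orderings and achieve zero forgetting in the augmentation-free setting.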

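Proposed method

Two-phase learning: the wake phase performs lightweight online updates of the output layer using running class means; the sleep phase performs rehearsal-based offline updates of G and F while H stays frozen.
Memory-efficient latent rehearsal is achieved by storing intermediate representations quantized with Product Quantization (PQ) trained on the pretraining data (PQ allows Z to be reconstructed).
Classification computes class scores with a cosine softmax that uses a learned temperature.
During online updates, the output layer is updated as f_k <- (c_k f_k + z_t) / (c_k + 1), where c_k is a per-class counter.
Sleep-phase rehearsal selects minibatches of stored z representations and updates G and F by backpropagation for at most m gradient updates, keeping H frozen.
The network architecture is MobileNetV3-L, with H as the first 8 layers and G and F as the upper layers. PQ comes from FAISS and compresses Z for memory efficiency.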
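
The wake-phase update rule and the cosine-softmax scoring described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the class name `WakeClassifier`, the fixed `temperature` value (the method learns it), and the choice to multiply cosine similarities by the temperature are all assumptions made here.

```python
import numpy as np


class WakeClassifier:
    """Hypothetical output layer updated by per-class running means (no backprop)."""

    def __init__(self, num_classes, dim, temperature=10.0):
        self.f = np.zeros((num_classes, dim))      # class weight vectors f_k
        self.c = np.zeros(num_classes, dtype=int)  # class counters c_k
        self.temperature = temperature             # assumption: fixed here, learned in SIESTA

    def wake_update(self, z_t, k):
        # Running mean: f_k <- (c_k * f_k + z_t) / (c_k + 1), then count the sample.
        self.f[k] = (self.c[k] * self.f[k] + z_t) / (self.c[k] + 1)
        self.c[k] += 1

    def predict_proba(self, z):
        # Cosine similarity between z and every f_k, scaled, then softmax.
        eps = 1e-12
        z_n = z / (np.linalg.norm(z) + eps)
        f_n = self.f / (np.linalg.norm(self.f, axis=1, keepdims=True) + eps)
        logits = self.temperature * (f_n @ z_n)
        logits = logits - logits.max()  # numerical stability
        p = np.exp(logits)
        return p / p.sum()


clf = WakeClassifier(num_classes=2, dim=3)
clf.wake_update(np.array([1.0, 0.0, 0.0]), k=0)
clf.wake_update(np.array([3.0, 0.0, 0.0]), k=0)
clf.wake_update(np.array([0.0, 1.0, 0.0]), k=1)
probs = clf.predict_proba(np.array([2.0, 0.1, 0.0]))
print(clf.f, clf.c, probs)
```

Because each update touches only one row of the weight matrix and needs no gradients, its cost per sample is linear in the feature dimension, which is consistent with the review's description of the wake phase as lightweight.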

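Experimental results

Research questions

RQ1 Can wake-phase online updates be performed without rehearsal while still achieving efficient continual learning?
RQ2 Can offline memory consolidation during sleep via latent rehearsal achieve performance competitive with or better than state-of-the-art continual learners on large-scale datasets?
RQ3 Without task labels, can SIESTA maintain performance under arbitrary data orderings, both iid and class-incremental?
RQ4 How do its memory and compute efficiency compare with existing continual learning methods on ImageNet-1K and other datasets?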

Key findings

Method	P (M)	μ (top-5 %)	α (top-5 %)	M (GB)	U (M)	GFLOPS (↑)
Offline	5.48	—	83.31	192.87	768.70	—
DER	54.80	81.87	70.15	20.99	12.43	7944.60
ER	5.48	76.32	63.92	19.59	11.53	1294.10
REMIND	5.48	81.77	74.31	2.02	11.53	10139.00
SIESTA	5.48	88.33	83.59	2.02	11.53	19326.00
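
As a quick check on the table above, the following sketch computes each method's accuracy gap to the offline learner and its memory saving. It assumes that α is the final top-5 accuracy and that M is the memory footprint in GB; the dictionary simply copies the table values, and the key names are chosen here for illustration.

```python
# Values copied from the table above: alpha (top-5 %) and M (GB).
rows = {
    "Offline": {"alpha": 83.31, "mem_gb": 192.87},
    "DER": {"alpha": 70.15, "mem_gb": 20.99},
    "ER": {"alpha": 63.92, "mem_gb": 19.59},
    "REMIND": {"alpha": 74.31, "mem_gb": 2.02},
    "SIESTA": {"alpha": 83.59, "mem_gb": 2.02},
}

offline = rows["Offline"]
summary = {
    name: {
        # Positive gap means the method ends above the offline learner.
        "alpha_gap": round(r["alpha"] - offline["alpha"], 2),
        # How many times less memory than the offline learner.
        "mem_ratio": round(offline["mem_gb"] / r["mem_gb"], 1),
    }
    for name, r in rows.items()
    if name != "Offline"
}

for name, s in summary.items():
    print(f"{name}: gap to offline {s['alpha_gap']:+.2f} points, "
          f"{s['mem_ratio']}x less memory")
```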

SIESTA matches the offline learner on ImageNet-1K in the augmentation-free setting, achieving zero forgetting relative to the offline model.
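SIESTA uses far fewer parameters and much less memory, and requires substantially fewer updates, than competing methods (e.g., parameters 11.68–116.89M; memory 19–22 GB; baseline updates 11.53M; in the reported setup SIESTA achieves 2.02e7 updates).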
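With augmentation, SIESTA greatly outperforms DER, ER, and REMIND in final accuracy (by +15.18, +15.78, and +4.03 points, respectively).
Even without augmentation, SIESTA can train on ImageNet-1K in under 2 hours, far faster than competing methods.
Offline consolidation during sleep consistently improves post-sleep accuracy (about 4.25% absolute improvement per sleep cycle on average).
SIESTA is robust to data ordering (iid vs. class-incremental), with results not significantly different from the offline model in the main settings.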


This review was generated by AI and checked by a human editor.