QUICK REVIEW

[Paper Review] Degradation of Feature Space in Continual Learning

Chiara Lanza, Roberto Pereira|arXiv (Cornell University)|Feb 6, 2026

Domain Adaptation and Few-Shot Learning | Citations: 0

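One-Sentence Summary

Summary: This paper investigates whether enforcing feature-space isotropy helps continual learning (CL). It shows that isotropy regularization does not improve, and generally degrades, accuracy in CL, highlighting a fundamental difference in feature geometry between centralized and continual learning.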

ABSTRACT

Centralized training is the standard paradigm in deep learning, enabling models to learn from a unified dataset in a single location. In such a setup, isotropic feature distributions naturally arise as a means to support well-structured and generalizable representations. In contrast, continual learning operates on streaming and non-stationary data, and trains models incrementally, inherently facing the well-known plasticity-stability dilemma. In such settings, learning dynamics tend to yield an increasingly anisotropic feature space. This raises a fundamental question: should isotropy be enforced to achieve a better balance between stability and plasticity, and thereby mitigate catastrophic forgetting? In this paper, we investigate whether promoting feature-space isotropy can enhance representation quality in continual learning. Through experiments using contrastive continual learning techniques on CIFAR-10 and CIFAR-100 data, we find that isotropic regularization fails to improve, and can in fact degrade, model accuracy in continual settings. Our results highlight essential differences in feature geometry between centralized and continual learning, suggesting that isotropy, while beneficial in centralized setups, may not constitute an appropriate inductive bias for non-stationary learning scenarios.

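Research Motivation and Objectives

Understand the geometric properties of the feature space in continual learning.
Assess whether isotropy improves the stability-plasticity trade-off in CL.
Compare contrastive learning approaches for CL in terms of isotropy and downstream accuracy.
Introduce an isotropy regularization term into CL training and evaluate its effect on representations and performance.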

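Proposed Method

Develop a mathematical framework for simulating data at different levels of isotropy.
Generalize isotropy metrics from 3D to higher-dimensional spaces (IsoEntropy and related concepts).
Evaluate several contrastive continual learning methods (SupCon, Co2L, SupCP, NCI) on CIFAR-10/100.
Add an isotropy regularization term (IsoScore*) to the CL loss and examine its effect.
Use synthetic baselines alongside the real-data experiments to interpret the isotropy metrics.
Evaluate feature-space geometry using Mahalanobis intra-class and inter-class distances as a measure of class separation.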
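The isotropy metrics named above (IsoEntropy, IsoScore*) are not defined in this review, so the sketch below uses a different, simpler proxy to illustrate what measuring isotropy can look like: the normalized participation ratio of the feature covariance, (tr C)^2 / (d · tr(C^2)). It equals 1 when variance is spread equally over all d directions and falls to 1/d when all variance lies along a single direction. The function names `covariance` and `isotropy_proxy` and the toy data are assumptions made for illustration, not the paper's implementation.

```python
import random


def covariance(X):
    """Sample covariance matrix (d x d) of the row vectors in X (n x d)."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    C = [[0.0] * d for _ in range(d)]
    for row in X:
        centered = [row[j] - mean[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                C[a][b] += centered[a] * centered[b]
    return [[v / (n - 1) for v in r] for r in C]


def isotropy_proxy(X):
    """Normalized participation ratio (tr C)^2 / (d * tr(C^2)), in [1/d, 1].

    Hypothetical stand-in for an isotropy score; NOT IsoEntropy or IsoScore*.
    """
    C = covariance(X)
    d = len(C)
    trace = sum(C[i][i] for i in range(d))
    # For a symmetric matrix, tr(C^2) equals the sum of its squared entries.
    trace_sq = sum(v * v for r in C for v in r)
    return trace * trace / (d * trace_sq)


# Toy data: one roughly isotropic cloud and one strongly anisotropic cloud.
rng = random.Random(0)
iso = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(2000)]
aniso = [[rng.gauss(0, 5), rng.gauss(0, 1), rng.gauss(0, 0.1)]
         for _ in range(2000)]

print("isotropic cloud:  ", round(isotropy_proxy(iso), 3))
print("anisotropic cloud:", round(isotropy_proxy(aniso), 3))
```

The participation ratio was chosen because it needs only the covariance matrix, not an eigendecomposition, so the sketch runs with the standard library alone. In practice the same quantity would be computed on encoder features with a numerical library.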

Figure 1: t-SNE visualization for the CIFAR-10 dataset with centralized learning and three different CL (Co2L) scenarios: $50+50$ (2 experiences of 5 classes each), $40+30+30$ (3 experiences of 4, 3, and 3 classes), and $20\times 5$ (5 experiences of 2 classes each).

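Experimental Results

Research Questions

RQ1: Does promoting feature-space isotropy improve representation quality and final accuracy in continual learning?
RQ2: Under non-stationary data streams, how do different CL methods affect the isotropy of the learned representations?
RQ3: Does an isotropy regularization term improve or harm downstream performance in CL?
RQ4: How do isotropy metrics relate to actual classification accuracy in CL settings?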

Key Findings

Scenario	Method	CIFAR-10 Accuracy (with std)	CIFAR-10 Mahalanobis Dist. (with std)	CIFAR-100 Accuracy (with std)	CIFAR-100 Mahalanobis Dist. (with std)
Centralized	SupCP	94.93 (0.25)	2.48 (0.08)	73.97 (0.20)	2.35 (0.01)
Centralized	SupCon	94.98 (0.07)	10.86 (0.09)	70.24 (0.32)	3.32 (0.05)
50+50	NCI	86.94 (0.31)	1.42 (0.02)	58.79 (0.21)	1.45 (0.01)
50+50	Co2L	84.27 (0.16)	2.90 (0.02)	56.37 (0.42)	1.94 (0.02)
50+50	SupCP	78.24 (1.21)	1.42 (0.02)	55.89 (0.33)	1.34 (0.01)
50+50	SupCon	75.88 (0.39)	2.52 (0.04)	51.65 (0.31)	1.73 (0.02)
40+30+30	NCI	78.83 (0.76)	1.24 (0.02)	50.38 (0.34)	1.14 (0.01)
40+30+30	Co2L	77.22 (0.19)	2.03 (0.03)	48.96 (0.33)	1.67 (0.02)
40+30+30	SupCP	64.80 (0.61)	1.16 (0.01)	46.69 (0.39)	1.01 (0.01)
40+30+30	SupCon	62.95 (0.15)	1.96 (0.05)	43.38 (0.60)	1.35 (0.03)
20×5	NCI	72.14 (0.58)	1.06 (0.07)	44.81 (0.40)	1.11 (0.01)
20×5	Co2L	70.64 (1.20)	1.40 (0.10)	42.66 (0.65)	1.55 (0.02)
20×5	SupCP	54.47 (3.98)	1.03 (0.07)	41.87 (0.42)	0.99 (0.02)
20×5	SupCon	47.58 (1.68)	1.22 (0.05)	36.19 (0.48)	1.27 (0.02)
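
The table reports a Mahalanobis distance for each method, but this review does not say exactly how it is computed. As one plausible reading of an intra-/inter-class measurement, the sketch below computes the mean Mahalanobis distance of samples to their own class mean (intra) and between pairs of class means (inter), using a pooled within-class covariance. The pooled covariance, the `ridge` term, and all function names are assumptions for illustration, not the paper's procedure.

```python
import math


def mean_vec(X):
    """Column-wise mean of the row vectors in X."""
    n, d = len(X), len(X[0])
    return [sum(row[j] for row in X) / n for j in range(d)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def mahalanobis(u, v, cov):
    """sqrt((u - v)^T cov^{-1} (u - v)), via a linear solve."""
    diff = [a - b for a, b in zip(u, v)]
    z = solve(cov, diff)
    return math.sqrt(sum(a * b for a, b in zip(diff, z)))


def pooled_covariance(classes, ridge=1e-6):
    """Pooled within-class covariance; `ridge` keeps it invertible."""
    d = len(classes[0][0])
    S = [[0.0] * d for _ in range(d)]
    total = 0
    for X in classes:
        mu = mean_vec(X)
        for row in X:
            c = [row[j] - mu[j] for j in range(d)]
            for a in range(d):
                for b in range(d):
                    S[a][b] += c[a] * c[b]
        total += len(X)
    dof = total - len(classes)
    return [[S[a][b] / dof + (ridge if a == b else 0.0) for b in range(d)]
            for a in range(d)]


def intra_inter_distances(classes):
    """Mean sample-to-own-mean and mean pairwise mean-to-mean distances."""
    cov = pooled_covariance(classes)
    means = [mean_vec(X) for X in classes]
    intra = [mahalanobis(row, mu, cov)
             for X, mu in zip(classes, means) for row in X]
    inter = [mahalanobis(means[i], means[j], cov)
             for i in range(len(means)) for j in range(i + 1, len(means))]
    return sum(intra) / len(intra), sum(inter) / len(inter)


# Toy usage: two well-separated 2-D classes (hypothetical data).
class_a = [[0, 0], [2, 0], [0, 2], [2, 2]]
class_b = [[10, 0], [12, 0], [10, 2], [12, 2]]
intra, inter = intra_inter_distances([class_a, class_b])
print("intra:", round(intra, 3), "inter:", round(inter, 3))
```

A large inter-class distance relative to the intra-class distance indicates well-separated classes. Whether the table's single column corresponds to either quantity, or to a ratio of the two, is not stated in this review.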

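In CL, isotropy tends to decrease as the number of experiences grows, whereas centralized training yields more isotropic representations.
Isotropy regularization generally lowers downstream accuracy in CL, sometimes substantially, on both CIFAR-10 and CIFAR-100.
Certain CL methods that use distillation (Co2L and NCI) keep isotropy high as experiences accumulate, but higher isotropy does not necessarily coincide with higher accuracy.
In centralized training, higher isotropy correlates with higher accuracy in some settings, but this correlation breaks down in CL settings.
Isotropy regularization (IsoScore*) raises the isotropy metrics in CL but often lowers accuracy, indicating that isotropy alone is not a sufficient objective for CL.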
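
This review does not state how the IsoScore* term enters the training objective. A generic form, assumed here purely for illustration, adds a weighted penalty to the CL loss. The symbols $\mathcal{L}_{\mathrm{CL}}$, $\lambda$, $Z_\theta$, and $\mathrm{Iso}(\cdot)$ are placeholders, not the paper's notation, and $\mathrm{Iso}$ stands for any isotropy score scaled to $[0, 1]$:

```latex
\mathcal{L}_{\mathrm{total}}(\theta)
  = \mathcal{L}_{\mathrm{CL}}(\theta)
  + \lambda \,\bigl(1 - \mathrm{Iso}(Z_\theta)\bigr),
\qquad \lambda \ge 0, \quad \mathrm{Iso}(Z_\theta) \in [0, 1]
```

Under this reading, $\lambda = 0$ recovers the unregularized CL method, and a larger $\lambda$ pushes the features $Z_\theta$ toward isotropy. The last finding then says that this raises the isotropy score without a matching gain in accuracy.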

This review was generated by AI and checked by a human editor.