QUICK REVIEW

[Paper Review] Continual learning: A comparative study on how to defy forgetting in classification tasks.

Matthias De Lange, Rahaf Aljundi | arXiv (Cornell University) | Sep 18, 2019

Domain Adaptation and Few-Shot Learning | 162 citations

TL;DR

This paper presents a comprehensive study of continual learning for task-incremental image classification and introduces a framework that continually determines the stability-plasticity trade-off of the learner. It evaluates 10 state-of-the-art methods and 4 baselines on Tiny ImageNet and iNaturalist, studies how model capacity, weight decay and dropout regularization, and task order affect performance, and qualitatively compares the methods in terms of memory, computation time, and storage.

ABSTRACT

Artificial neural networks thrive in solving the classification problem for a particular rigid task, where the network resembles a static entity of knowledge, acquired through generalized learning behaviour from a distinct training phase. However, endeavours to extend this knowledge without targeting the original task usually result in a catastrophic forgetting of this task. Continual learning shifts this paradigm towards a network that can continually accumulate knowledge over different tasks without the need for retraining from scratch, with methods in particular aiming to alleviate forgetting. We focus on task-incremental classification, where tasks arrive in a batch-like fashion, and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine stability-plasticity trade-off of the continual learner, 3) a comprehensive experimental comparison of 10 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize which method performs best, both on balanced Tiny Imagenet and a large-scale unbalanced iNaturalist datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time and storage.

Motivation & Objective

To address catastrophic forgetting in continual learning by developing a framework that dynamically manages the stability-plasticity trade-off.
To provide a comprehensive taxonomy and overview of state-of-the-art continual learning methods for task-incremental learning.
To empirically evaluate 10 state-of-the-art continual learning methods and 4 baselines on balanced and unbalanced image classification benchmarks.
To investigate the impact of model capacity, weight decay, dropout, and task ordering on continual learning performance.
To qualitatively compare methods in terms of memory usage, computation time, and storage requirements.

Proposed method

Proposes a novel framework to continuously assess and adjust the stability-plasticity trade-off during continual learning, enabling adaptive learning behavior.
Employs a systematic experimental setup on two datasets, the balanced Tiny ImageNet and the large-scale, unbalanced iNaturalist, with controlled task sequences.
Applies standard regularization techniques such as weight decay and dropout to analyze their influence on forgetting and accuracy.
Uses a task-incremental protocol in which tasks arrive in a batch-like fashion and are delineated by clear boundaries.
Introduces a unified evaluation protocol to compare methods across metrics including average accuracy, forgetting rate, and forward transfer.
Conducts ablation studies on model capacity and task order to assess robustness and generalization across different settings.
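
The accuracy and forgetting metrics named above can be made concrete with a small sketch. It assumes a common convention rather than the paper's exact definitions: a hypothetical matrix `acc` where `acc[i][j]` is the accuracy on task j measured after training on task i, average accuracy taken over all tasks after the final task, and forgetting taken as the drop from the accuracy a task had right after it was learned. The numbers in the example are invented for illustration and are not results from the paper. Forward transfer is left out because it needs a separate reference model.

```python
from typing import List


def average_accuracy(acc: List[List[float]]) -> float:
    """Mean accuracy over all tasks, measured after training on the last task."""
    final_row = acc[-1]
    return sum(final_row) / len(final_row)


def average_forgetting(acc: List[List[float]]) -> float:
    """Mean drop in accuracy per task between the moment it was first
    learned (acc[j][j]) and the end of the sequence (acc[-1][j]).
    The last task is excluded because nothing was trained after it."""
    n_tasks = len(acc)
    if n_tasks < 2:
        return 0.0
    drops = [acc[j][j] - acc[-1][j] for j in range(n_tasks - 1)]
    return sum(drops) / len(drops)


# Hypothetical accuracies for a 3-task sequence (entries with j > i are unused).
example_acc = [
    [0.60, 0.00, 0.00],
    [0.50, 0.70, 0.00],
    [0.45, 0.60, 0.80],
]

if __name__ == "__main__":
    print(f"average accuracy:   {average_accuracy(example_acc):.3f}")
    print(f"average forgetting: {average_forgetting(example_acc):.3f}")
```

Keeping the full matrix, rather than only the final row, is what allows forgetting to be computed per task after the fact.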

Experimental results

Research questions

RQ1: Which continual learning method achieves the highest average accuracy across both balanced and unbalanced image classification benchmarks?
RQ2: How do model capacity, weight decay, and dropout regularization affect the performance and forgetting behavior of continual learning models?
RQ3: What is the impact of task ordering on the stability and plasticity of continual learning systems?
RQ4: How do different methods compare in terms of memory footprint, computation time, and storage requirements?
RQ5: Can a dynamic stability-plasticity trade-off mechanism improve continual learning performance compared to static or fixed methods?

Key findings

The proposed framework enables dynamic adjustment of the stability-plasticity trade-off, leading to improved generalization and reduced forgetting across tasks.
On Tiny ImageNet, methods with effective regularization and higher model capacity achieved significantly higher average accuracy, with gains up to 15% over baselines.
On iNaturalist, task order had a pronounced effect, with certain methods showing up to 20% performance drop when tasks were presented in suboptimal sequences.
Weight decay and dropout regularization were found to be critical in mitigating forgetting, especially in unbalanced data settings.
Methods with lower memory and computation overhead showed better scalability, though often at the cost of accuracy, highlighting a key trade-off.
Forward transfer was observed to be highly sensitive to model capacity and regularization, with larger models showing stronger positive transfer across tasks.
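
This review does not spell out how the stability-plasticity trade-off is adjusted. One plausible reading, offered only as an illustration and not as the authors' algorithm, is a decay loop: start with a strong stability (regularization) setting, measure accuracy on the new task, and weaken the setting until that accuracy comes within a tolerance of a reference obtained without any forgetting constraint. The function name, parameters, default values, and the toy accuracy model below are all hypothetical.

```python
from typing import Callable


def decay_stability(
    train_and_eval: Callable[[float], float],
    reference_acc: float,
    init_strength: float = 1.0,
    decay: float = 0.5,
    tolerance: float = 0.1,
    max_rounds: int = 10,
) -> float:
    """Return the first (strongest) stability setting, in a decaying
    schedule, whose new-task accuracy reaches (1 - tolerance) * reference_acc.

    train_and_eval: hypothetical callback that trains on the new task with
        the given stability strength and returns new-task accuracy.
    reference_acc: new-task accuracy of an unconstrained (most plastic) run.
    """
    threshold = (1.0 - tolerance) * reference_acc
    strength = init_strength
    for _ in range(max_rounds):
        if train_and_eval(strength) >= threshold:
            return strength
        strength *= decay  # too rigid: trade some stability for plasticity
    return strength  # last value tried by the schedule if none passed


def toy_new_task_accuracy(strength: float) -> float:
    """Toy stand-in for training: stronger regularization lowers accuracy."""
    return 0.8 - 0.3 * strength


if __name__ == "__main__":
    chosen = decay_stability(toy_new_task_accuracy, reference_acc=0.8)
    print(f"chosen stability strength: {chosen}")
```

Starting from the strongest setting and decaying means the loop stops at the most stable configuration that still learns the new task acceptably, which is the sense in which such a scheme favors retaining old tasks.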

This review was created by AI and reviewed by human editors.