[Paper Review] Continual Learning in Generative Adversarial Nets
The paper introduces a continual learning framework for GANs that applies elastic weight consolidation (EWC) to the generator (G), enabling sequential learning of new data distributions without forgetting previously learned ones. It demonstrates this on class-conditional GANs for MNIST and SVHN.
Developments in deep generative models have allowed for tractable learning of high-dimensional data distributions. While the employed learning procedures typically assume that training data is drawn i.i.d. from the distribution of interest, it may be desirable to model distinct distributions which are observed sequentially, such as when different classes are encountered over time. Although conditional variations of deep generative models permit multiple distributions to be modeled by a single network in a disentangled fashion, they are susceptible to catastrophic forgetting when the distributions are encountered sequentially. In this paper, we adapt recent work in reducing catastrophic forgetting to the task of training generative adversarial networks on a sequence of distinct distributions, enabling continual generative modeling.
Motivation & Objective
- Motivate continual learning for deep generative models under sequential distribution shifts.
- Adapt forgetting-prevention techniques to GANs to avoid retraining from scratch.
- Show that a single generator can model multiple distributions encountered over time without storing all past data.
- Evaluate the approach on MNIST with an MLP GAN and on SVHN with a DCGAN to demonstrate practicality.
Proposed method
- Use an augmented generator objective that penalizes changes to parameters identified as critical for previous tasks via Fisher information.
- Compute empirical Fisher information based on the discriminator’s output to identify salient G parameters.
- Apply an EWC-like quadratic penalty to G's parameters during training on new tasks to preserve prior task performance.
- Operate in a conditional GAN setting to associate each distribution with a distinct conditional input y when possible.
- Scale to a sequence of tasks without requiring access to data from previous tasks or regenerated (replayed) samples of it.
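The penalty described in the bullets above can be illustrated with a minimal NumPy sketch. It assumes the standard EWC form, in which (λ/2) · Σ_i F_i (θ_i − θ*_i)² is added to the generator loss, and a diagonal empirical Fisher estimated as the mean of squared per-sample gradients. The function names, the flat parameter vectors, and the toy numbers are illustrative assumptions, not the paper's code; in practice the per-sample gradients would come from backpropagating a loss based on the discriminator's output through G.

```python
import numpy as np

def empirical_fisher_diag(per_sample_grads):
    """Diagonal empirical Fisher: mean of squared per-sample gradients.

    per_sample_grads has shape (num_samples, num_params). Here the values
    are supplied directly; in a real setup they would be gradients of a
    discriminator-based loss with respect to the generator parameters.
    """
    grads = np.asarray(per_sample_grads, dtype=float)
    return np.mean(grads ** 2, axis=0)

def ewc_penalty(theta, theta_star, fisher_diag, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    diff = np.asarray(theta, dtype=float) - np.asarray(theta_star, dtype=float)
    return 0.5 * lam * np.sum(fisher_diag * diff ** 2)

def ewc_penalty_grad(theta, theta_star, fisher_diag, lam):
    """Gradient of the penalty w.r.t. theta, to be added to the generator gradient."""
    diff = np.asarray(theta, dtype=float) - np.asarray(theta_star, dtype=float)
    return lam * fisher_diag * diff

# Toy example (hypothetical numbers): parameter 0 mattered for the old task,
# parameter 1 did not, so only movement in parameter 0 is penalized.
fisher = empirical_fisher_diag([[1.0, 0.0], [3.0, 0.0]])
theta_star = np.array([1.0, 1.0])   # generator parameters after the previous task
theta = np.array([2.0, 5.0])        # current parameters while training on the new task
penalty = ewc_penalty(theta, theta_star, fisher, lam=2.0)
grad = ewc_penalty_grad(theta, theta_star, fisher, lam=2.0)
```

In this toy case the second parameter moves far from its old value at no cost because its Fisher entry is zero, while the first is pulled back toward its previous value. That is the mechanism by which an EWC-style penalty leaves capacity free for new tasks while protecting parameters that earlier tasks depend on.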
Experimental results
Research questions
- RQ1: Can a GAN be trained sequentially on new distributions without access to prior data and without catastrophic forgetting?
- RQ2: Does applying an EWC-style penalty to the generator preserve earlier learned distributions while learning new ones?
- RQ3: Is the approach effective in class-conditional GANs across different datasets (MNIST and SVHN)?
Key findings
- The augmented objective with Fisher-based penalties mitigates forgetting compared to standard GAN training.
- In MNIST with an MLP GAN, the approach prevents forgetting when learning new digits sequentially.
- In SVHN with a DCGAN, the approach preserves previously learned digits while adding new ones.
- The method is robust across a range of values of the penalty weight λ, maintaining visual fidelity and diversity.
- Results indicate that using a conditional framework helps maintain a stable mapping from (z,y) to data as tasks change over time.
This review was created by AI and reviewed by human editors.