QUICK REVIEW

[Paper Review] On catastrophic forgetting in Generative Adversarial Networks

Hoang Thanh-Tung, Truyen Tran|arXiv (Cornell University)|Jul 11, 2018

Topic: Generative Adversarial Networks and Image Synthesis|References: 14|Cited by: 5

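One-Sentence Summary

This paper shows that Generative Adversarial Networks (GANs) suffer from catastrophic forgetting during training: because the model distribution keeps changing, the discriminator gradually forgets what it learned about separating real data from earlier model distributions. The authors relate this forgetting to mode collapse and non-convergence, argue that real datapoints become wide local maxima of the discriminator's output only when forgetting is kept to a minimum, and study methods for mitigating the problem.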

ABSTRACT

In this paper, we show that Generative Adversarial Networks (GANs) suffer from catastrophic forgetting even when they are trained to approximate a single target distribution. We show that GAN training is a continual learning problem in which the sequence of changing model distributions is the sequence of tasks to the discriminator. The level of mismatch between tasks in the sequence determines the level of forgetting. Catastrophic forgetting is interrelated to mode collapse and can make the training of GANs non-convergent. We investigate the landscape of the discriminator's output in different variants of GANs and find that when a GAN converges to a good equilibrium, real training datapoints are wide local maxima of the discriminator. We empirically show the relationship between the sharpness of local maxima and mode collapse and generalization in GANs. We show how catastrophic forgetting prevents the discriminator from making real datapoints local maxima, and thus causes non-convergence. Finally, we study methods for preventing catastrophic forgetting in GANs.

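Research Motivation and Objectives

Investigate whether GANs exhibit catastrophic forgetting even when trained on a single target distribution.
Analyze how the sequence of changing model distributions during GAN training poses a continual learning problem for the discriminator.
Understand the relationship between catastrophic forgetting, mode collapse, and non-convergence of GAN training.
Study the landscape of the discriminator's output and identify the conditions under which real datapoints become local maxima.
Study and evaluate methods for preventing catastrophic forgetting in GANs.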

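Proposed Method

Model GAN training as a continual learning process in which each successive generator (model) distribution is treated as a new task for the discriminator.
Analyze the landscape of the discriminator's output across GAN variants to identify local maxima located at real datapoints.
Measure the sharpness of the local maxima around real datapoints and relate it to mode collapse and generalization.
Empirically link the absence of wide local maxima in the discriminator to catastrophic forgetting and non-convergence.
Study and evaluate regularization and training strategies that stabilize the discriminator's learning and reduce forgetting.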
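
The continual-learning view can be illustrated with a toy sketch. This is not the authors' experiment: it assumes a one-dimensional logistic discriminator D(x) = sigmoid(w*x + b), Gaussian samples, and hypothetical names (`train`, `accuracy`, `fake_t1`, `fake_t2`). The real distribution stays fixed while the fake distribution moves from one side of it to the other, so each position of the fake distribution acts as a separate task for the discriminator.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train(w, b, real, fake, lr=0.5, steps=500):
    """Gradient descent on binary cross-entropy for D(x) = sigmoid(w*x + b)."""
    x = np.concatenate([real, fake])
    y = np.concatenate([np.ones_like(real), np.zeros_like(fake)])
    for _ in range(steps):
        err = sigmoid(w * x + b) - y   # gradient of the loss w.r.t. each logit
        w -= lr * np.mean(err * x)
        b -= lr * np.mean(err)
    return w, b

def accuracy(w, b, real, fake):
    """Fraction of samples that fall on the correct side of D(x) = 0.5."""
    hits_real = sigmoid(w * real + b) > 0.5
    hits_fake = sigmoid(w * fake + b) <= 0.5
    return float(np.mean(np.concatenate([hits_real, hits_fake])))

rng = np.random.default_rng(0)
real = rng.normal(0.0, 0.1, 200)       # fixed target distribution (assumed)
fake_t1 = rng.normal(-2.0, 0.1, 200)   # model distribution early in training: task 1
fake_t2 = rng.normal(2.0, 0.1, 200)    # model distribution after it has moved: task 2

# Task 1: separate real data from the early fake samples.
w, b = train(0.0, 0.0, real, fake_t1)
acc_t1_before = accuracy(w, b, real, fake_t1)

# Task 2: keep training the same discriminator, now only on the new fake samples.
w, b = train(w, b, real, fake_t2)
acc_t1_after = accuracy(w, b, real, fake_t1)
acc_t2_after = accuracy(w, b, real, fake_t2)

print("task 1 accuracy before / after task 2:", acc_t1_before, acc_t1_after)
print("task 2 accuracy:", acc_t2_after)
```

In this sketch the two tasks are maximally mismatched for a linear discriminator: fitting the second task flips the decision boundary, so the earlier fake samples end up on the "real" side and accuracy on the first task drops sharply. A less mismatched pair of tasks (for example, a fake distribution that moves only slightly) would cause correspondingly less forgetting.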

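Experimental Results

Research Questions

RQ1: Does catastrophic forgetting occur in GANs even when they are trained on a single target distribution, and to what extent?
RQ2: To what extent does the sequence of changing generator distributions during training affect the discriminator's ability to retain its knowledge of the real data?
RQ3: How does the sharpness of local maxima in the discriminator's output relate to mode collapse and generalization?
RQ4: Why does catastrophic forgetting make GAN training non-convergent?
RQ5: Can methods be designed that prevent catastrophic forgetting in GANs and improve training stability?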

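Key Findings

GAN training inherently involves catastrophic forgetting: as the generator's output distribution keeps changing, the discriminator gradually forgets what it learned on earlier model distributions.
Real training datapoints become wide local maxima of the discriminator's output only when catastrophic forgetting is kept to a minimum.
The sharpness of the local maxima around real datapoints is empirically related to mode collapse and poor generalization in GANs.
Catastrophic forgetting prevents the discriminator from making real datapoints wide local maxima, which causes non-convergent training dynamics.
Mitigating forgetting stabilizes the discriminator's output landscape and leads to more stable, convergent GAN training.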
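
As a rough illustration of what "sharp" versus "wide" local maxima could mean in practice, the sketch below probes a scalar function along one direction through a datapoint. It is an assumption-laden example rather than the paper's measurement protocol: `make_bump` is a hypothetical Gaussian stand-in for a trained discriminator, `sharpness` is a finite-difference estimate of negative curvature, and `peak_width` is the length of the interval on which the output stays above a chosen fraction of its value at the datapoint.

```python
import numpy as np

def make_bump(centre, scale):
    """Hypothetical stand-in for a discriminator output: a Gaussian bump at `centre`."""
    return lambda p: float(np.exp(-np.sum((p - centre) ** 2) / (2 * scale ** 2)))

def sharpness(d_fn, x, direction, h=1e-3):
    """Negative second directional derivative of d_fn at x (central differences)."""
    u = direction / np.linalg.norm(direction)
    return -(d_fn(x + h * u) - 2.0 * d_fn(x) + d_fn(x - h * u)) / h ** 2

def peak_width(d_fn, x, direction, threshold=0.9, max_step=3.0, n=601):
    """Length of the interval around x, along `direction`, where d_fn >= threshold * d_fn(x)."""
    u = direction / np.linalg.norm(direction)
    steps = np.linspace(-max_step, max_step, n)
    ok = np.array([d_fn(x + t * u) for t in steps]) >= threshold * d_fn(x)
    lo = hi = n // 2                       # index of the step closest to zero
    while lo > 0 and ok[lo - 1]:           # walk left while still above threshold
        lo -= 1
    while hi < n - 1 and ok[hi + 1]:       # walk right while still above threshold
        hi += 1
    return float(steps[hi] - steps[lo])

x = np.zeros(2)                  # a "real datapoint" (assumed)
u = np.array([1.0, 1.0])         # probe direction (assumed)
wide = make_bump(x, scale=1.0)   # wide local maximum
narrow = make_bump(x, scale=0.2) # sharp local maximum

print("sharpness wide / narrow:", sharpness(wide, x, u), sharpness(narrow, x, u))
print("width wide / narrow:", peak_width(wide, x, u), peak_width(narrow, x, u))
```

With a real model, `d_fn` would wrap the discriminator's forward pass, and the probe would typically be repeated over many datapoints and directions before drawing any conclusion.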

This review was generated by AI and reviewed by a human editor.