QUICK REVIEW

[Paper Review] GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium

Martin Heusel, Hubert Ramsauer|arXiv (Cornell University)|Jun 26, 2017

Generative Adversarial Networks and Image Synthesis | Cited by 4,490

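One-Sentence Summary

Introduces a two time-scale update rule (TTUR) for GANs that gives the discriminator and the generator separate learning rates, proves convergence to a local Nash equilibrium, analyzes Adam as a heavy ball with friction, and proposes the Fréchet Inception Distance (FID) for evaluation; TTUR improves GAN performance on several datasets.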

ABSTRACT

Generative Adversarial Networks (GANs) excel at creating realistic images with complex models for which maximum likelihood is infeasible. However, the convergence of GAN training has still not been proved. We propose a two time-scale update rule (TTUR) for training GANs with stochastic gradient descent on arbitrary GAN loss functions. TTUR has an individual learning rate for both the discriminator and the generator. Using the theory of stochastic approximation, we prove that the TTUR converges under mild assumptions to a stationary local Nash equilibrium. The convergence carries over to the popular Adam optimization, for which we prove that it follows the dynamics of a heavy ball with friction and thus prefers flat minima in the objective landscape. For the evaluation of the performance of GANs at image generation, we introduce the "Fréchet Inception Distance" (FID) which captures the similarity of generated images to real ones better than the Inception Score. In experiments, TTUR improves learning for DCGANs and Improved Wasserstein GANs (WGAN-GP) outperforming conventional GAN training on CelebA, CIFAR-10, SVHN, LSUN Bedrooms, and the One Billion Word Benchmark.

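Motivation and Objectives

Address the lack of convergence guarantees in GAN training.
Propose a two time-scale update rule (TTUR) that uses separate learning rates for the discriminator and the generator.
Provide a theoretical convergence result to a stationary local Nash equilibrium under TTUR, and analyze the dynamics of Adam as a heavy ball with friction.
Introduce the Fréchet Inception Distance (FID) as a robust metric for evaluating GANs.
Validate TTUR empirically across several GAN variants and datasets.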

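Proposed Method

Define TTUR with a discriminator learning rate b(n) and a generator learning rate a(n), and derive the stochastic approximation updates.
Under mild assumptions, use ordinary differential equation (ODE) and stochastic approximation theory to prove that TTUR converges to a stationary local Nash equilibrium.
Characterize Adam as a heavy ball with friction (HBF) and relate this to TTUR convergence.
Show that Adam with TTUR still converges by relating its dynamics to the HBF ODEs.
Introduce the Fréchet Inception Distance (FID), which measures how close generated data is to real data by comparing the mean and covariance of the two in the Inception coding space.
Run experiments with DCGAN and WGAN-GP on image data and on the One Billion Word Benchmark language data, comparing TTUR with standard single time-scale training.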
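
The two time-scale idea can be illustrated on a toy problem. The sketch below is not the paper's experimental setup: the value function V(theta, w), the step-size exponents, the noise level, and the function names `step_sizes` and `ttur_toy` are all assumptions chosen for illustration. It only shows the shape of the update, in which the discriminator parameter w moves with the larger step b(n) and the generator parameter theta with the smaller step a(n), both computed from noisy gradients.

```python
import random


def step_sizes(n, a0=0.5, b0=0.5):
    """Hypothetical schedules; a(n) shrinks faster than b(n), so a(n)/b(n) -> 0."""
    a_n = a0 / (n + 1) ** 0.9  # slow time scale: generator
    b_n = b0 / (n + 1) ** 0.6  # fast time scale: discriminator
    return a_n, b_n


def ttur_toy(num_steps=20000, noise_std=0.1, seed=0):
    """Run two time-scale stochastic updates on an assumed toy game.

    Toy value function (an assumption, not from the paper):
        V(theta, w) = 0.5 * theta**2 + theta * w - 0.5 * w**2
    The generator minimizes V over theta, the discriminator maximizes V over w.
    The point (0, 0) is a Nash equilibrium of this game.
    """
    rng = random.Random(seed)
    theta, w = 1.0, 1.0
    for n in range(num_steps):
        a_n, b_n = step_sizes(n)
        # Noisy gradients, both evaluated at the current (theta, w).
        grad_theta = theta + w + rng.gauss(0.0, noise_std)  # dV/dtheta
        grad_w = theta - w + rng.gauss(0.0, noise_std)      # dV/dw
        # Simultaneous update: generator descends, discriminator ascends.
        theta, w = theta - a_n * grad_theta, w + b_n * grad_w
    return theta, w


theta, w = ttur_toy()
print(f"theta={theta:.4f}, w={w:.4f}")
```

In a practical GAN the same pattern usually appears as two optimizers with different learning rates, one for each network; the decaying schedules here are only one choice that keeps the generator on the slower time scale.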

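Experimental Results

Research Questions

RQ1: Under stochastic gradient updates, do GANs trained with TTUR converge to a stationary local Nash equilibrium?
RQ2: How do separate learning rates for the discriminator and the generator affect convergence and performance relative to single time-scale training?
RQ3: Is Adam compatible with TTUR in terms of convergence, and how do its dynamics affect the quality of the minima it finds?
RQ4: Across different disturbances and datasets, does FID give a more reliable assessment of GAN quality than the Inception Score?
RQ5: Do GANs trained with TTUR outperform conventional training on image and language benchmarks?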

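Key Findings

Under mild assumptions, TTUR converges to a stationary local Nash equilibrium.
Adam can be interpreted as a heavy ball with friction, and its dynamics still converge under TTUR.
Combining TTUR with Adam yields convergence and a preference for flat minima.
TTUR consistently outperforms standard single time-scale GAN training on CelebA, CIFAR-10, SVHN, LSUN Bedrooms, and the One Billion Word Benchmark.
FID correlates better with data disturbances and human judgment than the Inception Score, making it a more robust GAN evaluation metric.
TTUR reduces training oscillations and variance, making learning more stable than the original method.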
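
To make the FID finding concrete, here is a minimal sketch of the Fréchet distance between two Gaussians fitted to feature vectors, d^2 = ||m1 - m2||^2 + Tr(C1 + C2 - 2 (C1 C2)^(1/2)). The function names are hypothetical, and the random arrays stand in for Inception activations, which a real FID computation would extract from real and generated images. The trace of the matrix square root is computed from the eigenvalues of C1 C2, which is an implementation choice made here, not necessarily what any reference implementation does.

```python
import numpy as np


def frechet_distance(mu1, cov1, mu2, cov2):
    """Squared Frechet distance between N(mu1, cov1) and N(mu2, cov2)."""
    diff = mu1 - mu2
    # For positive semi-definite cov1 and cov2, the eigenvalues of cov1 @ cov2
    # are real and non-negative, and Tr((cov1 cov2)^(1/2)) is the sum of their
    # square roots. Clip tiny negative values caused by round-off.
    eigvals = np.linalg.eigvals(cov1 @ cov2)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(cov1) + np.trace(cov2) - 2.0 * trace_sqrt)


def fid_from_features(feats_a, feats_b):
    """Fit a Gaussian to each feature set (rows = samples) and compare them."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    return frechet_distance(mu_a, cov_a, mu_b, cov_b)


# Placeholder features (assumption): random vectors instead of Inception codes.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(2000, 8))
similar = rng.normal(0.0, 1.0, size=(2000, 8))
shifted = rng.normal(0.5, 1.0, size=(2000, 8))

print("similar:", fid_from_features(real, similar))
print("shifted:", fid_from_features(real, shifted))
```

A lower value means the two feature distributions are closer, so the shifted set should score worse than the set drawn from the same distribution as the reference.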


This review was generated by AI and reviewed by human editors.