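[Paper Explained] The Numerics of GANs
This paper analyzes why GAN training often fails to converge, tracing the problem to the eigenvalues of the Jacobian of the gradient vector field, and introduces Consensus Optimization to stabilize training across different architectures and divergence measures.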
In this paper, we analyze the numerics of common algorithms for training Generative Adversarial Networks (GANs). Using the formalism of smooth two-player games we analyze the associated gradient vector field of GAN training objectives. Our findings suggest that the convergence of current algorithms suffers due to two factors: i) presence of eigenvalues of the Jacobian of the gradient vector field with zero real-part, and ii) eigenvalues with big imaginary part. Using these findings, we design a new algorithm that overcomes some of these limitations and has better convergence properties. Experimentally, we demonstrate its superiority on training common GAN architectures and show convergence on GAN architectures that are known to be notoriously hard to train.
Motivation and Objectives
- Identify why simultaneous gradient ascent struggles to find local Nash equilibria in GANs.
- Model GAN training as a smooth two-player game and study the associated gradient vector field.
- Propose a robust optimization method to improve convergence and stability.
- Empirically validate the method on common GAN architectures and divergences.
Proposed Method
- Frame GAN training as a smooth two-player game with a gradient vector field v(x).
- Analyze the Jacobian of v(x) to identify causes of non-convergence, including zero-real-part eigenvalues and large imaginary parts.
- Introduce a modified vector field w(x)=v(x) - γ ∇L(x) with L(x)=½||v(x)||² to obtain better convergence properties.
- Derive Consensus Optimization (Algorithm 2), in which both players' objectives are modified to f̃ = f − γL and g̃ = g − γL, with L acting as a regularizer.
- Provide convergence results showing local convergence to a local Nash equilibrium under certain conditions (negative semi-definite v′(x) and an appropriate regularization strength γ and step size h).
- Demonstrate that the eigenvalue spectrum is shifted left, improving stability, and relate this to second-order/implicit-Euler interpretations.
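The modified vector field above can be tried out on a toy problem. The following is a minimal sketch, not the paper's Algorithm 2: it assumes a hypothetical bilinear zero-sum game f(x, y) = x·y with g = −f, whose gradient vector field is v(z) = A z for a fixed matrix `A`. The names `A`, `run`, and the chosen values of `h` and `gamma` are illustrative assumptions. With `gamma=0` the update reduces to plain simultaneous gradient ascent.

```python
import numpy as np

# Assumed toy game: f(x, y) = x * y, g = -f (zero-sum).
# Gradient vector field v(z) = (df/dx, dg/dy) = (y, -x) = A @ z.
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

def v(z):
    """Gradient vector field of the toy game."""
    return A @ z

def grad_L(z):
    """Gradient of L(z) = 0.5 * ||v(z)||^2, which is v'(z)^T v(z) = A^T A z here."""
    return A.T @ v(z)

def run(z0, h, gamma, steps):
    """Iterate z <- z + h * (v(z) - gamma * grad_L(z)); gamma=0 is simultaneous gradient ascent."""
    z = np.array(z0, dtype=float)
    for _ in range(steps):
        z = z + h * (v(z) - gamma * grad_L(z))
    return z

start = [1.0, 1.0]
sim = run(start, h=0.1, gamma=0.0, steps=200)  # no regularizer
con = run(start, h=0.1, gamma=1.0, steps=200)  # with the gamma * grad_L term

print("distance from equilibrium, gamma=0:", np.linalg.norm(sim))
print("distance from equilibrium, gamma=1:", np.linalg.norm(con))
```

In this toy setting the only equilibrium is the origin. The unregularized iterates spiral outward because the Jacobian has purely imaginary eigenvalues, while the regularized iterates contract toward the origin. A real GAN would need automatic differentiation to obtain ∇L, which this sketch sidesteps by using a linear field.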
Experimental Results
Research Questions
- RQ1: What specifically causes simultaneous gradient ascent to fail to converge to local Nash equilibria in GANs?
- RQ2: How can the gradient dynamics be adapted to ensure robust convergence in two-player GAN games?
- RQ3: Does the proposed consensus optimization approach improve stability and convergence across different GAN architectures and divergence measures?
Key Findings
- Jacobian eigenvalues with zero real part, or with large imaginary part, hinder the convergence of simultaneous gradient ascent (SimGA) in GANs.
- Consensus Optimization moves eigenvalues to the left in the complex plane, improving stability and allowing reasonable step sizes.
- The method yields stable training on CIFAR-10 and CelebA with architectures known to be hard to train by standard methods.
- Training with Consensus Optimization gives more stable generator and discriminator losses and competitive Inception scores compared with alternating gradient ascent.
- Consensus optimization is compatible with various GAN architectures and divergence measures, functioning as a numerically robust alternative to standard methods.
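The eigenvalue shift mentioned in the findings can be checked numerically on a small example. This is a sketch under stated assumptions: the Jacobian `A` below is a made-up 2×2 matrix with purely imaginary eigenvalues, `gamma` is an arbitrary choice, and `max_stable_step` is a hypothetical helper. The helper uses the elementary fact that |1 + hλ| < 1 holds exactly when Re(λ) < 0 and h < −2·Re(λ)/|λ|².

```python
import numpy as np

def max_stable_step(eigs, tol=1e-12):
    """Upper bound on h such that |1 + h * lam| < 1 for every eigenvalue lam.

    Returns 0.0 if any eigenvalue has a (numerically) non-negative real part,
    since no positive step size can be stable in that case.
    """
    eigs = np.asarray(eigs, dtype=complex)
    if np.any(eigs.real >= -tol):
        return 0.0
    return float(np.min(-2.0 * eigs.real / np.abs(eigs) ** 2))

# Assumed Jacobian of v at the equilibrium: eigenvalues +3i and -3i.
A = np.array([[0.0, 3.0],
              [-3.0, 0.0]])
gamma = 0.5

# Jacobian of w = v - gamma * grad(L) for a linear field v(z) = A z.
B = A - gamma * A.T @ A

eig_v = np.linalg.eigvals(A)
eig_w = np.linalg.eigvals(B)

print("eigenvalues of v':", eig_v, "max stable h:", max_stable_step(eig_v))
print("eigenvalues of w':", eig_w, "max stable h:", max_stable_step(eig_w))
```

For this matrix the regularizer moves the eigenvalues from the imaginary axis to real part −4.5, so a positive step size becomes admissible where none was before.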
This explanation was generated by AI and reviewed by a human editor.