QUICK REVIEW

[Paper Review] Negative Momentum for Improved Game Dynamics

Gauthier Gidel, Reyhane Askari Hemmat|arXiv (Cornell University)|Jul 12, 2018

Generative Adversarial Networks and Image Synthesis|39 references|23 citations

TL;DR

This paper proposes using negative momentum with alternating gradient descent to stabilize the training of differentiable games such as GANs. With alternating updates, negative momentum yields linear convergence on bilinear games, and on saturating GANs it trains more reliably than positive or zero momentum.

ABSTRACT

Games generalize the single-objective optimization paradigm by introducing different objective functions for different players. Differentiable games often proceed by simultaneous or alternating gradient updates. In machine learning, games are gaining new importance through formulations like generative adversarial networks (GANs) and actor-critic systems. However, compared to single-objective optimization, game dynamics are more complex and less understood. In this paper, we analyze gradient-based methods with momentum on simple games. We prove that alternating updates are more stable than simultaneous updates. Next, we show both theoretically and empirically that alternating gradient updates with a negative momentum term achieve convergence not only in a difficult toy adversarial problem but also on the notoriously difficult-to-train saturating GANs.

Motivation & Objective

To address instability and non-convergence in training differentiable games like GANs using gradient-based methods.
To investigate how momentum and update order (simultaneous vs. alternating) affect convergence in adversarial games.
To demonstrate that negative momentum in alternating updates stabilizes dynamics and enables convergence in difficult settings.
To provide theoretical guarantees and empirical validation for the proposed method on both toy and real-world GAN benchmarks.

Proposed method

Uses alternating gradient updates with a negative momentum term to counteract oscillatory behavior in adversarial games.
Analyzes the dynamics of the method using linear stability analysis on a bilinear game formulation: min_θ max_φ θᵀAφ.
Derives convergence conditions by studying the eigenvalues of the system's Jacobian matrix under different momentum values.
Applies a state-augmented formulation to model momentum dynamics and proves diagonalizability of the resulting linear operator.
Employs spectral radius analysis to bound the convergence rate and show exponential decay under negative momentum.
Validates results on both synthetic bilinear games and real-world GANs with saturating loss functions.
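
The state-augmented spectral analysis listed above can be sketched numerically. This is a minimal illustration under stated assumptions, not the authors' code: it assumes heavy-ball momentum with step size `eta` and momentum `beta` on the bilinear game θᵀAφ, stacks the state as (θ_t, φ_t, θ_{t-1}, φ_{t-1}), and computes the spectral radius of the resulting linear operator. The function names `update_operator` and `spectral_radius`, the example matrix `A`, and the chosen values of `eta` and `beta` are all hypothetical.

```python
import numpy as np


def update_operator(A, eta, beta, alternating=True):
    """Matrix M such that s_{t+1} = M @ s_t on the bilinear game theta^T A phi,
    with augmented state s_t = (theta_t, phi_t, theta_{t-1}, phi_{t-1}).

    Assumed update rule (heavy-ball momentum):
      theta_{t+1} = theta_t - eta * A   @ phi_t      + beta * (theta_t - theta_{t-1})
      phi_{t+1}   = phi_t   + eta * A.T @ theta_used + beta * (phi_t   - phi_{t-1})
    where theta_used is theta_{t+1} (alternating) or theta_t (simultaneous).
    """
    d, p = A.shape
    I_d, I_p = np.eye(d), np.eye(p)
    Z_dd, Z_dp = np.zeros((d, d)), np.zeros((d, p))
    Z_pd, Z_pp = np.zeros((p, d)), np.zeros((p, p))

    theta_row = [(1 + beta) * I_d, -eta * A, -beta * I_d, Z_dp]
    if alternating:
        # Substitute theta_{t+1} from theta_row into the phi update.
        phi_row = [
            eta * (1 + beta) * A.T,
            (1 + beta) * I_p - eta**2 * (A.T @ A),
            -eta * beta * A.T,
            -beta * I_p,
        ]
    else:
        phi_row = [eta * A.T, (1 + beta) * I_p, Z_pd, -beta * I_p]

    # The last two block rows copy (theta_t, phi_t) into the "previous" slots.
    return np.block([
        theta_row,
        phi_row,
        [I_d, Z_dp, Z_dd, Z_dp],
        [Z_pd, I_p, Z_pd, Z_pp],
    ])


def spectral_radius(M):
    """Largest eigenvalue modulus; a value below 1 means the iterates contract."""
    return np.max(np.abs(np.linalg.eigvals(M)))


# Illustrative values only (not taken from the paper's experiments).
A = np.array([[1.0, 0.5], [0.0, 1.0]])
eta = 0.5 / np.linalg.norm(A, 2)  # half the inverse of the largest singular value

for alternating in (True, False):
    for beta in (-0.5, 0.0, 0.5):
        rho = spectral_radius(update_operator(A, eta, beta, alternating))
        mode = "alternating" if alternating else "simultaneous"
        print(f"{mode:12s} beta={beta:+.1f} spectral radius={rho:.4f}")
```

Under these assumptions, only the alternating scheme with negative `beta` should give a spectral radius strictly below 1, which is the behavior the review attributes to the paper's analysis.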

Experimental results

Research questions

RQ1: Can negative momentum in alternating gradient updates stabilize and accelerate convergence in differentiable games?
RQ2: Why do positive or zero momentum values fail to converge in bilinear games, while negative momentum succeeds?
RQ3: How does the choice between simultaneous and alternating updates affect convergence when momentum is applied?
RQ4: By what theoretical mechanism does negative momentum improve local convergence in games whose Jacobian eigenvalues have large imaginary parts?
RQ5: Does negative momentum improve training stability and convergence in practical GANs, especially with saturating loss functions?

Key findings

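Alternating updates with negative momentum converge linearly on bilinear games, whereas alternating updates with positive or zero momentum do not converge.
With alternating updates and negative momentum, the error after t iterations is bounded by O(Δ₀(1 - η²σ²_min(A)/16)^t), where Δ₀ denotes the initial error, η the step size, and σ_min(A) the smallest singular value of A; that is, the error decays geometrically toward the optimum.
With simultaneous updates, the iterates fail to converge even with negative momentum, because the spectral radius of the update operator remains greater than 1.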
Theoretical analysis shows that negative momentum improves stability when Jacobian eigenvalues have large imaginary parts, reducing oscillatory divergence.
Empirical results confirm that negative momentum enables convergence on notoriously difficult saturating GANs, both in toy settings and on real datasets.
The method outperforms standard approaches with positive or zero momentum, particularly in settings where training fails to converge otherwise.
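
The bilinear-game findings above can be checked with a short simulation. This is a sketch under assumptions rather than a reproduction of the paper's experiments: it assumes heavy-ball momentum on θᵀAφ, starts from a random point with zero initial velocity, and tracks the distance to the optimum (the origin, since the example `A` is invertible). The function name `simulate`, the matrix `A`, the step size, the seed, and the iteration count are hypothetical choices.

```python
import numpy as np


def simulate(A, eta, beta, steps, alternating=True, seed=0):
    """Run gradient descent-ascent with momentum on min_theta max_phi theta^T A phi.

    Returns the distance of (theta, phi) to the origin at every iteration.
    """
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(A.shape[0])
    phi = rng.standard_normal(A.shape[1])
    theta_prev, phi_prev = theta.copy(), phi.copy()  # zero initial velocity

    dist = [np.sqrt(theta @ theta + phi @ phi)]
    for _ in range(steps):
        # Descent step for theta: the gradient of theta^T A phi w.r.t. theta is A phi.
        theta_next = theta - eta * (A @ phi) + beta * (theta - theta_prev)
        # Alternating updates let phi see the fresh theta; simultaneous ones do not.
        theta_seen = theta_next if alternating else theta
        # Ascent step for phi: the gradient w.r.t. phi is A^T theta.
        phi_next = phi + eta * (A.T @ theta_seen) + beta * (phi - phi_prev)

        theta_prev, phi_prev = theta, phi
        theta, phi = theta_next, phi_next
        dist.append(np.sqrt(theta @ theta + phi @ phi))
    return np.array(dist)


# Illustrative values only (not taken from the paper's experiments).
A = np.array([[1.0, 0.5], [0.0, 1.0]])
eta = 0.5 / np.linalg.norm(A, 2)

runs = {
    "alternating, beta=-0.5": simulate(A, eta, -0.5, 2000, alternating=True),
    "alternating, beta= 0.0": simulate(A, eta, 0.0, 2000, alternating=True),
    "simultaneous, beta=-0.5": simulate(A, eta, -0.5, 2000, alternating=False),
}
for name, dist in runs.items():
    print(f"{name}: start={dist[0]:.3e} end={dist[-1]:.3e}")
```

In this setup the alternating run with negative momentum should shrink toward the origin, the alternating run with zero momentum should keep circling at roughly its initial distance, and the simultaneous run should grow, which matches the qualitative pattern in the findings.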

This review was created by AI and reviewed by human editors.