[Paper Review] Least Squares Generative Adversarial Networks
LSGANs replace the sigmoid cross-entropy loss of the GAN discriminator with a least-squares loss, yielding better image quality and more stable training. Under certain parameter choices, minimizing the LSGAN objective amounts to minimizing the Pearson chi-squared divergence.
Unsupervised learning with generative adversarial networks (GANs) has proven hugely successful. Regular GANs hypothesize the discriminator as a classifier with the sigmoid cross entropy loss function. However, we found that this loss function may lead to the vanishing gradients problem during the learning process. To overcome such a problem, we propose in this paper the Least Squares Generative Adversarial Networks (LSGANs) which adopt the least squares loss function for the discriminator. We show that minimizing the objective function of LSGAN yields minimizing the Pearson $\chi^2$ divergence. There are two benefits of LSGANs over regular GANs. First, LSGANs are able to generate higher quality images than regular GANs. Second, LSGANs perform more stably during the learning process. We evaluate LSGANs on five scene datasets and the experimental results show that the images generated by LSGANs are of better quality than the ones generated by regular GANs. We also conduct two comparison experiments between LSGANs and regular GANs to illustrate the stability of LSGANs.
Motivation & Objective
- Address vanishing gradient issues in regular GANs during generator updates.
- Propose a least squares loss formulation for the discriminator in GANs.
- Show that the LSGAN objective corresponds to minimizing the Pearson chi-squared divergence under certain parameter choices.
- Present two network architectures for image generation and multi-class tasks (e.g., Chinese characters).
- Demonstrate improved sample quality and stability over regular GANs across datasets.
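The vanishing-gradient point above can be illustrated with a small numeric sketch. It assumes the discriminator's raw output is a single scalar t, that the regular GAN generator uses the non-saturating loss -log(sigmoid(t)), and that the least-squares target is c = 1. The function names are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(t):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-t))


def grad_sigmoid_ce(t):
    """Derivative of -log(sigmoid(t)) with respect to t (assumed generator loss of a regular GAN)."""
    return sigmoid(t) - 1.0


def grad_least_squares(t, c=1.0):
    """Derivative of 0.5 * (t - c)**2 with respect to t (least-squares generator loss)."""
    return t - c


# A fake sample that the discriminator already scores as confidently "real"
# (large t) receives almost no gradient from the cross-entropy loss, while the
# least-squares loss still pulls its score back toward the target c.
for t in (0.0, 2.0, 5.0, 10.0):
    print(t, grad_sigmoid_ce(t), grad_least_squares(t))
```

The cross-entropy gradient shrinks toward zero as t grows, whereas the least-squares gradient grows linearly with the distance from c. This is one way to read the motivation for the change of loss.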
Proposed method
- Adopt a least squares loss for the discriminator with an a-b coding scheme, where a is the label for fake data and b is the label for real data: D training minimizes 1/2 E[(D(x)-b)^2] + 1/2 E[(D(G(z))-a)^2].
- Generator training minimizes 1/2 E[(D(G(z))-c)^2], where c is the value that G wants D to assign to generated samples.
- Show that with b-c=1 and b-a=2, the objective corresponds to Pearson chi-squared divergence between p_data+p_g and 2p_g.
- Provide two model architectures: one for 112x112 image generation and another conditional LSGAN for many classes (e.g., Chinese characters).
- Explain parameter selection options (e.g., a=-1, b=1, c=0, which satisfies the Pearson chi-squared conditions, or the 0-1 binary coding a=0, b=1, c=1) and their practical implications.
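The two objectives listed above can be written as a minimal sketch in plain Python. It assumes the discriminator outputs have already been collected into lists of floats, and it defaults to the 0-1 coding (a=0, b=1, c=1). The helper names are hypothetical, and this is not the authors' implementation.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def discriminator_loss(d_real, d_fake, a=0.0, b=1.0):
    """1/2 E[(D(x) - b)^2] + 1/2 E[(D(G(z)) - a)^2], estimated on a batch."""
    real_term = mean([(d - b) ** 2 for d in d_real])
    fake_term = mean([(d - a) ** 2 for d in d_fake])
    return 0.5 * real_term + 0.5 * fake_term


def generator_loss(d_fake, c=1.0):
    """1/2 E[(D(G(z)) - c)^2], estimated on a batch of generated samples."""
    return 0.5 * mean([(d - c) ** 2 for d in d_fake])


# Example batch of discriminator outputs (made-up numbers for illustration).
d_real = [0.9, 1.1, 0.8]
d_fake = [0.2, -0.1, 0.4]
print(discriminator_loss(d_real, d_fake))
print(generator_loss(d_fake))
```

In a real training loop these quantities would be computed on tensors inside an automatic-differentiation framework. The plain-list version is only meant to make the roles of a, b, and c explicit.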
Experimental results
Research questions
- RQ1: Does adopting a least squares loss for the GAN discriminator improve sample quality compared to GANs with sigmoid cross-entropy?
- RQ2: Does LSGAN training exhibit greater stability and reduced mode collapse across datasets?
- RQ3: How is the LSGAN objective related to f-divergences, specifically Pearson chi-squared divergence?
- RQ4: Can LSGANs be extended to conditional settings for multi-class or large-label problems?
Key findings
- LSGANs generate higher quality images than regular GANs on multiple LSUN scene datasets.
- LSGANs demonstrate greater training stability and reduced susceptibility to vanishing gradients.
- LSGANs can converge without batch normalization under certain settings and with appropriate optimizers.
- In Gaussian mixture experiments, regular GANs exhibit mode collapse while LSGANs learn the full distribution.
- A conditional LSGAN can generate readable Chinese characters across 3740 classes.
- Theoretical analysis links the LSGAN objective to minimizing the Pearson chi-squared divergence when b-c=1 and b-a=2.
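The Pearson chi-squared link can be checked numerically on a toy discrete distribution. The sketch below assumes the optimal discriminator D*(x) = (b p_data(x) + a p_g(x)) / (p_data(x) + p_g(x)) and a generator cost that also includes the real-data term 1/2 E[(D(x)-c)^2], which does not depend on G. The probability values and function names are made up for illustration.

```python
def optimal_discriminator(p_data, p_g, a, b):
    """Pointwise minimizer of the least-squares discriminator objective."""
    return [(b * pd + a * pg) / (pd + pg) for pd, pg in zip(p_data, p_g)]


def twice_generator_cost(p_data, p_g, a, b, c):
    """2 C(G) = sum over x of (p_data + p_g) * (D*(x) - c)^2."""
    d_star = optimal_discriminator(p_data, p_g, a, b)
    return sum((pd + pg) * (d - c) ** 2 for pd, pg, d in zip(p_data, p_g, d_star))


def pearson_chi2(p, q):
    """Pearson chi-squared of q against p: sum over x of (q - p)^2 / p."""
    return sum((qi - pi) ** 2 / pi for pi, qi in zip(p, q))


# Toy distributions over three outcomes (illustrative values only).
p_data = [0.5, 0.3, 0.2]
p_g = [0.2, 0.2, 0.6]

# a=-1, b=1, c=0 satisfies b - c = 1 and b - a = 2.
lhs = twice_generator_cost(p_data, p_g, a=-1.0, b=1.0, c=0.0)
rhs = pearson_chi2(
    [pd + pg for pd, pg in zip(p_data, p_g)],  # p_data + p_g
    [2.0 * pg for pg in p_g],                  # 2 p_g
)
print(lhs, rhs)
```

The two printed values agree up to floating-point error, consistent with the stated conditions b-c=1 and b-a=2. With the 0-1 coding (a=0, b=1, c=1) those conditions do not hold, so the same identity is not expected.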
This review was created by AI and reviewed by human editors.