QUICK REVIEW

[Paper Review] Least Squares Generative Adversarial Networks

Xudong Mao, Qing Li | arXiv (Cornell University) | Nov 13, 2016
Generative Adversarial Networks and Image Synthesis | References: 32 | Cited by: 161
One-Sentence Summary

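LSGANs replace the discriminator's sigmoid cross-entropy loss with a least squares loss, which improves image quality and training stability; the resulting objective is related to the Pearson chi-squared divergence.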

ABSTRACT

Unsupervised learning with generative adversarial networks (GANs) has proven hugely successful. Regular GANs hypothesize the discriminator as a classifier with the sigmoid cross entropy loss function. However, we found that this loss function may lead to the vanishing gradients problem during the learning process. To overcome such a problem, we propose in this paper the Least Squares Generative Adversarial Networks (LSGANs) which adopt the least squares loss function for the discriminator. We show that minimizing the objective function of LSGAN yields minimizing the Pearson $χ^2$ divergence. There are two benefits of LSGANs over regular GANs. First, LSGANs are able to generate higher quality images than regular GANs. Second, LSGANs perform more stable during the learning process. We evaluate LSGANs on five scene datasets and the experimental results show that the images generated by LSGANs are of better quality than the ones generated by regular GANs. We also conduct two comparison experiments between LSGANs and regular GANs to illustrate the stability of LSGANs.

Motivation and Objectives

  • Address vanishing gradient issues in regular GANs during generator updates.
  • Propose a least squares loss formulation for the discriminator in GANs.
  • Show that the LSGAN objective corresponds to minimizing the Pearson chi-squared divergence under certain parameter choices.
  • Present two network architectures for image generation and multi-class tasks (e.g., Chinese characters).
  • Demonstrate improved sample quality and stability over regular GANs across datasets.
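The vanishing gradient concern in the first bullet can be illustrated with a small numerical sketch. It assumes the commonly used non-saturating generator loss -log(sigmoid(s)) for the regular GAN and a least squares loss with target c = 1 applied directly to the discriminator output s; the function names and sample values are illustrative and do not come from the paper.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def grad_nonsaturating_ce(s):
    """Derivative w.r.t. s of -log(sigmoid(s)) (assumed regular-GAN generator loss)."""
    return -(1.0 - sigmoid(s))

def grad_least_squares(s, c=1.0):
    """Derivative w.r.t. s of 0.5 * (s - c)**2 (least squares generator loss)."""
    return s - c

# Discriminator outputs for fake samples that D already scores as "real".
# Larger s means the sample lies further on the real side of the boundary.
logits = np.array([0.0, 2.0, 5.0, 10.0])
ce_grads = grad_nonsaturating_ce(logits)
ls_grads = grad_least_squares(logits)

for s, g_ce, g_ls in zip(logits, ce_grads, ls_grads):
    print(f"s={s:5.1f}  cross-entropy grad={g_ce: .6f}  least-squares grad={g_ls: .1f}")
```

Under these assumptions the cross-entropy gradient shrinks toward zero as s grows, while the least squares gradient grows with the distance from the target value, so such samples still receive a learning signal.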

Proposed Method

  • Adopt a least squares loss for the discriminator with an a-b coding scheme, where a and b are the target labels for fake and real data: D training minimizes 1/2 E[(D(x)-b)^2] + 1/2 E[(D(G(z))-a)^2].
  • Generator training minimizes 1/2 E[(D(G(z))-c)^2], where c is the value the generator wants D to assign to fake data, pushing generated samples toward that target.
  • Show that with b-c=1 and b-a=2, the objective corresponds to Pearson chi-squared divergence between p_data+p_g and 2p_g.
  • Provide two model architectures: one for 112x112 image generation and another conditional LSGAN for many classes (e.g., Chinese characters).
  • Explain parameter selection options (e.g., a=-1, b=1, c=0, or 0-1 coding with a=0, b=c=1) and their practical implications.
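The two objectives above can be written as a minimal NumPy sketch. It operates on arrays of discriminator outputs rather than on real networks, uses sample means in place of expectations, and defaults to the 0-1 coding (a=0, b=1, c=1); the function names `d_loss` and `g_loss` and the example values are assumptions made for illustration.

```python
import numpy as np

def d_loss(d_real, d_fake, a=0.0, b=1.0):
    """Discriminator loss: 1/2 E[(D(x) - b)^2] + 1/2 E[(D(G(z)) - a)^2]."""
    return 0.5 * np.mean((d_real - b) ** 2) + 0.5 * np.mean((d_fake - a) ** 2)

def g_loss(d_fake, c=1.0):
    """Generator loss: 1/2 E[(D(G(z)) - c)^2]."""
    return 0.5 * np.mean((d_fake - c) ** 2)

# Hypothetical discriminator outputs on a batch of real and generated samples.
d_real = np.array([0.9, 1.1, 1.0])
d_fake = np.array([0.2, -0.1, 0.0])

loss_d = d_loss(d_real, d_fake)   # small: D is already close to its targets
loss_g = g_loss(d_fake)           # large: fakes are scored far from c = 1
print(loss_d, loss_g)
```

In a training loop these two quantities would be minimized alternately with respect to the discriminator and generator parameters; switching to a=-1, b=1, c=0 only changes the keyword arguments.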

Experimental Results

Research Questions

  • RQ1: Does adopting a least squares loss for the GAN discriminator improve sample quality compared to GANs with sigmoid cross-entropy?
  • RQ2: Does LSGAN training exhibit greater stability and reduced mode collapse across datasets?
  • RQ3: How is the LSGAN objective related to f-divergences, specifically Pearson chi-squared divergence?
  • RQ4: Can LSGANs be extended to conditional settings for multi-class or large-label problems?

Key Findings

  • LSGANs generate higher quality images than regular GANs on multiple LSUN scene datasets.
  • LSGANs demonstrate greater training stability and reduced susceptibility to vanishing gradients.
  • LSGANs can converge without batch normalization under certain settings and with appropriate optimizers.
  • In Gaussian mixture experiments, regular GANs exhibit mode collapse while LSGANs learn the full distribution.
  • A conditional LSGAN can generate readable Chinese characters across 3740 classes.
  • Theoretical analysis links the LSGAN objective to minimizing Pearson chi-squared divergence under specified parameter choices (b-c=1 and b-a=2).
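The link to the Pearson chi-squared divergence can be checked numerically on a toy discrete example. The sketch assumes the standard derivation: for a fixed generator the optimal discriminator is D*(x) = (b p_data(x) + a p_g(x)) / (p_data(x) + p_g(x)), and the generator objective is extended with a term over real data that does not depend on G, so that twice its value is the sum over x of (p_data + p_g)(D* - c)^2. The distributions and function names below are made up for illustration.

```python
import numpy as np

def optimal_discriminator(p_data, p_g, a, b):
    """Pointwise minimizer of the least squares discriminator loss."""
    return (b * p_data + a * p_g) / (p_data + p_g)

def generator_objective_at_optimum(p_data, p_g, a, b, c):
    """Twice the extended generator objective evaluated at D*."""
    d_star = optimal_discriminator(p_data, p_g, a, b)
    return np.sum((p_data + p_g) * (d_star - c) ** 2)

def pearson_chi2(p, q):
    """Pearson chi-squared form: sum over x of (q - p)^2 / p."""
    return np.sum((q - p) ** 2 / p)

# Toy distributions over three outcomes (illustrative values only).
p_data = np.array([0.5, 0.3, 0.2])
p_g = np.array([0.2, 0.2, 0.6])

# Parameter choice satisfying b - c = 1 and b - a = 2.
a, b, c = -1.0, 1.0, 0.0

lhs = generator_objective_at_optimum(p_data, p_g, a, b, c)
rhs = pearson_chi2(p_data + p_g, 2 * p_g)
print(lhs, rhs)
```

With these parameters the two quantities agree, and both become zero when p_g equals p_data, which is consistent with the claim that training the generator against an optimal discriminator reduces this divergence.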


This review was generated by AI and reviewed by human editors.