QUICK REVIEW

[Paper Review] Do GANs actually learn the distribution? An empirical study

Sanjeev Arora, Yi Zhang|arXiv (Cornell University)|Jun 26, 2017

Generative Adversarial Networks and Image Synthesis|References: 12|Cited by: 130

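One-sentence summary

This paper proposes a test based on the birthday paradox for estimating the support size of the distribution generated by a GAN, and shows that several well-known GANs produce distributions with small support, which suggests they may not have learned the target distribution. It also examines how discriminator capacity affects diversity and compares GAN variants.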

ABSTRACT

Do GANs (Generative Adversarial Nets) actually learn the target distribution? The foundational paper of (Goodfellow et al., 2014) suggested they do, if they were given sufficiently large deep nets, sample size, and computation time. A recent theoretical analysis in Arora et al. (to appear at ICML 2017) raised doubts whether the same holds when the discriminator has finite size. It showed that the training objective can approach its optimum value even if the generated distribution has very low support; in other words, the training objective is unable to prevent mode collapse. The current note reports experiments suggesting that such problems are not merely theoretical. It presents empirical evidence that well-known GAN approaches do learn distributions of fairly low support, and thus presumably are not learning the target distribution. The main technical contribution is a new proposed test, based upon the famous birthday paradox, for estimating the support size of the generated distribution.

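Research motivation and objectives

Motivate the question of whether GANs learn the target distribution under practical constraints.
Propose a quantitative test of distribution support based on the birthday paradox.
Empirically evaluate several GAN architectures on CelebA and CIFAR-10 to assess diversity and mode collapse.
Investigate how discriminator size affects the diversity of the learned distribution.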

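Proposed method

Define a test based on the birthday paradox to estimate the support size of a distribution.
Generate a batch of samples and identify near-duplicate pairs as potential collisions.
Visually inspect the candidate duplicates to decide whether true duplicates are present, and estimate the support size from the result.
Apply the test on CelebA and CIFAR-10 to different GAN variants (DCGAN, MIX+DCGAN, ALI/BiGAN, Stacked GAN).
Examine experimentally how discriminator capacity affects the observed diversity.
Discuss applicability to VAEs and to LSUN-bedroom data.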
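
The candidate-selection step of the test can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "near-duplicate" candidates are ranked by Euclidean distance in pixel space, and the function name `closest_pairs`, the `top_k` parameter, and the random stand-in batch are all hypothetical. The final decision about whether a candidate pair is a true duplicate is still left to visual inspection.

```python
import numpy as np


def closest_pairs(samples, top_k=20):
    """Return the top_k closest pairs in a batch as (distance, i, j) tuples.

    Assumption: closeness is plain Euclidean distance on flattened pixels.
    """
    flat = samples.reshape(len(samples), -1).astype(np.float64)
    sq_norms = (flat ** 2).sum(axis=1)
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (flat @ flat.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative rounding errors
    # Keep each unordered pair once (upper triangle, excluding the diagonal).
    i_idx, j_idx = np.triu_indices(len(flat), k=1)
    pair_dists = np.sqrt(sq_dists[i_idx, j_idx])
    order = np.argsort(pair_dists)[:top_k]
    return [(float(pair_dists[n]), int(i_idx[n]), int(j_idx[n])) for n in order]


# Stand-in for a batch of generated images: random data with one planted
# near-duplicate (sample 41 is a slightly perturbed copy of sample 7).
rng = np.random.default_rng(0)
batch = rng.random((64, 8, 8, 3))
batch[41] = batch[7] + 0.001 * rng.standard_normal(batch[7].shape)

candidates = closest_pairs(batch, top_k=5)
for dist, i, j in candidates:
    print(f"pair ({i}, {j}) distance {dist:.4f}")
```

In a real run, `batch` would hold images drawn from the trained generator, and the listed pairs would be shown side by side to a human reviewer.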

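Experimental results

Research questions

RQ1: At practical sample sizes, do GANs produce distributions with support large enough to approximate the target distribution?
RQ2: How does discriminator size affect the diversity (support size) of the learned distribution?
RQ3: Do bidirectional GAN variants (ALI/BiGAN) show higher diversity than standard GANs?
RQ4: Can the birthday-paradox test reveal mode collapse or limited diversity in GANs trained on real datasets?
RQ5: What are the limitations of applying the birthday-paradox test to continuous, high-dimensional image data?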

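Key findings

On CelebA, GANs with commonly used architectures produce duplicates in batches of about 400 samples with probability ≥50%, implying a support size of roughly 160,000 or less.
ALI/BiGAN shows higher diversity: collisions appear at batch sizes of about 1000, implying a support that is larger than that of DCGAN/MIX+DCGAN but still limited (about one million).
Increasing discriminator capacity tends to increase the observed diversity: the detected support size grows almost linearly and then plateaus.
For Stacked GAN on CIFAR-10, duplicates within each class appear at different batch sizes, indicating that diversity is limited per class rather than spanning the whole dataset.
On LSUN-bedroom, the observed near-duplicates typically correspond to corrupted or noisy patterns, suggesting that the test can be confounded by artifacts and that the distribution places non-trivial mass on noise.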
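
The arithmetic behind the reported support sizes can be checked with a short sketch. It assumes an idealized case that the review does not spell out: samples drawn uniformly from a finite set of distinct images, with only exact repeats counted as collisions. The function names `collision_probability` and `support_estimate` are hypothetical. Under this assumption the batch size squared is only an order-of-magnitude figure: a batch of 400 drawn from 160,000 equally likely images collides with probability of about 0.39, so seeing duplicates at least half the time points to a support somewhat below 400², consistent with the "or less" in the first finding.

```python
import math


def collision_probability(batch_size, support_size):
    """Probability of at least one exact repeat among `batch_size` draws
    taken uniformly at random from `support_size` distinct items."""
    if batch_size > support_size:
        return 1.0  # pigeonhole: a repeat is certain
    # Sum logs of (1 - i/N) instead of multiplying, for numerical stability.
    log_no_collision = sum(
        math.log1p(-i / support_size) for i in range(batch_size)
    )
    return 1.0 - math.exp(log_no_collision)


def support_estimate(batch_size):
    """Rule-of-thumb support size when duplicates show up at `batch_size`."""
    return batch_size ** 2


print(support_estimate(400))    # 160000
print(support_estimate(1000))   # 1000000
print(round(collision_probability(23, 365), 3))  # classic birthday case
print(round(collision_probability(400, 160_000), 3))
```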

This review was generated by AI and reviewed by human editors.