QUICK REVIEW

[Paper Review] When can Wasserstein GANs minimize Wasserstein Distance

Yuanzhi Li, Zehao Dou|arXiv (Cornell University)|Mar 9, 2020

Topic: Generative Adversarial Networks and Image Synthesis | References: 77 | Cited by: 6

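One-Sentence Summary

This paper establishes the theoretical conditions under which a Wasserstein GAN can minimize the Wasserstein distance to the true data distribution. It shows that when the generator is a two-layer ReLU network, a one-layer ReLU discriminator is both necessary and sufficient for the generator, trained on polynomially many samples, to converge to a distribution that is inverse-polynomially close to the true distribution.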

ABSTRACT

Generative Adversarial Networks (GANs) are widely used models to learn complex real-world distributions. In GANs, the training of the generator usually stops when the discriminator can no longer distinguish the generator's output from the set of training examples. A central question of GANs is that when the training stops, whether the generated distribution is actually close to the target distribution. Previously, it was found that such closeness can only be achieved when there is a strict capacity trade-off between the generator and discriminator: Neither of the two models can be too powerful than the other. In this paper, we established one of the first theoretical results in explaining this trade-off. We show that when the generator is a class of two-layer neural networks, then it is necessary and sufficient for the discriminator to be a one-layer network with ReLU-type activation functions. With this trade-off, using polynomially many training examples, when the training stops, the generator will indeed output a distribution that is inverse-polynomially close to the target. Our result also sheds light on how GANs training can find such a generator efficiently.

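Motivation and Objectives

Explain theoretically the capacity trade-off between the generator and the discriminator in Wasserstein GANs.
Identify the conditions under which the generator's output distribution is close to the true data distribution.
Establish necessary and sufficient conditions on the discriminator architecture for convergence to a near-optimal generator.
Provide a theoretical foundation for efficient WGAN training with polynomial sample complexity.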

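Proposed Method

Model the generator as a two-layer neural network with ReLU activations.
Require the discriminator to be a one-layer ReLU network so that training converges to a near-optimal generator.
Analyze Wasserstein distance minimization theoretically under these architectural constraints.
Prove that polynomially many samples suffice to approximate the true distribution to inverse-polynomial accuracy.
Apply tools from optimization and generalization theory to bound the distance between the generated and target distributions.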
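
The architecture pairing described above can be made concrete with a minimal NumPy sketch. Everything here is an illustrative assumption rather than the paper's construction: the parameter names (`W1`, `b1`, `W2`, `V`, `c`), the exact placement of the ReLU, and the choice to sum the discriminator's units are all hypothetical. The paper's precise parameterization may differ.

```python
import numpy as np

def relu(a):
    """Elementwise ReLU."""
    return np.maximum(a, 0.0)

def generator(z, W1, b1, W2):
    """Assumed two-layer ReLU generator: z -> W2 relu(W1 z + b1).

    z has shape (n, k); the result has shape (n, d).
    """
    return relu(z @ W1.T + b1) @ W2.T

def discriminator(x, V, c):
    """Assumed one-layer ReLU discriminator: sum of ReLU units.

    x has shape (n, d); the result is one scalar score per sample, shape (n,).
    """
    return relu(x @ V.T + c).sum(axis=1)

def critic_gap(x_real, x_fake, V, c):
    """Empirical WGAN-style objective: mean D(real) minus mean D(fake)."""
    return discriminator(x_real, V, c).mean() - discriminator(x_fake, V, c).mean()
```

In this sketch, training would stop once no choice of `V` and `c` in the discriminator class makes `critic_gap` large. The question the paper studies is whether that stopping point implies a small Wasserstein distance.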

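Results

Research Questions

RQ1: Under what network architecture conditions can a Wasserstein GAN minimize the Wasserstein distance to the true data distribution?
RQ2: Is there a necessary and sufficient condition on the discriminator's capacity, relative to the generator's, for convergence to an approximate solution?
RQ3: Can the generator get inverse-polynomially close to the target distribution using only polynomially many training samples?
RQ4: How do the choice of activation function and the network depth affect WGAN convergence?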

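Main Findings

When the generator is a two-layer ReLU network, a one-layer ReLU discriminator is both necessary and sufficient.
When training stops, the generator has converged to a distribution that is inverse-polynomially close to the true data distribution.
Polynomial sample complexity suffices to reach this level of approximation.
The theoretical framework explains why balancing generator and discriminator capacity is essential for convergence.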
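
One way to see why capacity balance matters is the standard dual form of the Wasserstein-1 distance (Kantorovich-Rubinstein duality). The notation below is assumed for illustration and is not taken from the paper: P is the target distribution, Q the generated one, and \mathcal{F} the class of functions the discriminator can represent.

```latex
% Distance "seen" by a discriminator class F (notation assumed for illustration)
d_{\mathcal{F}}(P, Q) \;=\; \sup_{f \in \mathcal{F}}
  \Big( \mathbb{E}_{x \sim P}[f(x)] - \mathbb{E}_{x \sim Q}[f(x)] \Big),
\qquad
W_1(P, Q) \;=\; d_{\mathcal{F}}(P, Q)
  \quad \text{when } \mathcal{F} = \{\, f : \|f\|_{\mathrm{Lip}} \le 1 \,\}.
```

Read this way, a class \mathcal{F} that is too small can give d_{\mathcal{F}}(P, Q) = 0 even though W_1(P, Q) is large. A class that is too rich can make the empirical version, computed from finitely many samples, a poor proxy for the population quantity.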

This review was generated by AI and reviewed by a human editor.