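[Paper Explained] Multi-Generator Generative Adversarial Nets
MGAN trains a mixture of multiple generators together with a classifier and a discriminator to cover diverse data modes and avoid mode collapse, achieving state-of-the-art Inception scores on large-scale datasets.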
We propose a new approach to train the Generative Adversarial Nets (GANs) with a mixture of generators to overcome the mode collapsing problem. The main intuition is to employ multiple generators, instead of using a single one as in the original GAN. The idea is simple, yet proven to be extremely effective at covering diverse data modes, easily overcoming the mode collapse and delivering state-of-the-art results. A minimax formulation is able to establish among a classifier, a discriminator, and a set of generators in a similar spirit with GAN. Generators create samples that are intended to come from the same distribution as the training data, whilst the discriminator determines whether samples are true data or generated by generators, and the classifier specifies which generator a sample comes from. The distinguishing feature is that internal samples are created from multiple generators, and then one of them will be randomly selected as final output similar to the mechanism of a probabilistic mixture model. We term our method Mixture GAN (MGAN). We develop theoretical analysis to prove that, at the equilibrium, the Jensen-Shannon divergence (JSD) between the mixture of generators' distributions and the empirical data distribution is minimal, whilst the JSD among generators' distributions is maximal, hence effectively avoiding the mode collapse. By utilizing parameter sharing, our proposed model adds minimal computational cost to the standard GAN, and thus can also efficiently scale to large-scale datasets. We conduct extensive experiments on synthetic 2D data and natural image databases (CIFAR-10, STL-10 and ImageNet) to demonstrate the superior performance of our MGAN in achieving state-of-the-art Inception scores over latest baselines, generating diverse and appealing recognizable objects at different resolutions, and specializing in capturing different types of objects by generators.
Research Motivation and Objectives
- Overcome the mode collapse problem in GANs.
- Propose a mixture of generators to cover diverse data modes.
- Provide a theoretical analysis showing that, at equilibrium, the JSD between the data distribution and the generator mixture is minimized while the JSD among generators is maximized.
- Train efficiently via parameter sharing so the method scales to large datasets.
- Evaluate empirically on synthetic data and real-world image datasets (CIFAR-10, STL-10, ImageNet).
Proposed Method
- Formulate MGAN as a minimax game among K generators, a discriminator, and a classifier.
- The output is a mixture sample: draw an index u ~ Mult(pi) and return G_u(z).
- The objective combines the standard GAN terms with a diversity term -beta sum_k pi_k E_{x~P_{G_k}}[log C_k(x)], which rewards generators whose samples the classifier can tell apart.
- Share parameters among generators and between discriminator and classifier to reduce cost.
- Use non-saturating GAN training and fixed mixture weights pi (often uniform).
- Provide theoretical results showing optimal C*, D* and G* maximize generator diversity while minimizing data-model divergence.
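The sampling rule and the objective listed above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `sample_mixture`, `mgan_objective`, and the toy shift generators are hypothetical names introduced here, and the discriminator and classifier outputs are passed in as plain probabilities rather than computed by networks.

```python
import math
import random

def sample_mixture(generators, pi, noise_dim, rng):
    """Draw one sample from the mixture: u ~ Mult(pi), z ~ N(0, I), x = G_u(z)."""
    u = rng.choices(range(len(generators)), weights=pi, k=1)[0]
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return generators[u](z), u

def mgan_objective(d_real, d_fake, c_fake, u_fake, beta):
    """Monte Carlo estimate of
        E_data[log D(x)] + E_model[log(1 - D(x))]
        - beta * sum_k pi_k E_{x ~ P_Gk}[log C_k(x)].

    d_real, d_fake: discriminator probabilities in (0, 1) for real / generated samples
    c_fake: one classifier probability vector (length K) per generated sample
    u_fake: index of the generator that produced each generated sample
    """
    gan_term = (sum(math.log(d) for d in d_real) / len(d_real)
                + sum(math.log(1.0 - d) for d in d_fake) / len(d_fake))
    # Because u is drawn from Mult(pi), a plain average over generated samples
    # already estimates sum_k pi_k E_{x ~ P_Gk}[log C_k(x)].
    diversity = sum(math.log(c[u]) for c, u in zip(c_fake, u_fake)) / len(u_fake)
    return gan_term - beta * diversity

# Toy usage: hypothetical "generators" that shift the noise by a fixed offset.
rng = random.Random(0)
generators = [lambda z, m=m: [zi + m for zi in z] for m in (-4.0, 0.0, 4.0)]
pi = [1 / 3, 1 / 3, 1 / 3]
samples = [sample_mixture(generators, pi, noise_dim=2, rng=rng) for _ in range(8)]
```

In the minimax game, the discriminator would ascend this value while the generators and the classifier descend it; the diversity term decreases as the classifier becomes more confident about which generator produced each sample.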
Experimental Results
Research Questions
- RQ1: Can a mixture of generators with a classifier effectively cover multiple data modes and avoid mode collapse?
- RQ2: Does optimizing the MGAN objective minimize the Jensen-Shannon divergence between data and model while maximizing the divergence among generators?
- RQ3: Is parameter sharing sufficient to make multi-generator MGAN scalable to large datasets without prohibitive cost?
- RQ4: Do MGANs achieve superior quantitative metrics (Inception score) on CIFAR-10, STL-10, and ImageNet compared to single-generator GANs and other multi-generator approaches?
Key Findings
| Model | CIFAR-10 | STL-10 | ImageNet |
|---|---|---|---|
| Real data | 11.24 ± 0.16 | 26.08 ± 0.26 | 25.78 ± 0.47 |
| WGAN (Arjovsky et al., 2017) | 3.82 ± 0.06 | – | – |
| MIX+WGAN (Arora et al., 2017) | 4.04 ± 0.07 | – | – |
| Improved-GAN (Salimans et al., 2016) | 4.36 ± 0.04 | – | – |
| ALI (Dumoulin et al., 2016) | 5.34 ± 0.05 | – | – |
| BEGAN (Berthelot et al., 2017) | 5.62 | – | – |
| MAGAN (Wang et al., 2017) | 5.67 | – | – |
| GMAN (Durugkar et al., 2016) | 6.00 ± 0.19 | – | – |
| DCGAN (Radford et al., 2015) | 6.40 ± 0.05 | 7.54 | 7.89 |
| DFM (Warde-Farley & Bengio, 2016) | 7.72 ± 0.13 | 8.51 ± 0.13 | 9.18 ± 0.13 |
| D2GAN (Nguyen et al., 2017) | 7.15 ± 0.07 | 7.98 | 8.25 |
| MGAN | 8.33 ± 0.10 | 9.22 ± 0.11 | 9.32 ± 0.10 |
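For readers unfamiliar with the metric, the Inception score is commonly defined (following Salimans et al., 2016) as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a pretrained classifier's label distribution for a generated image and p(y) is its average over all generated images. The sketch below computes that quantity from class probabilities supplied directly as lists; the function name `inception_score` and the hand-written inputs are illustrative assumptions, and a real evaluation would obtain p(y|x) from a pretrained Inception network.

```python
import math

def inception_score(probs):
    """Compute exp(mean_x KL(p(y|x) || p(y))) from per-image class probabilities.

    probs: list of probability vectors, one p(y|x) per generated image.
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal label distribution p(y), averaged over all images.
    marginal = [sum(p[c] for p in probs) / n for c in range(num_classes)]
    kl_sum = 0.0
    for p in probs:
        # KL(p(y|x) || p(y)); terms with p(y|x) = 0 contribute nothing.
        kl_sum += sum(pc * math.log(pc / marginal[c])
                      for c, pc in enumerate(p) if pc > 0.0)
    return math.exp(kl_sum / n)

# Confident predictions spread evenly over classes give the highest score;
# identical, uninformative predictions give the lowest possible score.
confident_and_diverse = inception_score([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
uninformative = inception_score([[1 / 3, 1 / 3, 1 / 3]] * 3)
```

The score therefore rewards both recognizable samples (peaked p(y|x)) and variety across samples (a flat p(y)), which is why it is used here as a proxy for mode coverage.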
- MGAN achieves state-of-the-art Inception scores on CIFAR-10 (8.33 ± 0.10), STL-10 (9.22 ± 0.11), and ImageNet (9.32 ± 0.10) in unsupervised training.
- Each generator specializes to generate samples from different data modes, effectively covering diverse object types.
- Theoretical results show equilibrium minimizes JSD between data and the mixture model while maximizing JSD among generators.
- Parameter sharing across generators and between the discriminator and classifier yields efficiency and scalability with minimal extra cost.
- MGAN demonstrates faster and more stable convergence on synthetic data and scales to large natural image datasets with strong qualitative and quantitative results.
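To make the "JSD among generators" finding concrete, the weighted Jensen-Shannon divergence of K distributions can be written as H(sum_k pi_k P_k) - sum_k pi_k H(P_k), where H is the Shannon entropy. The sketch below evaluates it for discrete distributions given as lists; `entropy` and `generalized_jsd` are hypothetical helper names, and the two-bin distributions are toy inputs rather than anything measured in the paper.

```python
import math

def entropy(p):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def generalized_jsd(dists, weights):
    """Weighted JSD: H(sum_k pi_k P_k) - sum_k pi_k H(P_k)."""
    size = len(dists[0])
    mixture = [sum(w * d[i] for w, d in zip(weights, dists)) for i in range(size)]
    return entropy(mixture) - sum(w * entropy(d) for w, d in zip(weights, dists))

# Two generators that put all mass on different modes: maximal divergence.
specialized = generalized_jsd([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
# Two generators with identical distributions: zero divergence.
collapsed = generalized_jsd([[0.5, 0.5], [0.5, 0.5]], [0.5, 0.5])
```

With uniform weights the divergence is bounded above by log K and reaches that bound only when the generators have disjoint supports, which matches the observation that each generator specializes in different data modes.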
This explainer was generated by AI and reviewed by a human editor.