QUICK REVIEW

[Paper Review] Gaussian Mixture Generative Adversarial Networks for Diverse Datasets, and the Unsupervised Clustering of Images

Matan Ben-Yosef, Daphna Weinshall | arXiv (Cornell University) | Aug 30, 2018

Generative Adversarial Networks and Image Synthesis | References: 7 | Cited by: 41

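One-Sentence Summary

GM-GANs replace the unimodal latent prior with a mixture of Gaussians to better fit diverse data, which improves sample quality and diversity and enables unsupervised clustering; they also support a supervised variant for conditional generation and post-training control of the quality-diversity trade-off.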

ABSTRACT

Generative Adversarial Networks (GANs) have been shown to produce realistically looking synthetic images with remarkable success, yet their performance seems less impressive when the training set is highly diverse. In order to provide a better fit to the target data distribution when the dataset includes many different classes, we propose a variant of the basic GAN model, called Gaussian Mixture GAN (GM-GAN), where the probability distribution over the latent space is a mixture of Gaussians. We also propose a supervised variant which is capable of conditional sample synthesis. In order to evaluate the model's performance, we propose a new scoring method which separately takes into account two (typically conflicting) measures - diversity vs. quality of the generated data. Through a series of empirical experiments, using both synthetic and real-world datasets, we quantitatively show that GM-GANs outperform baselines, both when evaluated using the commonly used Inception Score, and when evaluated using our own alternative scoring method. In addition, we qualitatively demonstrate how the unsupervised variant of GM-GAN tends to map latent vectors sampled from different Gaussians in the latent space to samples of different classes in the data space. We show how this phenomenon can be exploited for the task of unsupervised clustering, and provide quantitative evaluation showing the superiority of our method for the unsupervised clustering of image datasets. Finally, we demonstrate a feature which further sets our model apart from other GAN models: the option to control the quality-diversity trade-off by altering, post-training, the probability distribution of the latent space. This allows one to sample higher quality and lower diversity samples, or vice versa, according to one's needs.

Motivation and Goals

Enable GANs to better handle highly diverse datasets with multi-class, multi-modal structure.
Propose Gaussian Mixture GAN (GM-GAN) with a latent-space mixture of Gaussians to match data sparsity and multi-modality.
Enable supervised/conditional generation via a discriminator that outputs class probabilities.
Introduce a new evaluation score addressing the quality-diversity trade-off beyond Inception Score.
Demonstrate empirically that GM-GANs outperform baselines on synthetic and real datasets, and enable unsupervised clustering.

Proposed Method

Define p_Z as a mixture of K Gaussians with parameters {μ_k, Σ_k} and mixture weights α_k; consider static (fixed) and dynamic (learned) GM-GAN variants.
Implement a GM-GAN where z|k ~ N(μ_k, Σ_k) and G takes z to generate samples; in the dynamic variant, apply the re-parameterization trick to backpropagate through the sampling.
Provide a supervised GM-GAN variant where the discriminator outputs a vector over N classes and the generator maps Gaussian indices to class labels via a function f: [K] -> [N].
Describe two loss adaptations: the standard GAN losses for G and D (equations 2 and 3) and the supervised loss variants (for both generator and discriminator).
Outline the training algorithm (Algorithm 1) including initialization of Gaussian components, sampling from the mixture, and alternating updates of D and G with Adam.
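
The latent sampling step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes diagonal covariances (stored as per-dimension standard deviations `sigma`), uniform mixture weights, and arbitrary values for the latent dimension and initialization range. The function name `sample_mixture_latents` is hypothetical. In a dynamic variant, `mu` and `sigma` would be trainable tensors in an autodiff framework, and the form `mu_k + sigma_k * eps` is what lets gradients reach them.

```python
import numpy as np

def sample_mixture_latents(mu, sigma, alpha, batch_size, rng):
    """Draw z from a Gaussian-mixture prior.

    k ~ Categorical(alpha), then z | k ~ N(mu_k, diag(sigma_k^2)).
    Diagonal covariance is an assumption made for this sketch.
    """
    # Pick one mixture component per sample according to the weights alpha.
    k = rng.choice(len(alpha), size=batch_size, p=alpha)
    # Parameter-free noise; all dependence on (mu, sigma) is deterministic below.
    eps = rng.standard_normal((batch_size, mu.shape[1]))
    # Re-parameterized sample: shift and scale the noise per component.
    z = mu[k] + sigma[k] * eps
    return z, k

# Illustrative (assumed) hyperparameters, not values taken from the paper.
rng = np.random.default_rng(0)
K, latent_dim = 10, 100
mu = rng.uniform(-1.0, 1.0, size=(K, latent_dim))  # component means
sigma = np.full((K, latent_dim), 0.2)               # per-dimension std devs
alpha = np.full(K, 1.0 / K)                         # uniform mixture weights

z, k = sample_mixture_latents(mu, sigma, alpha, 64, rng)
print(z.shape, k[:8])
```

The returned index `k` is kept alongside `z` because the supervised variant needs it to look up the class label through the mapping f: [K] -> [N].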

Experimental Results

Research Questions

RQ1: Can a multi-modal latent-space prior (Gaussian mixture) improve GAN performance on diverse datasets compared to standard unimodal priors?
RQ2: Does the GM-GAN enable effective unsupervised clustering by mapping latent Gaussians to distinct data-space classes?
RQ3: How does the number of Gaussians K affect sample quality and diversity across datasets?
RQ4: Can a supervised GM-GAN provide better class-conditioned generation than existing conditional GANs?
RQ5: Can a post-training adjustment of latent-space covariance control the quality-diversity trade-off?
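
To make the last question concrete, here is a minimal sketch of what a post-training covariance adjustment could look like. It assumes the adjustment is a single scalar `scale` multiplying each component's standard deviation and that covariances are diagonal; both are assumptions for illustration, and the name `sample_scaled` is hypothetical. No generator is involved here: the snippet only shows how the latent samples move toward or away from the component means.

```python
import numpy as np

def sample_scaled(mu, sigma, k, scale, rng):
    """Sample z | k with every component's std dev multiplied by `scale`.

    scale < 1 concentrates latents near the component means;
    scale > 1 spreads them out. The means themselves are unchanged.
    """
    eps = rng.standard_normal(mu[k].shape)
    return mu[k] + scale * sigma[k] * eps

# Illustrative (assumed) mixture parameters.
rng = np.random.default_rng(0)
K, latent_dim = 5, 16
mu = rng.uniform(-1.0, 1.0, size=(K, latent_dim))
sigma = np.full((K, latent_dim), 0.5)
k = rng.integers(0, K, size=256)  # component index for each sample

def mean_distance_to_center(z):
    # Average Euclidean distance of each latent from its component mean.
    return np.linalg.norm(z - mu[k], axis=1).mean()

# Reusing the same seed isolates the effect of `scale` from the noise draw.
z_tight = sample_scaled(mu, sigma, k, 0.5, np.random.default_rng(1))
z_wide = sample_scaled(mu, sigma, k, 2.0, np.random.default_rng(1))
print(mean_distance_to_center(z_tight), mean_distance_to_center(z_wide))
```

Because only the sampling distribution changes, a knob of this kind needs no retraining; the trained generator is simply fed latents drawn from a tighter or wider mixture.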

Key Findings

Setting	Dataset	Model	Inception Score
Unsupervised	CIFAR-10	GAN	5.71 (±0.06)
Unsupervised	CIFAR-10	GM-GAN (K=10)	5.92 (±0.07)
Unsupervised	CIFAR-10	GM-GAN (K=20)	5.91 (±0.05)
Unsupervised	CIFAR-10	GM-GAN (K=30)	5.98 (±0.05)
Unsupervised	STL-10	GAN	6.80 (±0.07)
Unsupervised	STL-10	GM-GAN (K=10)	7.06 (±0.11)
Unsupervised	STL-10	GM-GAN (K=20)	6.58 (±0.16)
Unsupervised	STL-10	GM-GAN (K=30)	7.03 (±0.10)
Supervised	CIFAR-10	AC-GAN	6.23 (±0.07)
Supervised	CIFAR-10	GM-GAN (K=10)	6.84 (±0.03)
Supervised	CIFAR-10	GM-GAN (K=20)	6.81 (±0.04)
Supervised	CIFAR-10	GM-GAN (K=30)	6.83 (±0.02)
Supervised	STL-10	AC-GAN	7.45 (±0.10)
Supervised	STL-10	GM-GAN (K=10)	8.32 (±0.06)
Supervised	STL-10	GM-GAN (K=20)	8.16 (±0.05)
Supervised	STL-10	GM-GAN (K=30)	8.08 (±0.07)

GM-GANs outperform baselines on both synthetic and real datasets under Inception Score and the proposed quality-diversity metrics.
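On CIFAR-10 and STL-10, unsupervised GM-GANs achieve higher Inception Scores than the GAN baseline for most tested K values (e.g., STL-10: GM-GAN with K=30 reaches 7.03 vs GAN 6.80); the exception is STL-10 with K=20, which scores 6.58, below the baseline.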
In supervised settings, GM-GAN consistently improves over AC-GAN across tested K values on CIFAR-10 and STL-10 (e.g., CIFAR-10: AC-GAN 6.23 vs GM-GAN k=10 6.84; STL-10: AC-GAN 7.45 vs GM-GAN k=10 8.32).
The number of Gaussians K can improve or degrade performance depending on the dataset.
GM-GANs converge faster than classical GANs in the authors' toy-dataset experiments.
Because latent vectors drawn from different Gaussians tend to map to different data-space classes, the mixture structure offers a route to unsupervised clustering.
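
The size of these gains is easier to compare as differences from each baseline. The short script below does that arithmetic using only the mean scores reported in the table above; the dictionary layout and variable names are just one convenient way to organize them, and the reported ± values are ignored.

```python
# Mean Inception Scores copied from the table; "baseline" is GAN in the
# unsupervised rows and AC-GAN in the supervised rows.
scores = {
    ("CIFAR-10", "unsupervised"): {"baseline": 5.71, 10: 5.92, 20: 5.91, 30: 5.98},
    ("STL-10", "unsupervised"): {"baseline": 6.80, 10: 7.06, 20: 6.58, 30: 7.03},
    ("CIFAR-10", "supervised"): {"baseline": 6.23, 10: 6.84, 20: 6.81, 30: 6.83},
    ("STL-10", "supervised"): {"baseline": 7.45, 10: 8.32, 20: 8.16, 30: 8.08},
}

gains = {}
for (dataset, setting), row in scores.items():
    for k in (10, 20, 30):
        # Absolute change in Inception Score relative to the baseline model.
        gains[(dataset, setting, k)] = row[k] - row["baseline"]
        print(f"{dataset:9s} {setting:12s} K={k}: {gains[(dataset, setting, k)]:+.2f}")
```

Every difference is positive except the unsupervised STL-10 run with K=20, which falls 0.22 below the GAN baseline. The supervised gains over AC-GAN (roughly 0.6 on CIFAR-10 and 0.6 to 0.9 on STL-10) are larger than the unsupervised gains over GAN (roughly 0.2 to 0.3 where positive).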

This review was generated by AI and reviewed by a human editor.