QUICK REVIEW

[Paper Digest] Improving the Improved Training of Wasserstein GANs: A Consistency Term and Its Dual Effect

Wei Xiang, Boqing Gong|arXiv (Cornell University)|Mar 5, 2018

Generative Adversarial Networks and Image Synthesis|Cited by 141

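One-sentence summary

This paper proposes CT-GAN, a Lipschitz consistency regularization for WGAN that adds a consistency term on the real data manifold to the gradient penalty, improving image fidelity and achieving strong semi-supervised learning results with limited labels.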

ABSTRACT

Despite being impactful on a variety of problems and applications, the generative adversarial nets (GANs) are remarkably difficult to train. This issue is formally analyzed by Arjovsky and Bottou (2017), who also propose an alternative direction to avoid the caveats in the minmax two-player training of GANs. The corresponding algorithm, called Wasserstein GAN (WGAN), hinges on the 1-Lipschitz continuity of the discriminator. In this paper, we propose a novel approach to enforcing the Lipschitz continuity in the training procedure of WGANs. Our approach seamlessly connects WGAN with one of the recent semi-supervised learning methods. As a result, it gives rise to not only better photo-realistic samples than the previous methods but also state-of-the-art semi-supervised learning results. In particular, our approach gives rise to the inception score of more than 5.0 with only 1,000 CIFAR-10 images and is the first that exceeds the accuracy of 90% on the CIFAR-10 dataset using only 4,000 labeled images, to the best of our knowledge.

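Motivation and objectives

Address the training instability of Wasserstein GANs (WGANs) by enforcing Lipschitz continuity more effectively.
Propose a consistency term that, in addition to the gradient penalty, constrains Lipschitz continuity on the real data manifold.
Demonstrate more realistic image samples and strong semi-supervised learning performance on CIFAR-10 and MNIST.
Show data efficiency and reduced overfitting in small-data regimes.
Provide a framework that integrates seamlessly with GAN-based semi-supervised learning.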

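Proposed method

Augment the improved WGAN objective with a soft consistency term (CT), derived from Lipschitz continuity, that penalizes violations of the Lipschitz bound.
Perturb each real data point into virtual points by applying dropout inside the discriminator, which yields an estimate of a local Lipschitz constraint.
Add a gradient penalty term (GP) on samples interpolated between real and generated data, and combine it with CT to form the overall loss.
Formulate the discriminator objective as L = E_z[D(G(z))] − E_x[D(x)] + λ1 GP|x̂ + λ2 CT|x′,x″, where x̂ is an interpolated sample and CT enforces consistency between the discriminator outputs for the perturbed versions x′ and x″ of a real data point.
Connect the method to semi-supervised learning by giving the discriminator K+1 outputs and adding CT, a temporal-ensembling-style consistency term, to the SSL objective.
Provide training details, including the hyperparameters used in the experiments (e.g., λ1 = 10, λ2 = 2) and the M′ setting (0 to 0.2).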
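
The objective above can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions, not the authors' implementation: the critic is a hypothetical one-hidden-layer ReLU network, so its input gradient can be written analytically instead of using autodiff; the distances inside CT are taken as squared L2; the 0.1 weight on the hidden-feature distance and the 0.5 dropout rate are assumptions that do not appear in this digest; and the Wasserstein and GP terms use the dropout-free critic for simplicity. Only λ1 = 10, λ2 = 2 and the M′ range come from the text.

```python
import numpy as np

def critic(x, params, rng=None, drop_p=0.5):
    """Toy critic D(x) = w2 . relu(x W1 + b1) + b2; returns (score, hidden).
    If rng is given, inverted dropout is applied to the hidden layer."""
    W1, b1, w2, b2 = params
    h = np.maximum(0.0, x @ W1 + b1)              # (batch, hidden)
    if rng is not None:
        keep = rng.random(h.shape) >= drop_p      # random dropout mask
        h = h * keep / (1.0 - drop_p)
    return h @ w2 + b2, h

def input_gradient(x, params):
    """Analytic gradient of the dropout-free toy critic w.r.t. its input."""
    W1, b1, w2, _ = params
    active = (x @ W1 + b1) > 0                    # ReLU derivative
    return (active * w2) @ W1.T                   # (batch, dim)

def gradient_penalty(real, fake, params, rng):
    """GP: mean of (||grad D(x_hat)||_2 - 1)^2 on interpolated samples."""
    eps = rng.random((real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake
    grad_norm = np.linalg.norm(input_gradient(x_hat, params), axis=1)
    return np.mean((grad_norm - 1.0) ** 2)

def consistency_term(real, params, rng, m_prime=0.1, feat_weight=0.1, drop_p=0.5):
    """CT: two dropout passes over the same real batch play the role of
    x' and x''; disagreement above the margin M' is penalized."""
    d1, h1 = critic(real, params, rng, drop_p)
    d2, h2 = critic(real, params, rng, drop_p)
    dist = (d1 - d2) ** 2 + feat_weight * np.sum((h1 - h2) ** 2, axis=1)
    return np.mean(np.maximum(0.0, dist - m_prime))

def critic_loss(real, fake, params, rng, lam1=10.0, lam2=2.0, m_prime=0.1):
    """L = E[D(G(z))] - E[D(x)] + lam1 * GP + lam2 * CT."""
    d_fake, _ = critic(fake, params)
    d_real, _ = critic(real, params)
    gp = gradient_penalty(real, fake, params, rng)
    ct = consistency_term(real, params, rng, m_prime)
    return d_fake.mean() - d_real.mean() + lam1 * gp + lam2 * ct

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, hidden = 4, 16
    params = (rng.normal(size=(dim, hidden)), np.zeros(hidden),
              rng.normal(size=hidden), 0.0)
    real = rng.normal(size=(32, dim))
    fake = rng.normal(size=(32, dim))             # stand-in for G(z)
    print(critic_loss(real, fake, params, rng))
```

Note the division of labor in the sketch: GP only examines points interpolated between real and generated samples, whereas CT compares two stochastic passes over the same real points, which is how the method constrains the critic on the real data manifold itself.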

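Experimental results

Research questions

RQ1: Does enforcing Lipschitz continuity on the real data manifold through a consistency term improve WGAN training stability and sample quality?
RQ2: Can the proposed CT term complement the gradient penalty to yield better semi-supervised learning performance with limited labeled data?
RQ3: How does CT-GAN compare with prior GAN-based methods on standard benchmarks (MNIST, CIFAR-10) in unsupervised and semi-supervised settings?
RQ4: Does the method reduce overfitting and remain data-efficient in low-data regimes?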

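Key findings

CT-GAN produces more photo-realistic samples than GP-WGAN on CIFAR-10 and MNIST.
The method overfits less: it keeps improving on test data after GP-WGAN has saturated.
CT-GAN achieves state-of-the-art inception scores on CIFAR-10 in both unsupervised and semi-supervised settings (e.g., exceeding prior GAN-based results).
On semi-supervised CIFAR-10 with only 4,000 labeled images, CT-GAN reaches a 9.98% test error rate, outperforming several competing GAN-based SSL methods.
On MNIST, CT-GAN's semi-supervised test error (0.89% ± 0.13) is competitive with other methods.
Qualitative results show cleaner, more coherent samples than GP-WGAN across a variety of network architectures.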


This digest was generated by AI and reviewed by a human editor.