QUICK REVIEW

[Paper Review] cGANs with Projection Discriminator

Takeru Miyato, Masanori Koyama|arXiv (Cornell University)|Feb 15, 2018

Image Processing Techniques and Applications | References: 23 | Cited by: 411

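One-Sentence Summary

This paper proposes a projection-based discriminator for conditional GANs (cGANs) that replaces simple concatenation with an inner-product interaction between the embedded label and the image features, achieving state-of-the-art results on ImageNet and improvements on super-resolution.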

ABSTRACT

We propose a novel, projection based way to incorporate the conditional information into the discriminator of GANs that respects the role of the conditional information in the underlining probabilistic model. This approach is in contrast with most frameworks of conditional GANs used in application today, which use the conditional information by concatenating the (embedded) conditional vector to the feature vectors. With this modification, we were able to significantly improve the quality of the class conditional image generation on ILSVRC2012 (ImageNet) 1000-class image dataset from the current state-of-the-art result, and we achieved this with a single pair of a discriminator and a generator. We were also able to extend the application to super-resolution and succeeded in producing highly discriminative super-resolution images. This new structure also enabled high quality category transformation based on parametric functional transformation of conditional batch normalization layers in the generator.

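Motivation and Objectives

Propose a discriminator design that respects the probabilistic structure of the conditional information.
Propose a projection-based interaction between the conditional label and the feature representation.
Demonstrate quality improvements in class-conditional image generation on ImageNet and in image super-resolution.
Demonstrate capabilities such as category morphing and compatibility with conditional batch normalization.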
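
The "probabilistic structure" mentioned in the first objective can be made concrete with a short derivation. What follows is only a sketch, assuming the standard adversarial loss (so that the optimal discriminator logit is a log-likelihood ratio) and assuming that both class posteriors are log-linear in a shared feature vector φ(x). The symbols q (data distribution), p (generator distribution), v_c and Z are notation introduced here for illustration.

```latex
% Sketch under the assumptions stated above; q = data, p = generator.
\begin{align}
f^*(x, y) &= \log\frac{q(x, y)}{p(x, y)}
           = \log\frac{q(y \mid x)}{p(y \mid x)} + \log\frac{q(x)}{p(x)} \\
\log q(y = c \mid x) &= {v_c^{q}}^{\top}\phi(x) - \log Z^{q}(\phi(x)), \qquad
\log p(y = c \mid x) = {v_c^{p}}^{\top}\phi(x) - \log Z^{p}(\phi(x)) \\
\log\frac{q(y = c \mid x)}{p(y = c \mid x)}
          &= \left(v_c^{q} - v_c^{p}\right)^{\top}\phi(x)
             - \left(\log Z^{q}(\phi(x)) - \log Z^{p}(\phi(x))\right)
\end{align}
```

Taking v_c = v_c^q - v_c^p as the c-th row of an embedding matrix V, and absorbing the normalizer terms together with log q(x)/p(x) into a single function ψ(φ(x)), yields for a one-hot label y a discriminator of the form y^T V φ(x) + ψ(φ(x)): the label enters only through an inner product with the features.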

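Proposed Method

Derive the projection discriminator form f(x, y; θ) = y^T V φ(x; θ_Φ) + ψ(φ(x; θ_Φ)) from the probabilistic log-likelihood ratio.
Replace simple concatenation of y with x or with intermediate features by an inner-product interaction through the embedding matrix V.
Use a ResNet-based discriminator and generator, trained with spectral normalization and the hinge loss.
Apply conditional batch normalization in the generator to enable category morphing.
Evaluate class-conditional generation on ImageNet (1000 classes) and super-resolution, comparing against concatenation and AC-GANs.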
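
The projection output and the hinge loss listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the feature vector φ(x) is passed in directly rather than computed by a ResNet, ψ is assumed to be a single linear layer, and the names `projection_logit`, `d_hinge_loss`, `g_hinge_loss`, `psi_w` and `psi_b` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def projection_logit(phi_x, label, V, psi_w, psi_b):
    """f(x, y) = y^T V phi(x) + psi(phi(x)) for a one-hot label y.

    phi_x        : feature vector phi(x) (stands in for the discriminator body)
    label        : integer class index; y is its one-hot encoding
    V            : class embedding matrix, one row per class
    psi_w, psi_b : weights and bias of a linear psi (an assumption of this sketch)
    """
    conditional = dot(V[label], phi_x)          # y^T V phi(x): y selects one row of V
    unconditional = dot(psi_w, phi_x) + psi_b   # psi(phi(x)): label-independent term
    return conditional + unconditional


def d_hinge_loss(real_logits, fake_logits):
    """Discriminator hinge loss: E[max(0, 1 - f(real))] + E[max(0, 1 + f(fake))]."""
    real_term = sum(max(0.0, 1.0 - f) for f in real_logits) / len(real_logits)
    fake_term = sum(max(0.0, 1.0 + f) for f in fake_logits) / len(fake_logits)
    return real_term + fake_term


def g_hinge_loss(fake_logits):
    """Generator loss paired with the hinge discriminator: -E[f(fake)]."""
    return -sum(fake_logits) / len(fake_logits)


# Toy example with 2 classes and 3-dimensional features (values are made up).
phi_x = [1.0, 2.0, -1.0]
V = [[0.5, 0.0, 0.0],
     [0.0, 1.0, 1.0]]
psi_w = [0.25, 0.25, 0.25]
psi_b = 0.0

logit_class0 = projection_logit(phi_x, 0, V, psi_w, psi_b)
logit_class1 = projection_logit(phi_x, 1, V, psi_w, psi_b)
loss_d = d_hinge_loss(real_logits=[1.0, 1.5], fake_logits=[-2.0, 0.5])
loss_g = g_hinge_loss(fake_logits=[-2.0, 0.5])
```

Note that the same features give different logits for different labels only through the `V[label]` row, whereas a concatenation-based discriminator would feed the label embedding through the same layers as the features.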

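Experimental Results

Research Questions

RQ1: Does a projection discriminator with inner-product conditioning improve conditional image generation quality over concatenation?
RQ2: Can the projection approach be extended effectively to super-resolution, and does it enable category morphing in the generator?
RQ3: How does the projection discriminator perform relative to AC-GANs and concatenation on a large-scale, many-class dataset?
RQ4: How does using projection instead of concatenation affect diversity and mode coverage, as measured by intra-class FID?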

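Key Findings

The projection-based discriminator achieves a higher Inception score on ImageNet than concatenation and AC-GANs (AC-GANs: 28.5 ± 0.20; concat: 21.1 ± 0.35; projection: 29.7 ± 0.61; projection at 850K iterations: 36.8 ± 0.44).
Projection achieves a lower intra-class FID than AC-GANs and concatenation (AC-GANs: 260.0; concat: 141.2; projection: 103.1; projection at 850K iterations: 92.4).
On CIFAR-10/100, the projection method also outperforms the other conditioning methods (detailed in Appendix A).
In super-resolution, projection scores higher on Inception score (35.2) and MS-SSIM (0.878) than the bicubic, bilinear, and concatenation baselines; a set of 10 seeds further raises the Inception score to 36.4.
Projection enables category morphing by interpolating the conditional batch normalization parameters, producing meaningful intermediate categories.
Compared with AC-GANs, the projection model avoids mode collapse and maintains diversity among generated samples.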
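
The category morphing finding can be illustrated with a small sketch of conditional batch normalization, in which each class owns its own scale (gamma) and shift (beta) and morphing blends the parameters of two classes. This is an illustration only: linear interpolation with a weight `t` is an assumption of this sketch, and the names `interpolate`, `conditional_batch_norm` and `morph` are hypothetical rather than taken from the paper's code.

```python
import math


def interpolate(params_a, params_b, t):
    """Linear blend of two parameter vectors; t=0 gives class a, t=1 gives class b."""
    return [(1.0 - t) * a + t * b for a, b in zip(params_a, params_b)]


def conditional_batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then apply the given scale and shift."""
    n = len(batch)
    dim = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dim)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n
                 for j in range(dim)]
    return [
        [gamma[j] * (row[j] - means[j]) / math.sqrt(variances[j] + eps) + beta[j]
         for j in range(dim)]
        for row in batch
    ]


def morph(batch, gammas, betas, class_a, class_b, t):
    """Apply batch norm with parameters blended between two classes."""
    gamma = interpolate(gammas[class_a], gammas[class_b], t)
    beta = interpolate(betas[class_a], betas[class_b], t)
    return conditional_batch_norm(batch, gamma, beta)


# Toy example: 2 samples, 2 features, 2 classes (values are made up).
batch = [[0.0, 2.0],
         [2.0, 6.0]]
gammas = [[1.0, 1.0], [3.0, 3.0]]   # per-class scale
betas = [[0.0, 0.0], [2.0, 2.0]]    # per-class shift

pure_class_a = morph(batch, gammas, betas, class_a=0, class_b=1, t=0.0)
halfway = morph(batch, gammas, betas, class_a=0, class_b=1, t=0.5)
```

In a full generator the same blended parameters would be used in every conditional batch normalization layer, so sweeping `t` from 0 to 1 moves the output from one class toward the other.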

This review was generated by AI and reviewed by a human editor.