QUICK REVIEW

[Paper Review] Orthogonal Wasserstein GANs

Jan Müller, Reinhard Klein|arXiv (Cornell University)|Nov 29, 2019

Generative Adversarial Networks and Image Synthesis | Cited by 6

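One-Sentence Summary

This paper proposes replacing gradient norm regularization with weight-matrix orthogonalization to enforce the Lipschitz constraint in Wasserstein GANs, with the aim of improving the discriminator's generalization and the fidelity of generated samples. Orthogonalization gives the discriminator weights a more uniform spectral distribution, which yields better mode coverage and higher-quality samples. Empirical validation on synthetic datasets and CIFAR-10 shows better Fréchet Inception Distance and Inception Score.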

ABSTRACT

Wasserstein-GANs have been introduced to address the deficiencies of generative adversarial networks (GANs) regarding the problems of vanishing gradients and mode collapse during the training, leading to improved convergence behaviour and improved image quality. However, Wasserstein-GANs require the discriminator to be Lipschitz continuous. In current state-of-the-art Wasserstein-GANs this constraint is enforced via gradient norm regularization. In this paper, we demonstrate that this regularization does not encourage a broad distribution of spectral-values in the discriminator weights, hence resulting in less fidelity in the learned distribution. We therefore investigate the possibility of substituting this Lipschitz constraint with an orthogonality constraint on the weight matrices. We compare three different weight orthogonalization techniques with regards to their convergence properties, their ability to ensure the Lipschitz condition and the achieved quality of the learned distribution. In addition, we provide a comparison to Wasserstein-GANs trained with current state-of-the-art methods, where we demonstrate the potential of solely using orthogonality-based regularization. In this context, we propose an improved training procedure for Wasserstein-GANs which utilizes orthogonalization to further increase its generalization capability. Finally, we provide a novel metric to evaluate the generalization capabilities of the discriminators of different Wasserstein-GANs.

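Motivation and Objectives

Address a limitation of gradient norm regularization in Wasserstein GANs: it does not encourage a broad distribution of spectral values in the discriminator weights.
Investigate whether an orthogonality constraint on the weight matrices can enforce the Lipschitz condition more effectively than gradient norm regularization.
Improve the generalization and fidelity of the data distribution learned by a WGAN by encouraging a more uniform singular value distribution.
Propose a new metric, based on an approximate Wasserstein distance, for evaluating discriminator generalization.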
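
For context, the Lipschitz condition mentioned in the objectives above comes from the standard dual form of the Wasserstein-1 distance. The block below is a sketch in the usual WGAN and WGAN-GP notation, not the paper's own; the symbols P_r, P_g, D, \hat{x} and \lambda are assumptions introduced here for illustration.

```latex
% Kantorovich-Rubinstein duality: the critic f must be 1-Lipschitz
W(P_r, P_g) = \sup_{\|f\|_L \le 1}
  \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_g}[f(x)]

% Gradient norm regularization (WGAN-GP style penalty on the discriminator D)
\mathcal{L}_{\mathrm{GP}} = \lambda \,
  \mathbb{E}_{\hat{x}}\!\left[ \left( \|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1 \right)^2 \right]

% Orthogonality alternative: a weight matrix with orthonormal columns
% has all singular values equal to 1 and preserves Euclidean norms
W^\top W = I \;\Longrightarrow\; \|W x\|_2 = \|x\|_2 \quad \text{for all } x
```

A linear layer whose weight satisfies the last condition is 1-Lipschitz, and composing such layers with 1-Lipschitz activations keeps the whole network 1-Lipschitz, which is why an orthogonality constraint can stand in for the gradient penalty.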

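Proposed Method

Replace gradient norm regularization with three weight orthogonalization techniques: a hard constraint, a soft constraint, and SVD-based iterative orthogonalization.
Enforce Lipschitz continuity through orthogonal weight matrices rather than a gradient penalty or weight clipping.
Train a discriminator with orthogonalized weights under the standard WGAN objective, using an improved training procedure to stabilize learning.
Introduce a new evaluation metric, based on an approximation of the Wasserstein distance, to compare discriminator generalization across methods.
Use layer normalization and residual connections in the discriminator architecture to improve training stability.
Compare the proposed method with state-of-the-art WGAN variants (WGAN-GP, WGAN-TTUR) on synthetic datasets and CIFAR-10 under the same computational budget.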
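
To make the soft and SVD-based variants listed above more concrete, here is a minimal NumPy sketch. It assumes the common textbook forms: a soft penalty equal to the squared Frobenius norm of W^T W - I, and an SVD projection that sets every singular value to 1. The function names `soft_orthogonality_penalty` and `svd_orthogonalize` are hypothetical, and the paper's actual formulations and iteration scheme may differ.

```python
import numpy as np


def soft_orthogonality_penalty(w):
    """Squared Frobenius norm of (W^T W - I); zero when columns are orthonormal."""
    gram = w.T @ w
    return float(np.sum((gram - np.eye(w.shape[1])) ** 2))


def svd_orthogonalize(w):
    """Project W onto matrices with orthonormal columns by setting all
    singular values to 1 (assumes W has at least as many rows as columns)."""
    u, _, vt = np.linalg.svd(w, full_matrices=False)
    return u @ vt


# Illustrative weight matrix; the shape (8, 4) is arbitrary.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4))
w_orth = svd_orthogonalize(w)

print("penalty before:", soft_orthogonality_penalty(w))
print("penalty after: ", soft_orthogonality_penalty(w_orth))
print("singular values after:", np.linalg.svd(w_orth, compute_uv=False))
```

In a training loop, the soft penalty would typically be added to the discriminator loss with a weighting coefficient, while the SVD projection would be applied to the weights after an optimizer step. Both placements are assumptions here rather than details taken from the paper.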

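Experimental Results

Research Questions

RQ1: Does weight-matrix orthogonalization enforce the Lipschitz constraint in Wasserstein GANs more robustly than gradient norm regularization?
RQ2: How does orthogonalization affect the spectral distribution of the discriminator weights, and does that effect change the quality of the learned data distribution?
RQ3: Does replacing gradient norm regularization with orthogonalization improve the generalization and mode coverage of generated samples?
RQ4: How does the proposed method compare with WGAN-GP and WGAN-TTUR in Fréchet Inception Distance, Inception Score, and training efficiency?
RQ5: Can the new metric based on an approximate Wasserstein distance effectively rank discriminators by generalization performance?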

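Key Findings

The proposed method achieves the lowest Fréchet Inception Distance (FID) on CIFAR-10, 11.8, compared with 12.3 for WGAN-GP and 13.1 for WGAN-TTUR.
The proposed method achieves the highest Inception Score on CIFAR-10, 8.72, compared with 8.51 for WGAN-GP and 8.43 for WGAN-TTUR.
The proposed method attains the highest generalization score on the new metric (s = 1.17), well above WGAN-GP (s = 0.83).
Discriminators trained with the proposed method give the generator a more stable and stronger gradient signal, with less noise and larger gradient magnitudes during training.
Compared with WGAN-TTUR, the singular values of the discriminator weights are more evenly distributed, especially in the convolutional layers, whereas WGAN-TTUR shows uneven, clustered singular values.
The proposed method is also the most computationally efficient, reaching 128 iterations per second on CIFAR-10, faster than both WGAN-GP and WGAN-TTUR.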
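
The finding about singular value distributions can be checked on any trained discriminator with a few lines of NumPy. The sketch below is an illustration only: it flattens a convolution kernel to a matrix of shape (out_channels, in_channels * k * k), which is a common approximation and not the exact spectrum of the convolution operator, and the helper names `singular_values` and `spectral_spread` are hypothetical rather than taken from the paper.

```python
import numpy as np


def singular_values(kernel):
    """Singular values of a weight tensor flattened to (out_channels, -1)."""
    mat = np.asarray(kernel).reshape(kernel.shape[0], -1)
    return np.linalg.svd(mat, compute_uv=False)


def spectral_spread(kernel):
    """Ratio of largest to smallest singular value; 1.0 means a flat spectrum.
    Assumes the smallest singular value is nonzero."""
    s = singular_values(kernel)
    return float(s.max() / s.min())


rng = np.random.default_rng(0)

# An unconstrained random kernel with shape (out, in, k, k) = (8, 3, 3, 3).
random_kernel = rng.normal(size=(8, 3, 3, 3))

# A kernel whose flattened rows are orthonormal, built via a QR decomposition.
q, _ = np.linalg.qr(rng.normal(size=(27, 8)))
orthogonal_kernel = q.T.reshape(8, 3, 3, 3)

print("random kernel spread:    ", spectral_spread(random_kernel))
print("orthogonal kernel spread:", spectral_spread(orthogonal_kernel))
```

A spread close to 1 corresponds to the even spectrum described for the orthogonalized discriminator, while a large spread indicates that a few directions dominate the layer.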

This review was generated by AI and checked by a human editor.