QUICK REVIEW

[Paper Review] cGANs with Projection Discriminator

Takeru Miyato, Masanori Koyama | arXiv (Cornell University) | Feb 15, 2018

Image Processing Techniques and Applications | References: 23 | Citations: 411

One-Line Summary

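This paper proposes a projection-based discriminator for conditional GANs (cGANs) that replaces simple concatenation with an inner-product interaction between the embedded label and the features. It achieves state-of-the-art results on ImageNet and improves results on a super-resolution task.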

ABSTRACT

We propose a novel, projection based way to incorporate the conditional information into the discriminator of GANs that respects the role of the conditional information in the underlining probabilistic model. This approach is in contrast with most frameworks of conditional GANs used in application today, which use the conditional information by concatenating the (embedded) conditional vector to the feature vectors. With this modification, we were able to significantly improve the quality of the class conditional image generation on ILSVRC2012 (ImageNet) 1000-class image dataset from the current state-of-the-art result, and we achieved this with a single pair of a discriminator and a generator. We were also able to extend the application to super-resolution and succeeded in producing highly discriminative super-resolution images. This new structure also enabled high quality category transformation based on parametric functional transformation of conditional batch normalization layers in the generator.

Motivation and Objectives

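Motivate a discriminator design that respects the probabilistic structure of the conditional information.
Propose a projection-based interaction between the conditional label and the feature representation.
Demonstrate improved quality in class-conditional image generation on ImageNet and in image super-resolution.
Show capabilities such as category morphing and compatibility with conditional batch normalization.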

Proposed Method

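Derive the projection discriminator form f(x, y; θ) = y^T V φ(x; θ_Φ) + ψ(φ(x; θ_Φ)) from the log-likelihood ratio of the underlying probabilistic model.
Instead of simply concatenating y to x or to the features, incorporate y through an inner product with the features via the embedding matrix V.
Train ResNet-based discriminator and generator networks with spectral normalization and the hinge loss.
Apply conditional batch normalization in the generator, which enables category morphing.
Evaluate class-conditional generation on ImageNet (1000 classes) and super-resolution, comparing against concatenation and AC-GANs.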
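The projection form and the hinge loss listed above can be illustrated with a small NumPy sketch. This is a toy illustration under stated assumptions, not the authors' implementation: the single-layer `phi`, the linear `psi`, the layer sizes, and every variable name are hypothetical stand-ins for the ResNet blocks used in the paper, and the hinge loss is written in its commonly used GAN form. With a one-hot label y, the term y^T V φ(x) reduces to picking the row of V for that class and taking its dot product with the features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
num_classes, feat_dim, in_dim = 5, 8, 16

# Hypothetical stand-in for the feature extractor phi (one linear layer + ReLU).
W_phi = rng.normal(size=(in_dim, feat_dim))
# Embedding matrix V: one feat_dim-sized row per class.
V = rng.normal(size=(num_classes, feat_dim))
# Hypothetical stand-in for psi: a linear map from features to a scalar.
w_psi = rng.normal(size=(feat_dim,))


def phi(x):
    """Toy feature extractor: (batch, in_dim) -> (batch, feat_dim)."""
    return np.maximum(x @ W_phi, 0.0)


def projection_logit(x, labels):
    """f(x, y) = y^T V phi(x) + psi(phi(x)) for a batch with integer labels."""
    h = phi(x)
    # With one-hot y, y^T V is simply the embedding row of the label.
    class_term = np.sum(V[labels] * h, axis=1)
    shared_term = h @ w_psi
    return class_term + shared_term


def d_hinge_loss(real_logits, fake_logits):
    """Discriminator hinge loss in its commonly used GAN form."""
    return (np.mean(np.maximum(0.0, 1.0 - real_logits))
            + np.mean(np.maximum(0.0, 1.0 + fake_logits)))


def g_hinge_loss(fake_logits):
    """Generator loss paired with the hinge discriminator loss."""
    return -np.mean(fake_logits)


x = rng.normal(size=(4, in_dim))
labels = np.array([0, 2, 2, 4])
logits = projection_logit(x, labels)
print(logits.shape)
```

By contrast, a concatenation-based discriminator would append the label embedding to `x` or to `phi(x)` and let later layers learn the interaction. In the projection form, the label enters only through the inner product.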

Experimental Results

Research Questions

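RQ1: Does the projection-based discriminator, which conditions via an inner product, improve conditional image generation quality compared with conditioning by concatenation?
RQ2: Does the projection approach extend effectively to super-resolution, and does it enable category morphing in the generator?
RQ3: How does the projection discriminator perform relative to AC-GANs and concatenation on a large-scale, many-class dataset?
RQ4: How does using projection instead of concatenation affect diversity and mode coverage, as measured by intra-class FID (intra FID)?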
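RQ4 relies on intra FID, so a rough sketch of how such a score could be computed may help. It assumes intra FID is the Fréchet distance between Gaussian fits of real and generated features, computed per class and then averaged. In practice the features come from an Inception network. Here they are arbitrary arrays, and all function names are hypothetical.

```python
import numpy as np


def gaussian_stats(features):
    """Mean vector and covariance matrix of a (samples, dim) feature array."""
    return features.mean(axis=0), np.cov(features, rowvar=False)


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussian fits of two feature sets."""
    mu_a, sigma_a = gaussian_stats(feats_a)
    mu_b, sigma_b = gaussian_stats(feats_b)
    # Tr((sigma_a sigma_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of the product, which are real and non-negative for
    # covariance matrices (up to numerical noise, hence real part and clip).
    eigvals = np.linalg.eigvals(sigma_a @ sigma_b)
    trace_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(sigma_a) + np.trace(sigma_b)
                 - 2.0 * trace_sqrt)


def intra_fid(real_feats, real_labels, fake_feats, fake_labels):
    """Average of the per-class Frechet distances."""
    scores = [
        frechet_distance(real_feats[real_labels == c],
                         fake_feats[fake_labels == c])
        for c in np.unique(real_labels)
    ]
    return float(np.mean(scores))


rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))
labels = np.repeat([0, 1], 100)
# Toy "generated" features: the real features shifted by a unit vector,
# so each class differs from its real counterpart only in the mean.
fake = real + np.array([1.0, 0.0, 0.0, 0.0])
score = intra_fid(real, labels, fake, labels)
print(score)
```

Because the score is computed within each class, a generator that produces only a few modes per class is penalized even when its samples are individually realistic.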

Key Findings

Method	Inception Score	Intra FID
AC-GANs	28.5 ± .20	260.0
concat	21.1 ± .35	141.2
projection	29.7 ± .61	103.1
projection (850K iterations)	36.8 ± .44	92.4

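The projection-based discriminator achieves a higher Inception Score on ImageNet than concatenation and AC-GANs (AC-GANs: 28.5 ± .20; concat: 21.1 ± .35; projection: 29.7 ± .61; projection at 850K iterations: 36.8 ± .44).
Projection achieves a lower intra FID than AC-GANs and concatenation (AC-GANs: 260.0; concat: 141.2; projection: 103.1; projection at 850K iterations: 92.4).
On CIFAR-10/100, the projection method also outperformed the other conditioning methods (details in Appendix A).
In super-resolution, projection attains higher Inception Accuracy (35.2) and MS-SSIM (0.878) than the bicubic, bilinear, and concatenation-based baselines; a 10-seed ensemble further raises Inception Accuracy to 36.4.
Projection enables category morphing by interpolating the conditional batch normalization parameters, producing meaningful intermediate classes.
Compared with AC-GANs, the projection model avoids mode collapse and maintains diversity among generated samples.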
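The category morphing finding above can be sketched as blending the per-class scale and shift parameters of a conditional batch normalization layer. This is a minimal sketch, not the paper's code: it assumes linear interpolation, and the layer sizes, parameter arrays, and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
num_classes, channels = 3, 4

# Per-class scale (gamma) and shift (beta), as in conditional batch norm.
gammas = rng.normal(loc=1.0, scale=0.1, size=(num_classes, channels))
betas = rng.normal(scale=0.1, size=(num_classes, channels))


def interpolate_params(class_a, class_b, t):
    """Linearly blend the parameters of two classes, with t in [0, 1]."""
    gamma = (1.0 - t) * gammas[class_a] + t * gammas[class_b]
    beta = (1.0 - t) * betas[class_a] + t * betas[class_b]
    return gamma, beta


def conditional_batch_norm(h, gamma, beta, eps=1e-5):
    """Normalize over the batch axis, then apply the given scale and shift."""
    mean = h.mean(axis=0, keepdims=True)
    var = h.var(axis=0, keepdims=True)
    return gamma * (h - mean) / np.sqrt(var + eps) + beta


# Toy activations from a generator layer: (batch, channels).
h = rng.normal(size=(8, channels))

# Sweep from class 0 to class 1 in five steps.
outputs = [
    conditional_batch_norm(h, *interpolate_params(0, 1, t))
    for t in np.linspace(0.0, 1.0, 5)
]
print(len(outputs), outputs[0].shape)
```

At t = 0 and t = 1 the layer behaves exactly as it does for the two original classes. The intermediate values of t correspond to the in-between categories described in the finding.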


This review was generated by AI and checked by a human editor.