QUICK REVIEW

[Paper Review] Diffusion-GAN: Training GANs with Diffusion

Zhendong Wang, Huangjie Zheng|arXiv (Cornell University)|Jun 5, 2022

Generative Adversarial Networks and Image Synthesis | Cited by 65

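One-Line Summary

Diffusion-GAN trains GANs by injecting Gaussian-mixture instance noise through an adaptive forward diffusion process, and uses a timestep-dependent discriminator to achieve stable training and improved image fidelity and diversity.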

ABSTRACT

Generative adversarial networks (GANs) are challenging to train stably, and a promising remedy of injecting instance noise into the discriminator input has not been very effective in practice. In this paper, we propose Diffusion-GAN, a novel GAN framework that leverages a forward diffusion chain to generate Gaussian-mixture distributed instance noise. Diffusion-GAN consists of three components, including an adaptive diffusion process, a diffusion timestep-dependent discriminator, and a generator. Both the observed and generated data are diffused by the same adaptive diffusion process. At each diffusion timestep, there is a different noise-to-data ratio and the timestep-dependent discriminator learns to distinguish the diffused real data from the diffused generated data. The generator learns from the discriminator's feedback by backpropagating through the forward diffusion chain, whose length is adaptively adjusted to balance the noise and data levels. We theoretically show that the discriminator's timestep-dependent strategy gives consistent and helpful guidance to the generator, enabling it to match the true data distribution. We demonstrate the advantages of Diffusion-GAN over strong GAN baselines on various datasets, showing that it can produce more realistic images with higher stability and data efficiency than state-of-the-art GANs.
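
The setup described in the abstract can be written compactly. The following is a sketch in standard diffusion notation, not a quotation from the paper; the symbols (the variance schedule \beta_t, the cumulative product \bar{\alpha}_t, the noise scale \sigma, the timestep weights \pi_t, and the parameters \theta and \phi) are not defined in this review and are assumptions here.

```latex
% Forward diffusion of a sample x at step t,
% with \bar{\alpha}_t = \prod_{s=1}^{t} (1 - \beta_s)
q(y \mid x, t) = \mathcal{N}\!\left(y;\ \sqrt{\bar{\alpha}_t}\, x,\ (1 - \bar{\alpha}_t)\,\sigma^2 I\right)

% Mixing over timesteps with weights \pi_t yields Gaussian-mixture instance noise
q(y \mid x) = \sum_{t=1}^{T} \pi_t\, q(y \mid x, t)

% Min-max objective with a timestep-dependent discriminator D_\phi(y, t)
\min_{G}\ \max_{D}\ 
  \mathbb{E}_{x \sim p(x),\ t \sim p_\pi,\ y \sim q(y \mid x, t)}
    \left[\log D_\phi(y, t)\right]
  + \mathbb{E}_{z \sim p(z),\ t \sim p_\pi,\ y_g \sim q(y \mid G_\theta(z), t)}
    \left[\log\left(1 - D_\phi(y_g, t)\right)\right]
```

Because y is an affine function of x plus independent Gaussian noise, the second expectation is differentiable with respect to the generator parameters, which is what lets the generator learn by backpropagating through the forward diffusion chain.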

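Motivation and Objectives

Address GAN training instability and mode collapse by introducing diffusion-based instance noise.
Propose a differentiable forward diffusion mechanism that supplies Gaussian-mixture noise to both real and generated data.
Develop a discriminator that operates across diffusion timesteps, together with an adaptive diffusion schedule that balances noise against data fidelity.
Analyze the training dynamics theoretically and show that the discriminator gives the generator consistent guidance.
Demonstrate empirically, across diverse datasets and settings, improved fidelity and diversity over strong GAN baselines.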

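Proposed Method

Use a forward diffusion chain to generate Gaussian-mixture noise for both real and generated data.
Define a diffusion-timestep-dependent discriminator D(y, t) that operates on the diffused data.
Update the generator G by backpropagating gradients from the discriminator through the forward diffusion chain.
Formulate a differentiable min-max objective, related to the Jensen-Shannon (JS) divergence, that aligns the distributions of diffused real and diffused generated data.
Adopt adaptive diffusion with a self-paced schedule that balances noise against data fidelity.
Provide a domain-agnostic augmentation and show a non-leaking augmentation guarantee.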
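
The forward diffusion step at the start of this list can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear variance schedule, its endpoint values, the uniform timestep sampling, and all function names (`linear_betas`, `alpha_bars`, `diffuse`, `sample_timestep`) are hypothetical choices made for the example.

```python
import math
import random


def linear_betas(T, beta_start=1e-4, beta_end=2e-2):
    """Linearly spaced variances beta_1..beta_T (assumed DDPM-style schedule)."""
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + i * step for i in range(T)]


def alpha_bars(betas):
    """Cumulative products alpha_bar_t = prod_{s<=t} (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def diffuse(x, t, abar, sigma=1.0, rng=random):
    """Sample y ~ N(sqrt(abar_t) * x, (1 - abar_t) * sigma^2 * I).

    x is a flat list of floats and t is 1-indexed.
    """
    a = abar[t - 1]
    scale = math.sqrt(a)
    noise_std = math.sqrt(1.0 - a) * sigma
    return [scale * xi + noise_std * rng.gauss(0.0, 1.0) for xi in x]


def sample_timestep(T, rng=random):
    """Draw t uniformly from {1, ..., T} (an assumed sampling rule).

    Mixing over t is what makes the injected noise a Gaussian mixture.
    """
    return rng.randint(1, T)


# Usage: diffuse a real sample and a stand-in for a generated sample
# with the same timestep, as both go through the same diffusion process.
rng = random.Random(0)
T = 64
abar = alpha_bars(linear_betas(T))
x_real = [0.5, -0.25, 1.0]
x_fake = [0.4, -0.30, 0.9]  # hypothetical placeholder for G(z)
t = sample_timestep(T, rng)
y_real = diffuse(x_real, t, abar, rng=rng)
y_fake = diffuse(x_fake, t, abar, rng=rng)
# A timestep-dependent discriminator would now score D(y_real, t) and D(y_fake, t).
```

Since `diffuse` is affine in `x` for a fixed noise draw, an autodiff framework can pass discriminator gradients straight through it to the generator; the list-based version here only shows the sampling arithmetic.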

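Experimental Results

Research Questions

RQ1: Can diffusion-based instance noise stabilize GAN training and prevent discriminator overfitting?
RQ2: Does Diffusion-GAN improve image fidelity and diversity (FID and Recall) over strong GAN baselines across diverse datasets?
RQ3: Is diffusion-based augmentation domain-agnostic and beneficial for data-efficient GAN training?
RQ4: Does the adaptive diffusion schedule provide reliable, gradient-rich guidance to the generator?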

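Key Findings

Diffusion-GAN improves stability and generation quality on multiple datasets over state-of-the-art baselines such as StyleGAN2, ProjectedGAN, and InsGen.
Compared with the baselines, diffusion-based augmentation raises Recall (diversity) and often also improves FID (fidelity), with data-efficiency benefits.
It is effective for both high-dimensional and low-dimensional data, including domain-agnostic feature spaces.
Adaptive diffusion (varying T and t) keeps the discriminator from being overwhelmed and preserves a useful learning signal.
Even with limited data, Diffusion-GAN achieves generation quality that exceeds strong data-efficient baselines.
Memory and training-time costs are comparable to those of the underlying GAN, and in some settings the overhead is reduced.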
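
The adaptive-diffusion finding above (varying T during training) can be illustrated with a small controller. This is a hedged sketch, not the paper's exact rule: it assumes the discriminator outputs probabilities in [0, 1], uses a sign-based overfitting indicator similar in spirit to adaptive discriminator augmentation, and every name and default value (`d_target`, `step`, `T_min`, `T_max`) is a hypothetical choice for the example.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def overfit_indicator(d_outputs_real):
    """Mean of sign(D(y, t) - 0.5) over diffused real samples.

    Values near 1 suggest the discriminator is very confident on real data.
    """
    return sum(sign(d - 0.5) for d in d_outputs_real) / len(d_outputs_real)


def update_T(T, r_d, d_target=0.6, step=8, T_min=8, T_max=1000):
    """Lengthen the chain when D looks overconfident, shorten it otherwise.

    All default values are illustrative assumptions.
    """
    T_new = T + sign(r_d - d_target) * step
    return max(T_min, min(T_max, T_new))


# Usage: a confident discriminator pushes T up (more noise),
# an uncertain one pulls T down (less noise).
T = 64
T_after_confident = update_T(T, overfit_indicator([0.90, 0.95, 0.80, 0.99]))
T_after_uncertain = update_T(T, overfit_indicator([0.55, 0.45, 0.60, 0.40]))
```

The design intent is a feedback loop: a longer chain makes the discrimination task harder, which counteracts overfitting, while a shorter chain keeps the diffused data close enough to the original data for the signal to stay informative.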

This review was generated by AI and checked by a human editor.