QUICK REVIEW

[Paper Digest] A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications

Jie Gui, Zhenan Sun|arXiv (Cornell University)|Jan 20, 2020

Generative Adversarial Networks and Image Synthesis | References: 392 | Cited by: 260

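One-Sentence Summary

A comprehensive review of generative adversarial networks (GANs) that details their algorithms, theory, variants, and applications, and clarifies the connections among models as well as the research directions that remain open.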

ABSTRACT

Generative adversarial networks (GANs) have recently become a hot research topic. GANs have been widely studied since 2014, and a large number of algorithms have been proposed. However, there are few comprehensive studies explaining the connections among different GAN variants and how they have evolved. In this paper, we attempt to provide a review of various GAN methods from the perspectives of algorithms, theory, and applications. Firstly, the motivations, mathematical representations, and structure of most GAN algorithms are introduced in detail. Furthermore, GANs have been combined with other machine learning algorithms for specific applications, such as semi-supervised learning, transfer learning, and reinforcement learning. This paper compares the commonalities and differences of these GAN methods. Secondly, theoretical issues related to GANs are investigated. Thirdly, typical applications of GANs in image processing and computer vision, natural language processing, music, speech and audio, the medical field, and data science are illustrated. Finally, the future open research problems for GANs are pointed out.

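Research Motivation and Objectives

Explain the motivation and structure of GANs and situate them within generative modeling.
Summarize the core objective functions and training dynamics, including the minimax, non-saturating, and maximum-likelihood views.
Catalog and relate representative GAN variants and training strategies.
Review the theoretical issues that connect GAN objectives to divergences and distances (KL, JS, f-divergences, IPMs) and the implications of those connections.
Present typical applications in image processing, natural language processing, music, medicine, and data science, and outline the problems that remain open.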
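
For reference, the minimax and non-saturating views mentioned above are commonly written as follows. This is a sketch in the standard notation of the original GAN formulation, which this digest itself does not define: $D$ is the discriminator, $G$ the generator, $p_{\text{data}}$ the data distribution, and $p_z$ the noise prior are all assumed symbols.

```latex
% Minimax value function of the original GAN
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% Non-saturating generator loss (the discriminator objective is unchanged)
J^{(G)} = -\,\mathbb{E}_{z \sim p_z}\bigl[\log D(G(z))\bigr]
```

In the minimax view the generator minimizes the second term of $V$; in the non-saturating view it instead maximizes $\log D(G(z))$, which changes the gradient it receives but not the discriminator's objective.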

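Proposed Approach

Describe the original GAN framework and its minimax objective.
Discuss alternative objective functions and their theoretical implications (e.g., JS/KL divergence, IPMs).
Present representative GAN variants (InfoGAN, cGAN, CycleGAN, f-GAN, WGAN, LS-GAN, etc.) and their training techniques.
Explain conditioning, auxiliary tasks, and multi-GAN/multi-discriminator architectures as extensions.
Review evaluation metrics, visualization tools, and connections to broader learning frameworks.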
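
One way to see why alternative generator objectives matter is to compare their gradients with respect to the discriminator's logit on a generated sample. The sketch below is illustrative only and is not taken from the paper: it assumes the discriminator output is `sigmoid(a)` for a logit `a`, and the function names are hypothetical.

```python
import math


def sigmoid(a):
    """Logistic function, used here as the assumed discriminator output D(G(z))."""
    return 1.0 / (1.0 + math.exp(-a))


def grad_minimax(a):
    """Gradient of the minimax generator loss log(1 - sigmoid(a)) w.r.t. the logit a.

    d/da log(1 - sigmoid(a)) = -sigmoid(a)
    """
    return -sigmoid(a)


def grad_non_saturating(a):
    """Gradient of the non-saturating generator loss -log(sigmoid(a)) w.r.t. the logit a.

    d/da [-log sigmoid(a)] = -(1 - sigmoid(a))
    """
    return -(1.0 - sigmoid(a))


if __name__ == "__main__":
    # Very negative logits mean the discriminator confidently rejects the sample.
    for a in (-6.0, -2.0, 0.0, 2.0):
        print(f"logit={a:+.1f}  minimax={grad_minimax(a):+.4f}  "
              f"non_saturating={grad_non_saturating(a):+.4f}")
```

When the discriminator confidently rejects generated samples (large negative logit), the minimax gradient shrinks toward zero while the non-saturating gradient stays close to -1, which is the usual argument for the non-saturating loss early in training.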

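Experimental Results

Research Questions

RQ1: From the algorithmic and theoretical perspectives, what are the connections and differences among the major GAN variants?
RQ2: How do different divergences and distance measures (e.g., JS, KL, f-divergences, and IPMs such as the Wasserstein distance used in WGAN) affect the stability and quality of GAN training?
RQ3: What are the main applications of GANs across domains, and which problems remain unsolved?
RQ4: How do conditioning, cycle consistency, and auxiliary losses affect generation quality and mode coverage?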
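
RQ2 can be made concrete with a small numerical sketch. The example below is an illustration under simple assumptions, not a result from the paper: it compares the JS divergence and the Wasserstein-1 distance between two point masses on a unit-spaced integer grid. The helper names (`kl`, `js`, `wasserstein1`, `point_mass`) are hypothetical.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions given as lists.

    Terms with p_i = 0 contribute 0; q_i must be positive wherever p_i is.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js(p, q):
    """Jensen-Shannon divergence (natural log), bounded above by log 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def wasserstein1(p, q):
    """Wasserstein-1 distance on a unit-spaced 1-D grid: sum of |CDF_p - CDF_q|."""
    total, cdf_p, cdf_q = 0.0, 0.0, 0.0
    for pi, qi in zip(p, q):
        cdf_p += pi
        cdf_q += qi
        total += abs(cdf_p - cdf_q)
    return total


def point_mass(position, size=10):
    """Distribution that puts all probability on a single grid position."""
    return [1.0 if i == position else 0.0 for i in range(size)]


if __name__ == "__main__":
    real = point_mass(0)
    for shift in (1, 3, 6):
        fake = point_mass(shift)
        print(f"shift={shift}  JS={js(real, fake):.4f}  W1={wasserstein1(real, fake):.1f}")
```

For distributions with disjoint support, the JS divergence stays at log 2 regardless of how far apart they are, whereas the Wasserstein-1 distance grows with the shift. This is the standard intuition for why Wasserstein-based losses can give a more informative training signal.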

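Key Findings

GAN training can be viewed as a minimax game in which the discriminator guides the generator toward the real data distribution.
The original GAN objective is related to the JS and KL divergences: with an optimal discriminator, minimizing it amounts to minimizing the JS divergence between the data and generator distributions, which ties GANs to established statistical distances.
The non-saturating and maximum-likelihood interpretations offer different trade-offs in gradient behavior and training stability.
Numerous GAN variants modify the architecture or the loss function to address training stability, mode collapse, conditioning, and unpaired data.
Wasserstein-based methods (WGAN, WGAN-GP) offer more stable training and more meaningful loss curves.
GANs are widely applied in image processing, NLP, music, speech, medicine, and data science, with a number of specialized derivatives targeting high-resolution synthesis, translation, and domain adaptation.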
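
The link between the original objective and the JS divergence noted above can be checked numerically on a toy example. The sketch below assumes small discrete distributions with strictly positive mass everywhere; the distributions and function names are hypothetical and chosen only for illustration. It evaluates the value function at the optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_g(x)) and compares it with 2 * JS(p_data, p_g) - log 4.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js(p, q):
    """Jensen-Shannon divergence with natural logarithm."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def optimal_discriminator(p_data, p_g):
    """D*(x) = p_data(x) / (p_data(x) + p_g(x)); assumes the denominator is positive."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]


def value_function(d, p_data, p_g):
    """V(D, G) = E_{x~p_data}[log D(x)] + E_{x~p_g}[log(1 - D(x))]."""
    real_term = sum(pd * math.log(di) for pd, di in zip(p_data, d) if pd > 0)
    fake_term = sum(pg * math.log(1 - di) for pg, di in zip(p_g, d) if pg > 0)
    return real_term + fake_term


# Hypothetical toy distributions over three outcomes.
p_data = [0.5, 0.3, 0.2]
p_g = [0.2, 0.2, 0.6]

d_star = optimal_discriminator(p_data, p_g)
v_star = value_function(d_star, p_data, p_g)
js_form = 2 * js(p_data, p_g) - math.log(4)

if __name__ == "__main__":
    print(f"V(D*, G)        = {v_star:.6f}")
    print(f"2*JS - log(4)   = {js_form:.6f}")
```

The two printed values agree up to floating-point error, and any other discriminator (for example one that outputs 0.5 everywhere) yields a value no larger than `v_star`.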

This digest was generated by AI and reviewed by human editors.