QUICK REVIEW

[Paper Review] Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

Zhengwei Wang, Qi She|arXiv (Cornell University)|Jun 4, 2019
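Generative Adversarial Networks and Image Synthesis|References: 154|Cited by: 275
One-Sentence Summary

This paper surveys GAN variants used in computer vision, groups them into an architecture-variant family and a loss-variant family, and analyzes their progress toward high-quality image generation, more diverse image generation, and stable training.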

ABSTRACT

Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably their most significant impact has been in the area of computer vision where great advances have been made in challenges such as plausible image generation, image-to-image translation, facial attribute manipulation and similar domains. Despite the significant successes achieved to date, applying GANs to real-world problems still poses significant challenges, three of which we focus on here. These are: (1) the generation of high quality images, (2) diversity of image generation, and (3) stable training. Focusing on the degree to which popular GAN technologies have made progress against these challenges, we provide a detailed review of the state of the art in GAN-related research in the published scientific literature. We further structure this review through a convenient taxonomy we have adopted based on variations in GAN architectures and loss functions. While several reviews for GANs have been presented to date, none have considered the status of this field based on their progress towards addressing practical challenges relevant to computer vision. Accordingly, we review and critically discuss the most popular architecture-variant, and loss-variant GANs, for tackling these challenges. Our objective is to provide an overview as well as a critical analysis of the status of GAN research in terms of relevant progress towards important computer vision application requirements. As we do this we also discuss the most compelling applications in computer vision in which GANs have demonstrated considerable success along with some suggestions for future research directions. Code related to GAN-variants studied in this work is summarized on https://github.com/sheqi/GAN_Review.

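Motivation and Objectives

  • Assess how far GANs have progressed on high-quality image generation, image diversity, and stable training in computer vision.
  • Provide a taxonomy of GAN variants based on architectural changes and loss-function design.
  • Critically analyze architecture-variant and loss-variant GANs and their suitability for real-world computer vision applications.
  • Summarize notable applications in computer vision and discuss future research directions for GANs.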

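Proposed Approach

  • Divide GAN variants into two broad categories: architecture variants and loss variants.
  • Within architecture variants, organize the methods by network architecture, latent space, and application focus.
  • Within loss variants, classify the methods by loss type (based on an integral probability metric, IPM, or not) and by regularization.
  • Review and compare representative GANs (e.g., CGAN, InfoGAN, AC-GAN, LAPGAN, DCGAN, PROGAN, SAGAN, BigGAN) in terms of image quality, diversity, and training stability.
  • Discuss evaluation metrics and offer guidance on choosing a GAN variant for a specific vision task.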
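To make the IPM / non-IPM split concrete, here is a minimal NumPy sketch of one loss from each side. It assumes the original GAN discriminator loss (binary cross-entropy) as the non-IPM example and the Wasserstein GAN critic loss as the IPM-based example; WGAN is not named in the summary above and is used here only as a familiar illustration. The function names and the convention that the discriminator outputs raw logits are choices made for this sketch, not something taken from the paper.

```python
import numpy as np


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return np.logaddexp(0.0, x)


def gan_discriminator_loss(real_logits, fake_logits):
    """Non-IPM example: original GAN discriminator loss.

    Computes -E[log D(x)] - E[log(1 - D(G(z)))] with D = sigmoid(logit),
    using log(sigmoid(a)) = -softplus(-a) and log(1 - sigmoid(a)) = -softplus(a).
    """
    real_logits = np.asarray(real_logits, dtype=float)
    fake_logits = np.asarray(fake_logits, dtype=float)
    return float(np.mean(softplus(-real_logits)) + np.mean(softplus(fake_logits)))


def wgan_critic_loss(real_scores, fake_scores):
    """IPM-based example: Wasserstein critic loss E[f(G(z))] - E[f(x)].

    The critic f outputs unbounded scores, not probabilities. The loss is
    only meaningful if f is kept (approximately) 1-Lipschitz, which this
    sketch does not enforce.
    """
    real_scores = np.asarray(real_scores, dtype=float)
    fake_scores = np.asarray(fake_scores, dtype=float)
    return float(np.mean(fake_scores) - np.mean(real_scores))


if __name__ == "__main__":
    # Hypothetical discriminator/critic outputs for a tiny batch.
    real = np.array([2.0, 1.5, 3.0])
    fake = np.array([-1.0, 0.5, -2.0])
    print("GAN discriminator loss:", gan_discriminator_loss(real, fake))
    print("WGAN critic loss:", wgan_critic_loss(real, fake))
```

The Lipschitz requirement on the critic is where regularization enters in practice: weight clipping or a gradient penalty is typically added on top of the critic loss, and neither is shown here.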

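Experimental Results

Research Questions

  • RQ1: Which main architectural and loss-function directions have improved GAN performance in computer vision?
  • RQ2: How do architecture-variant and loss-variant GANs differ in image quality, diversity, and training stability?
  • RQ3: Which GANs are most effective for high-resolution image generation and for producing diverse outputs in vision tasks?
  • RQ4: Which future research directions could address the real-world challenges GANs face in computer vision?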

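Key Findings

  • GAN progress is analyzed against three core challenges: high-quality image generation, diversity of generation, and stable training.
  • A two-way taxonomy is proposed, splitting the field into architecture-variant GANs and loss-variant GANs, with detailed subcategories for each.
  • Architecture variants cover changes to the network architecture, modifications to the latent space, and application-oriented designs (e.g., PROGAN, CGAN, LAPGAN, SAGAN, BigGAN).
  • Loss variants cover loss-function design (IPM-based and non-IPM-based) and regularization techniques for stabilizing training.
  • The survey discusses practical applications in computer vision and critically analyzes the strengths and limitations of the main GAN families.
  • Evaluation metrics such as Inception Score and Fréchet Inception Distance (FID) are discussed when comparing variants.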
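As a rough illustration of the two metrics named above, the following NumPy sketch computes an Inception Score from a matrix of class probabilities and a Fréchet distance between two Gaussians. It assumes the class probabilities and the feature means and covariances have already been extracted with a pretrained Inception network, a step that is not shown. The function names are hypothetical, and the trace of the matrix square root is obtained from the eigenvalues of the covariance product, which is a choice made for this sketch and not necessarily how reference implementations do it.

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """Inception Score: exp(E_x[KL(p(y|x) || p(y))]).

    probs: array of shape (n_samples, n_classes); each row is p(y|x)
    for one generated image and is assumed to sum to 1.
    """
    probs = np.asarray(probs, dtype=float)
    marginal = probs.mean(axis=0, keepdims=True)  # p(y)
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    return float(np.exp(kl.mean()))


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2).

    ||mu1 - mu2||^2 + Tr(sigma1) + Tr(sigma2) - 2 Tr((sigma1 sigma2)^(1/2)).
    For symmetric positive semi-definite covariances the product has real,
    non-negative eigenvalues, so the last trace is the sum of their square roots.
    """
    mu1 = np.asarray(mu1, dtype=float)
    mu2 = np.asarray(mu2, dtype=float)
    sigma1 = np.asarray(sigma1, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    diff = mu1 - mu2
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    # Clip tiny negative or imaginary parts caused by floating-point error.
    trace_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-ins for Inception features of real and generated images.
    real_feats = rng.normal(0.0, 1.0, size=(500, 4))
    fake_feats = rng.normal(0.5, 1.2, size=(500, 4))
    fid_like = frechet_distance(
        real_feats.mean(axis=0), np.cov(real_feats, rowvar=False),
        fake_feats.mean(axis=0), np.cov(fake_feats, rowvar=False),
    )
    print("Fréchet distance on toy features:", fid_like)
```

A higher Inception Score and a lower Fréchet distance are the usual reading of "better". The numbers printed from random toy features have no meaning beyond showing that the functions run.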


This review was generated by AI and checked by a human editor.