QUICK REVIEW

[Paper Review] Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

Zhengwei Wang, Qi She|arXiv (Cornell University)|2019. 06. 04.

Generative Adversarial Networks and Image Synthesis | References: 154 | Citations: 275

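One-Line Summary

This paper surveys GAN variants applied to computer vision, organizes them into architecture-variant and loss-variant families, and analyzes their progress toward high-quality image generation, diverse image generation, and stable training.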

ABSTRACT

Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably their most significant impact has been in the area of computer vision where great advances have been made in challenges such as plausible image generation, image-to-image translation, facial attribute manipulation and similar domains. Despite the significant successes achieved to date, applying GANs to real-world problems still poses significant challenges, three of which we focus on here. These are: (1) the generation of high quality images, (2) diversity of image generation, and (3) stable training. Focusing on the degree to which popular GAN technologies have made progress against these challenges, we provide a detailed review of the state of the art in GAN-related research in the published scientific literature. We further structure this review through a convenient taxonomy we have adopted based on variations in GAN architectures and loss functions. While several reviews for GANs have been presented to date, none have considered the status of this field based on their progress towards addressing practical challenges relevant to computer vision. Accordingly, we review and critically discuss the most popular architecture-variant, and loss-variant GANs, for tackling these challenges. Our objective is to provide an overview as well as a critical analysis of the status of GAN research in terms of relevant progress towards important computer vision application requirements. As we do this we also discuss the most compelling applications in computer vision in which GANs have demonstrated considerable success along with some suggestions for future research directions. Code related to GAN-variants studied in this work is summarized on https://github.com/sheqi/GAN_Review.

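Motivation and Objectives

Assess how far GANs in computer vision have progressed on three challenges: high-quality image generation, image diversity, and stable training.
Provide a taxonomy of GAN variants based on architectural changes and loss-function design.
Critically analyze architecture-variant and loss-variant GANs and evaluate their suitability for practical computer vision applications.
Summarize notable applications and discuss future research directions for GANs in computer vision.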

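Proposed Method

GAN variants are divided into two main groups: architecture-variant and loss-variant.
Within the architecture-variant group, variants are organized by network architecture, latent space, and application focus.
Within the loss-variant group, variants are classified by loss type (IPM-based versus non-IPM-based) and by regularization.
Representative GANs (CGAN, InfoGAN, AC-GAN, LAPGAN, DCGAN, PROGAN, SAGAN, BigGAN) are evaluated and compared in terms of image quality, diversity, and training stability.
Evaluation metrics are discussed, and guidance is given for choosing a GAN variant suited to a specific vision task.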
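
To make the IPM-based versus non-IPM-based split above more concrete, the following minimal Python sketch contrasts one well-known loss from each family: the original minimax discriminator loss (non-IPM) and the Wasserstein critic loss (IPM-based). The function names and the use of plain lists of discriminator outputs are assumptions made for illustration; this is not code from the survey or its repository.

```python
import math


def minimax_discriminator_loss(d_real, d_fake):
    """Non-IPM example: the original GAN discriminator loss.

    d_real and d_fake are discriminator outputs in (0, 1), read as the
    probability that a sample is real. Returns
    -(mean(log D(x)) + mean(log(1 - D(G(z))))).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def wasserstein_critic_loss(c_real, c_fake):
    """IPM example: the WGAN critic loss.

    c_real and c_fake are unbounded critic scores. The critic minimizes
    mean(f(G(z))) - mean(f(x)). The Lipschitz constraint (weight clipping
    or a gradient penalty) is not modeled in this sketch.
    """
    return sum(c_fake) / len(c_fake) - sum(c_real) / len(c_real)


# Hypothetical discriminator and critic outputs, for illustration only.
print(minimax_discriminator_loss([0.9, 0.8], [0.2, 0.1]))
print(wasserstein_critic_loss([1.0, 3.0], [0.0, 2.0]))
```

A discriminator that cannot separate real from generated samples (all outputs equal to 0.5) yields a minimax loss of 2 log 2, whereas the critic loss has no such fixed reference value because critic scores are unbounded.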

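Experimental Results

Research Questions

RQ1: What are the main architectural and loss-function directions that have improved GAN performance in computer vision?
RQ2: How do architecture-variant and loss-variant GANs compare in terms of image quality, diversity, and training stability?
RQ3: Which GANs are most effective for high-resolution image generation and diverse outputs?
RQ4: Which future research directions could address the practical problems that remain in computer vision?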

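Key Findings

GAN progress is analyzed through three core challenges: high-quality image generation, diversity of generation, and training stability.
A two-axis taxonomy is proposed, consisting of architecture-variant GANs and loss-variant GANs, each with detailed subcategories.
Architecture-variants cover changes to the network architecture, changes to the latent space, and application-focused designs (e.g., PROGAN, CGAN, LAPGAN, SAGAN, BigGAN).
Loss-variants cover loss-function design (IPM-based versus non-IPM-based) and regularization techniques for stabilizing training.
The survey addresses practical computer vision applications and offers a critical analysis of the strengths and limitations of the major GAN families.
Evaluation metrics such as Inception Score and FID are discussed in the context of comparing variants.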
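
As a rough illustration of the two metrics named above, the sketch below computes an Inception Score from class-probability rows and a Fréchet distance between two Gaussians. It rests on two loudly stated assumptions: the class probabilities are supplied directly rather than produced by a pretrained Inception network, and the Fréchet distance uses diagonal covariances only, which is a simplification of the full-covariance FID used in practice. All function names are hypothetical.

```python
import math


def inception_score(class_probs):
    """Inception Score: exp of the mean KL(p(y|x) || p(y)) over samples.

    class_probs holds one row per generated image, each row being a
    probability distribution p(y|x) over classes. Assumption: in practice
    these rows come from a pretrained Inception classifier.
    """
    n = len(class_probs)
    k = len(class_probs[0])
    # Marginal class distribution p(y), averaged over all samples.
    marginal = [sum(row[j] for row in class_probs) / n for j in range(k)]
    total_kl = 0.0
    for row in class_probs:
        for p, m in zip(row, marginal):
            if p > 0.0:  # the term 0 * log(0) is treated as 0
                total_kl += p * math.log(p / m)
    return math.exp(total_kl / n)


def frechet_distance_diagonal(mu1, var1, mu2, var2):
    """Squared Frechet distance between two diagonal-covariance Gaussians.

    Simplification: the full FID uses complete covariance matrices and a
    matrix square root; with diagonal covariances the trace term reduces
    to a per-dimension sum of v1 + v2 - 2 * sqrt(v1 * v2).
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    cov_term = sum(
        v1 + v2 - 2.0 * math.sqrt(v1 * v2) for v1, v2 in zip(var1, var2)
    )
    return mean_term + cov_term


# Hypothetical inputs, for illustration only.
print(inception_score([[0.9, 0.1], [0.2, 0.8]]))
print(frechet_distance_diagonal([0.0], [1.0], [3.0], [4.0]))
```

In this sketch, identical predictions for every image give the minimum score of 1, confident predictions spread evenly over K classes approach a score of K, and two identical Gaussians have a Fréchet distance of 0. Higher Inception Score and lower FID are the usual directions of improvement.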

This review was generated by AI and reviewed by a human editor.