QUICK REVIEW

[Paper Review] cGANs with Projection Discriminator

Takeru Miyato, Masanori Koyama|arXiv (Cornell University)|2018. 02. 15.

Image Processing Techniques and Applications | References: 23 | Citations: 411

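One-Line Summary

This paper introduces a projection-based discriminator for conditional GANs (cGANs). It replaces simple concatenation with an inner-product interaction between the embedded label and the features, and reports state-of-the-art class-conditional generation on ImageNet along with improved super-resolution results.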

ABSTRACT

We propose a novel, projection based way to incorporate the conditional information into the discriminator of GANs that respects the role of the conditional information in the underlining probabilistic model. This approach is in contrast with most frameworks of conditional GANs used in application today, which use the conditional information by concatenating the (embedded) conditional vector to the feature vectors. With this modification, we were able to significantly improve the quality of the class conditional image generation on ILSVRC2012 (ImageNet) 1000-class image dataset from the current state-of-the-art result, and we achieved this with a single pair of a discriminator and a generator. We were also able to extend the application to super-resolution and succeeded in producing highly discriminative super-resolution images. This new structure also enabled high quality category transformation based on parametric functional transformation of conditional batch normalization layers in the generator.

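Motivation and Objectives

Motivates a discriminator design that respects the probabilistic structure of the conditional information.
Proposes a projection-based interaction between the condition label and the feature representation.
Demonstrates improved quality in class-conditional image generation on ImageNet and in image super-resolution.
Showcases capabilities such as category morphing and compatibility with conditional batch normalization.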
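
The "probabilistic structure" argument can be sketched as a short derivation. The notation below is an assumption for illustration: q is the data distribution, p is the generator distribution, φ(x) is a shared feature map, and both class posteriors are taken to be log-linear categorical models with class vectors v_c and log-partition terms log Z.

```latex
\begin{align*}
% Optimal discriminator logit under the standard adversarial loss
f^{*}(x, y) &= \log\frac{q(x \mid y)\, q(y)}{p(x \mid y)\, p(y)}
             = \log\frac{q(y \mid x)}{p(y \mid x)} + \log\frac{q(x)}{p(x)} \\
% Assumed log-linear class posteriors (same form for q)
\log p(y = c \mid x) &= (v_c^{p})^{\top}\phi(x) - \log Z^{p}(\phi(x)) \\
% The class-dependent part of the ratio becomes linear in \phi(x)
\log\frac{q(y = c \mid x)}{p(y = c \mid x)}
  &= (v_c^{q} - v_c^{p})^{\top}\phi(x)
     - \bigl(\log Z^{q}(\phi(x)) - \log Z^{p}(\phi(x))\bigr) \\
% Setting v_c = v_c^{q} - v_c^{p} and absorbing every label-free term into \psi
f(x, y = c) &= v_c^{\top}\phi(x) + \psi(\phi(x))
\end{align*}
```

With a one-hot label y and a matrix V whose rows are the v_c, the first term is y^T V φ(x), which is the inner-product (projection) term.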

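Proposed Method

Derives the projection discriminator form f(x, y; θ) = y^T V φ(x; θ_Φ) + ψ(φ(x; θ_Φ)) from the probabilistic log-likelihood ratio.
Instead of simply concatenating y to x or to the features, the label enters through an inner product with the embedding matrix V.
Uses a ResNet-based discriminator and generator, trained with spectral normalization and the hinge loss.
Applies conditional batch normalization in the generator, which enables category morphing.
Evaluates class-conditional generation on ImageNet (1000 classes) and super-resolution, comparing against concatenation and AC-GAN baselines.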
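
The discriminator output and the hinge loss described above can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the feature vector `phi_x` stands in for the output of a ResNet body, ψ is assumed to be a single linear layer (`psi_w`, `psi_b`), and all function names and toy values are hypothetical.

```python
# Minimal sketch of a projection discriminator head (no deep learning framework).
# Assumption: phi_x is the feature vector phi(x) that a ResNet body would produce.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def projection_logit(phi_x, label, V, psi_w, psi_b):
    """f(x, y) = y^T V phi(x) + psi(phi(x)) for a one-hot label y.

    V holds one embedding row per class, so y^T V simply selects row `label`.
    psi is assumed to be linear here: psi(h) = psi_w . h + psi_b.
    """
    class_term = dot(V[label], phi_x)               # y^T V phi(x)
    unconditional_term = dot(psi_w, phi_x) + psi_b  # psi(phi(x))
    return class_term + unconditional_term

def hinge_d_loss(real_logits, fake_logits):
    """Discriminator hinge loss: mean max(0, 1 - f(real)) + mean max(0, 1 + f(fake))."""
    real = sum(max(0.0, 1.0 - f) for f in real_logits) / len(real_logits)
    fake = sum(max(0.0, 1.0 + f) for f in fake_logits) / len(fake_logits)
    return real + fake

def hinge_g_loss(fake_logits):
    """Generator loss paired with the hinge loss: minus the mean logit on fakes."""
    return -sum(fake_logits) / len(fake_logits)

# Toy values (hypothetical): 2-dimensional features, 3 classes.
phi_x = [1.0, 2.0]
V = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]
psi_w = [0.5, 0.5]
psi_b = 0.0

logits = [projection_logit(phi_x, c, V, psi_w, psi_b) for c in range(3)]
d_loss = hinge_d_loss([2.5, 0.5], [-2.0, 0.0])
g_loss = hinge_g_loss([-2.0, 0.0])
```

Note that the label only changes the logit through the selected row of V; the ψ term is shared by all classes. A concatenation-based discriminator would instead append an embedding of y to the input or to an intermediate feature map and pass the result through further layers.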

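Experimental Results

Research Questions

RQ1: Does a projection-based discriminator with inner-product conditioning improve conditional image generation quality compared with concatenation?
RQ2: Can the projection approach be extended effectively to super-resolution, and can it enable category morphing within the generator?
RQ3: How does the projection discriminator perform against AC-GANs and concatenation on a large-scale, many-class dataset?
RQ4: How does using projection rather than concatenation affect diversity and mode coverage (measured by intra-class FID)?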
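
For reference, intra-class FID is usually understood as the Fréchet Inception Distance computed separately for each class and then averaged. The formula below is the standard FID definition, stated under the assumption that μ and Σ denote the mean and covariance of Inception features for real (r) and generated (g) samples of class c, and that C is the number of classes; the exact evaluation protocol used in the paper is not restated here.

```latex
\begin{align*}
\mathrm{FID}(c) &= \lVert \mu_c^{r} - \mu_c^{g} \rVert_2^{2}
  + \operatorname{Tr}\!\Bigl(\Sigma_c^{r} + \Sigma_c^{g}
  - 2\bigl(\Sigma_c^{r}\,\Sigma_c^{g}\bigr)^{1/2}\Bigr) \\
\text{intra-class FID} &= \frac{1}{C}\sum_{c=1}^{C} \mathrm{FID}(c)
\end{align*}
```

Lower values indicate that the generated samples of each class are closer to the real samples of that class, so the metric penalizes both poor quality and missing within-class modes.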

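Key Findings

The projection-based discriminator achieves a higher Inception Score on ImageNet than concatenation and AC-GANs (AC-GANs: 28.5 ± .20; concat: 21.1 ± .35; projection: 29.7 ± .61; projection at 850K iterations: 36.8 ± .44).
Projection attains a lower intra-class FID than AC-GANs and concatenation (AC-GANs: 260.0; concat: 141.2; projection: 103.1; projection at 850K iterations: 92.4).
On CIFAR-10/100, the projection method outperformed the alternative conditioning schemes (details in Appendix A).
In super-resolution, projection yields higher Inception Accuracy (35.2) and MS-SSIM (0.878) than the bicubic, bilinear, and concatenation-based baselines; a 10-seed ensemble raises Inception Accuracy further to 36.4.
Projection enables category morphing through interpolation of the conditional batch normalization parameters, producing meaningful intermediate classes.
Compared with AC-GANs, the projection model avoids mode collapse and maintains diversity across generated samples.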
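
The category-morphing finding rests on interpolating class-specific conditional batch normalization (CBN) parameters. The following is a minimal sketch of that idea in plain Python, not the paper's code: the parameter table `cbn_params`, the class names, and the toy feature batch are all hypothetical, and a real generator would apply this inside every CBN layer with the latent vector held fixed.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between two equal-length parameter vectors."""
    return [(1.0 - t) * ai + t * bi for ai, bi in zip(a, b)]

def conditional_batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each channel over the batch, then apply the given scale and shift."""
    n = len(batch)
    channels = len(batch[0])
    mean = [sum(row[j] for row in batch) / n for j in range(channels)]
    var = [sum((row[j] - mean[j]) ** 2 for row in batch) / n for j in range(channels)]
    return [
        [gamma[j] * (row[j] - mean[j]) / math.sqrt(var[j] + eps) + beta[j]
         for j in range(channels)]
        for row in batch
    ]

# Hypothetical per-class CBN parameters (one scale and shift per channel).
cbn_params = {
    "class_a": {"gamma": [1.0, 1.0], "beta": [0.0, 0.0]},
    "class_b": {"gamma": [3.0, 0.0], "beta": [2.0, -2.0]},
}

def morphed_params(name_a, name_b, t):
    """Blend the CBN parameters of two classes; t=0 gives class a, t=1 gives class b."""
    a, b = cbn_params[name_a], cbn_params[name_b]
    return lerp(a["gamma"], b["gamma"], t), lerp(a["beta"], b["beta"], t)

# Toy batch of two feature vectors with two channels each.
features = [[0.0, 1.0], [2.0, 3.0]]
gamma_mid, beta_mid = morphed_params("class_a", "class_b", 0.5)
morphed = conditional_batch_norm(features, gamma_mid, beta_mid)
```

Sweeping `t` from 0 to 1 moves the scale and shift smoothly from one class to the other, which is the mechanism behind the intermediate classes reported above.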

This review was generated by AI and reviewed by a human editor.