QUICK REVIEW

[Paper Review] Differentiable Augmentation for Data-Efficient GAN Training

Shengyu Zhao, Zhijian Liu | arXiv (Cornell University) | 2020-06-18

Generative Adversarial Networks and Image Synthesis | References: 52 | Citations: 100

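ONE-LINE SUMMARY

DiffAugment applies differentiable augmentations to both real and generated samples during GAN training, improving data efficiency and convergence and delivering strong results across a range of architectures and datasets.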

ABSTRACT

The performance of generative adversarial networks (GANs) heavily deteriorates given a limited amount of training data. This is mainly because the discriminator is memorizing the exact training set. To combat it, we propose Differentiable Augmentation (DiffAugment), a simple method that improves the data efficiency of GANs by imposing various types of differentiable augmentations on both real and fake samples. Previous attempts to directly augment the training data manipulate the distribution of real images, yielding little benefit; DiffAugment enables us to adopt the differentiable augmentation for the generated samples, effectively stabilizes training, and leads to better convergence. Experiments demonstrate consistent gains of our method over a variety of GAN architectures and loss functions for both unconditional and class-conditional generation. With DiffAugment, we achieve a state-of-the-art FID of 6.80 with an IS of 100.8 on ImageNet 128x128 and 2-4x reductions of FID given 1,000 images on FFHQ and LSUN. Furthermore, with only 20% training data, we can match the top performance on CIFAR-10 and CIFAR-100. Finally, our method can generate high-fidelity images using only 100 images without pre-training, while being on par with existing transfer learning algorithms. Code is available at https://github.com/mit-han-lab/data-efficient-gans.

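MOTIVATION AND OBJECTIVES

Improve the data efficiency of GANs when training data is scarce.
Prevent discriminator overfitting without distorting the target data distribution.
Stabilize training by letting gradients back-propagate through the augmentation to the generator.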

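PROPOSED METHOD

Apply the same differentiable augmentation T to both real and fake samples during the D and G updates.
Use simple augmentations (Translation, Cutout, and Color) and study their combinations.
Keep T differentiable so that gradients can back-propagate to G (Figure 4).
Show that augmenting only the real data, or only the discriminator's inputs, fails because of distribution shift or unbalanced training dynamics.
Evaluate BigGAN and StyleGAN2 with DiffAugment on ImageNet, CIFAR, FFHQ, LSUN-Cat, and low-shot settings.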
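The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the ratios (0.125 for translation, 0.5 for cutout), the brightness-only Color transform, the toy generator and discriminator, and the choice to draw fresh random parameters on every call are all assumptions of this sketch. The losses use the non-saturating logistic form. NumPy has no autodiff, so the sketch only shows where T sits in each loss; in an autodiff framework the same additions, shifts, and mask multiplications would let gradients reach G.

```python
import numpy as np


def rand_brightness(x, rng):
    """Add one random offset in [-0.5, 0.5) per image (a simplified Color op)."""
    return x + (rng.random((x.shape[0], 1, 1, 1)) - 0.5)


def rand_translation(x, rng, ratio=0.125):
    """Shift each image by a random integer offset; exposed pixels become zero."""
    n, _, h, w = x.shape
    max_dy, max_dx = int(h * ratio + 0.5), int(w * ratio + 0.5)
    out = np.zeros_like(x)
    for i in range(n):
        dy = int(rng.integers(-max_dy, max_dy + 1))
        dx = int(rng.integers(-max_dx, max_dx + 1))
        src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
        src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
        out[i, :, dst_y, dst_x] = x[i, :, src_y, src_x]
    return out


def rand_cutout(x, rng, ratio=0.5):
    """Zero a random square per image by multiplying with a 0/1 mask."""
    n, _, h, w = x.shape
    ch, cw = int(h * ratio + 0.5), int(w * ratio + 0.5)
    mask = np.ones((n, 1, h, w), dtype=x.dtype)
    for i in range(n):
        top = int(rng.integers(0, h - ch + 1))
        left = int(rng.integers(0, w - cw + 1))
        mask[i, :, top:top + ch, left:left + cw] = 0
    return x * mask


def diff_augment(x, rng):
    """Hypothetical composed policy T: Color (brightness only), Translation, Cutout."""
    return rand_cutout(rand_translation(rand_brightness(x, rng), rng), rng)


def softplus(v):
    return np.logaddexp(0.0, v)


def discriminator_loss(D, G, x_real, z, rng):
    # T is applied to BOTH the real batch and the generated batch before D sees them.
    real_term = softplus(-D(diff_augment(x_real, rng))).mean()
    fake_term = softplus(D(diff_augment(G(z), rng))).mean()
    return real_term + fake_term


def generator_loss(D, G, z, rng):
    # T also sits between G and D here, so the gradient for G must pass through T.
    return softplus(-D(diff_augment(G(z), rng))).mean()


# Toy stand-ins (assumptions): a real G and D would be neural networks.
def toy_generator(z):
    return np.tanh(z)


def toy_discriminator(img):
    return img.mean(axis=(1, 2, 3))


rng = np.random.default_rng(0)
x_real = rng.uniform(-1.0, 1.0, size=(4, 3, 16, 16))
z = rng.standard_normal((4, 3, 16, 16))
loss_d = discriminator_loss(toy_discriminator, toy_generator, x_real, z, rng)
loss_g = generator_loss(toy_discriminator, toy_generator, z, rng)
```

Once its random parameters are fixed, each transform here is an affine function of the pixels (an added constant, a shift, or an elementwise mask), which is why this kind of augmentation is straightforward to differentiate through. Dropping the `diff_augment` call from `generator_loss` while keeping it in `discriminator_loss` would correspond to the "augment the discriminator's inputs only" variant that the review lists as failing.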

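EXPERIMENTAL RESULTS

Research Questions

RQ1: Does applying differentiable augmentation to both real and generated samples stabilize GAN training under limited data?
RQ2: Which augmentation types (and combinations) best improve data efficiency across architectures?
RQ3: How does DiffAugment perform for unconditional and conditional GANs across different datasets and data scales?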

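Key Results

BigGAN with DiffAugment achieves an IS of 100.8 and an FID of 6.80 on ImageNet 128×128 without truncation.
On FFHQ and LSUN with 1k training samples, FID is reduced by 2–4×.
With only 20% of the CIFAR-10/CIFAR-100 data, DiffAugment comes close to the top performance; it also achieves strong low-shot results from 100 samples without pre-training.
DiffAugment consistently improves the StyleGAN2 and BigGAN baselines in the 100%, 50%, and 25% data settings.
Stronger augmentation policies reduce discriminator overfitting and improve convergence (Figure 6).
Even with a fixed augmentation policy, DiffAugment remains effective compared with the more sophisticated adaptive augmentation method (ADA).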

This review was generated by AI and reviewed by a human editor.