QUICK REVIEW

[Paper Review] Leaving Reality to Imagination: Robust Classification via Generated Datasets

Hritik Bansal, Aditya Grover|arXiv (Cornell University)|2023. 02. 05.

Generative Adversarial Networks and Image Synthesis | Citations: 14

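One-Line Summary

Training ImageNet classifiers on real data augmented with data generated by Stable Diffusion improves accuracy and effective robustness under natural distribution shifts, outperforming training with standard augmentations alone.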

ABSTRACT

Recent research on robustness has revealed significant performance gaps between neural image classifiers trained on datasets that are similar to the test set, and those that are from a naturally shifted distribution, such as sketches, paintings, and animations of the object categories observed during training. Prior work focuses on reducing this gap by designing engineered augmentations of training data or through unsupervised pretraining of a single large model on massive in-the-wild training datasets scraped from the Internet. However, the notion of a dataset is also undergoing a paradigm shift in recent years. With drastic improvements in the quality, ease-of-use, and access to modern generative models, generated data is pervading the web. In this light, we study the question: How do these generated datasets influence the natural robustness of image classifiers? We find that ImageNet classifiers trained on real data augmented with generated data achieve higher accuracy and effective robustness than standard training and popular augmentation strategies in the presence of natural distribution shifts. We analyze various factors influencing these results, including the choice of conditioning strategies and the amount of generated data. Additionally, we find that the standard ImageNet classifiers suffer a performance degradation of up to 20% on the generated data, indicating their fragility at accurately classifying the objects under novel variations. Lastly, we demonstrate that the image classifiers, which have been trained on real data augmented with generated data from the base generative model, exhibit greater resilience to natural distribution shifts compared to the classifiers trained on real data augmented with generated data from the finetuned generative model on the real data. The code, models, and datasets are available at https://github.com/Hritikbansal/generative-robustness.

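Research Motivation and Goals

Identify what drives the robustness gap when classifiers are evaluated on naturally shifted datasets (e.g., sketches, renditions).
Investigate how data generated by modern, real-world generative models affects classifier robustness.
Evaluate training on real data, generated data, and a mix of both on ImageNet and natural distribution shift datasets.
Analyze how conditioning strategies, dataset size, and generation templates affect robustness and accuracy.
Release the code, models, and datasets publicly for reproducible benchmarking.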

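Proposed Method

Generate a large-scale synthetic dataset (1.3M images) using Stable Diffusion conditioned on ImageNet class labels through diverse text templates.
Augment the real ImageNet-1K training data with the synthetic dataset and train classifiers from scratch.
Evaluate on natural distribution shift datasets (ImageNet-Sketch, ImageNet-R, ImageNet-V2, ObjectNet) and compare against real-only and generated-only training.
Compare zero-shot generation, handcrafted augmentations, and latent diffusion models to assess diversity and effectiveness.
Analyze data generation strategies (templates, real images, or a mix) and their effect on accuracy and effective robustness (ER).
Release the base/generated datasets and code to provide a baseline for robustness benchmarking.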
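
The first two steps of the method (class-label prompts built from diverse templates, then mixing generated samples with real ones) can be sketched roughly as follows. This is an illustrative sketch only: the template strings, the function names `build_prompts` and `mix_real_and_generated`, and the uniform per-class allocation are assumptions made for this example and are not taken from the paper or its repository. The actual image generation call to Stable Diffusion is deliberately left out.

```python
import random

# Hypothetical prompt templates; the paper's actual templates are not
# listed in this review, so these are placeholders for illustration.
TEMPLATES = [
    "a photo of a {label}",
    "a sketch of a {label}",
    "a painting of a {label}",
    "a close-up photo of a {label}",
]


def build_prompts(class_labels, images_per_class, templates=TEMPLATES, seed=0):
    """Return (class_index, prompt) pairs, sampling a template per image."""
    rng = random.Random(seed)
    prompts = []
    for class_index, label in enumerate(class_labels):
        for _ in range(images_per_class):
            template = rng.choice(templates)
            prompts.append((class_index, template.format(label=label)))
    return prompts


def mix_real_and_generated(real_items, generated_items, seed=0):
    """Combine (path, label) pairs from both sources into one shuffled list.

    Each item is tagged with its source so later analysis can separate
    real from generated samples.
    """
    tagged = [(path, label, "real") for path, label in real_items]
    tagged += [(path, label, "generated") for path, label in generated_items]
    random.Random(seed).shuffle(tagged)
    return tagged


if __name__ == "__main__":
    # Toy example with two class names and three prompts per class.
    for class_index, prompt in build_prompts(["tench", "goldfish"], 3):
        print(class_index, prompt)
```

Under the uniform-allocation assumption, 1.3M generated images over the 1,000 ImageNet-1K classes would correspond to roughly 1,300 prompts per class; whether the authors allocated images this way is not stated in this review.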

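Experimental Results

Research Questions

RQ1: Does mixing real and generated data improve accuracy and effective robustness on natural distribution shift datasets?
RQ2: How does the generation strategy (text templates, conditioning on real images, or a mix) affect robustness and accuracy?
RQ3: How does the amount of generated data, relative to real data, affect ER and accuracy?
RQ4: How does finetuning the generative model on real data affect robustness under shifted distributions?
RQ5: Are generated datasets effective for benchmarking existing ImageNet classifiers under synthetic shifts?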

Key Results

Model	Im-Sketch	Im-R	Im-V2	ObjectNet	Average
Generated data	37.8	45.3	9.1	49.9	35.6
Real + generated data	14.9	16.7	0.5	2.3	8.6

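Training on real data augmented with generated data yields absolute accuracy equal to or higher than real-only or generated-only training, and more robust performance on natural distribution shift datasets.
Generated data alone increases effective robustness but often lowers absolute accuracy; mixing real and generated data offers a favorable balance.
Class-label-based generation with diverse templates outperforms single-template prompts and comes close to the best among robustness training strategies.
Standard ImageNet classifiers degrade by up to about 20% on generated data, indicating fragility to novel variations, though mixed training narrows this gap.
Zero-shot CLIP conditioning provides strong robustness without domain adaptation of the generator, and mixing with real data can achieve high accuracy on both real and generated data.
Finetuning the generator on real data narrows the distribution gap but does not always improve accuracy as much as augmenting with data from the base generator.
Increasing the amount of generated data generally increases ER, suggesting that larger synthetic datasets can improve robustness within a compute budget.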
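
This review uses effective robustness (ER) without defining it. A minimal sketch follows, assuming the definition common in the natural distribution shift literature: ER is the accuracy a model reaches on the shifted test set minus the accuracy predicted for it by a baseline trend fit on standard models, with the trend fit as a line in logit space. The function names and all numbers below are hypothetical and are not results from the paper.

```python
import math


def logit(p):
    """Map an accuracy in (0, 1) to logit space."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit_baseline(id_accs, shifted_accs):
    """Least-squares line: logit(shifted) = slope * logit(id) + intercept."""
    xs = [logit(a) for a in id_accs]
    ys = [logit(a) for a in shifted_accs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def effective_robustness(id_acc, shifted_acc, slope, intercept):
    """Shifted accuracy minus the accuracy the baseline trend predicts."""
    predicted = sigmoid(slope * logit(id_acc) + intercept)
    return shifted_acc - predicted


if __name__ == "__main__":
    # Hypothetical accuracies (as fractions) for three baseline models.
    baseline_id = [0.60, 0.70, 0.80]
    baseline_shifted = [0.20, 0.28, 0.38]
    slope, intercept = fit_logit_baseline(baseline_id, baseline_shifted)
    # Hypothetical model under evaluation.
    print(effective_robustness(0.75, 0.40, slope, intercept))
```

Under this definition, a model that sits exactly on the baseline trend has ER of zero, and a positive ER means the model holds up under shift better than its in-distribution accuracy alone would predict. This is consistent with the finding above that generated-only training can raise ER even when absolute accuracy is lower.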

This review was generated by AI and reviewed by a human editor.