[Paper Review] Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning
This paper introduces adversarial self-supervised pretraining followed by adversarial fine-tuning to improve robustness, reports substantial gains on CIFAR-10, and enables an ensemble strategy that spans multiple self-supervised pretraining tasks.
Pretrained models from self-supervision are prevalently used in fine-tuning downstream tasks faster or for better accuracy. However, gaining robustness from pretraining is left unexplored. We introduce adversarial training into self-supervision, to provide general-purpose robust pre-trained models for the first time. We find these robust pre-trained models can benefit the subsequent fine-tuning in two ways: i) boosting final model robustness; ii) saving the computation cost, if proceeding towards adversarial fine-tuning. We conduct extensive experiments to demonstrate that the proposed framework achieves large performance margins (e.g., 3.83% on robust accuracy and 1.3% on standard accuracy, on the CIFAR-10 dataset), compared with the conventional end-to-end adversarial training baseline. Moreover, we find that different self-supervised pre-trained models have a diverse adversarial vulnerability. It inspires us to ensemble several pretraining tasks, which boosts robustness more. Our ensemble strategy contributes to a further improvement of 3.59% on robust accuracy, while maintaining a slightly higher standard accuracy on CIFAR-10. Our codes are available at https://github.com/TAMU-VITA/Adv-SS-Pretraining.
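Research Motivation and Objectives
- Investigate whether adversarial pretraining improves downstream robustness during fine-tuning.
- Compare adversarial pretraining and adversarial fine-tuning in terms of robustness and efficiency.
- Evaluate how different self-supervised pretraining tasks affect the robustness of the final model.
- Explore an ensemble strategy spanning multiple pretraining tasks to further improve robustness.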
Proposed Method
- Embed adversarial training into multiple self-supervised pretraining tasks (Selfie, Rotation, Jigsaw).
- Fine-tune classifiers on downstream tasks using partial or full parameter re-use from robust pretrained models.
- Propose an ensemble objective with a diversity-promoting regularizer to combine multiple pretraining tasks.
- Evaluate standard vs robust accuracy on CIFAR-10 and CIFAR-10-C under PGD attacks and unseen attacks.
- Compare against one-shot adversarial training regularized by a single self-supervised task.
- Perform ablations on dataset size, input resolution, and defense options.
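The adversarial training that the method embeds into each pretraining task relies on an inner maximization step, which is commonly solved with projected gradient descent (PGD). The sketch below illustrates an L-infinity PGD attack on a toy linear logistic model in plain Python. The model, the function names (`logistic_loss_and_grad`, `pgd_linf`), and the values eps = 8/255, step = 2/255, and 10 iterations are illustrative assumptions, not settings taken from this review or from the authors' code.

```python
import math
import random


def logistic_loss_and_grad(w, b, x, y):
    """Binary logistic loss for label y in {0, 1} and its gradient w.r.t. the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    loss = -(y * math.log(p + 1e-12) + (1 - y) * math.log(1.0 - p + 1e-12))
    grad_x = [(p - y) * wi for wi in w]  # d(loss)/d(x) for a linear model
    return loss, grad_x


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def project(x_adv, x, eps):
    """Clip x_adv into the eps-ball around x, then into the valid range [0, 1]."""
    boxed = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return [min(1.0, max(0.0, xa)) for xa in boxed]


def pgd_linf(w, b, x, y, eps=8 / 255, step=2 / 255, iters=10, seed=0):
    """L-infinity PGD: repeated signed-gradient ascent on the loss, with projection."""
    rng = random.Random(seed)
    # Random start inside the eps-ball (an assumed, commonly used choice).
    x_adv = project([xi + rng.uniform(-eps, eps) for xi in x], x, eps)
    for _ in range(iters):
        _, grad = logistic_loss_and_grad(w, b, x_adv, y)
        x_adv = [xa + step * sign(g) for xa, g in zip(x_adv, grad)]
        x_adv = project(x_adv, x, eps)
    return x_adv


if __name__ == "__main__":
    # Hypothetical toy model and input, chosen only for illustration.
    w, b = [1.0, -2.0, 0.5], 0.1
    x, y = [0.5, 0.2, 0.8], 1
    x_adv = pgd_linf(w, b, x, y)
    clean_loss, _ = logistic_loss_and_grad(w, b, x, y)
    adv_loss, _ = logistic_loss_and_grad(w, b, x_adv, y)
    print("clean loss:", clean_loss, "adversarial loss:", adv_loss)
```

In adversarial training, the outer loop would then update the model parameters on `x_adv` rather than on `x`. In the pretraining stage the label `y` would come from the self-supervised task (for example, a rotation class) rather than from a human annotation.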
Experimental Results
Research Questions
- RQ1. Can adversarial pretraining improve robustness of downstream fine-tuning?
- RQ2. Which is more effective for robustness and efficiency: adversarial pretraining or adversarial fine-tuning?
- RQ3. How does the choice of self-supervised pretraining task affect final robustness?
- RQ4. Can an ensemble of self-supervised pretraining tasks further boost robustness?
Key Results
- Robust pretrained models used for adversarial fine-tuning yield large gains over the conventional end-to-end adversarial training baseline, e.g., 3.83% in robust accuracy and 1.3% in standard accuracy on CIFAR-10.
- Adversarial fine-tuning contributes most to robustness, while robust pretraining mainly speeds up fine-tuning.
- Different self-supervised tasks have diverse adversarial vulnerabilities, motivating ensemble pretraining for robustness gains.
- Ensembling three pretraining tasks with adversarial fine-tuning achieves 54.64% robust accuracy and 86.04% standard accuracy on CIFAR-10 (AT setting).
- Adversarial pretraining plus full adversarial fine-tuning requires fewer epochs than end-to-end AT, indicating improved efficiency.
- Compared to one-shot AT with rotation regularization, the proposed approach improves robustness against unforeseen attacks across multiple trials.
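The results above rest on two metrics and one combination step: standard accuracy (on clean inputs), robust accuracy (on adversarially perturbed inputs), and an ensemble over models fine-tuned from different pretraining tasks. The sketch below shows one plausible way to compute them in plain Python. It assumes the ensemble averages softmax probabilities across models, which this review does not specify, and every logit and label in the usage section is a made-up toy value rather than a number from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_logits):
    """Average softmax probabilities across models and return the argmax class.

    Assumption: probability averaging; the actual ensemble rule may differ.
    """
    probs = [softmax(logits) for logits in per_model_logits]
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


if __name__ == "__main__":
    # Toy logits from three hypothetical models for a single two-class input.
    print("ensemble class:", ensemble_predict([[2.0, 0.0], [0.0, 1.0], [1.5, 0.0]]))

    # Toy predictions: the same model evaluated on clean and on perturbed inputs.
    labels = [0, 1, 1, 0]
    clean_preds = [0, 1, 1, 0]
    adv_preds = [0, 1, 0, 1]
    print("standard accuracy:", accuracy(clean_preds, labels))
    print("robust accuracy:", accuracy(adv_preds, labels))
```

Robust accuracy is the same accuracy function applied to predictions on attacked inputs, so the gap between the two numbers is a direct measure of how much the attack degrades the model.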
This review was generated by AI and reviewed by a human editor.