QUICK REVIEW

[Paper Review] Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks

Jiadong Lin, Chuanbiao Song | arXiv (Cornell University) | 2019-08-17

Adversarial Robustness in Machine Learning | References: 32 | Citations: 229

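One-Line Summary

This paper proposes two techniques for strengthening gradient-based adversarial attacks, NI-FGSM and SIM, which improve transferability across models, remain effective against defense models in particular, and show strong black-box attack performance on ImageNet.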

ABSTRACT

Deep learning models are vulnerable to adversarial examples crafted by applying human-imperceptible perturbations on benign inputs. However, under the black-box setting, most existing adversaries often have a poor transferability to attack other defense models. In this work, from the perspective of regarding the adversarial example generation as an optimization process, we propose two new methods to improve the transferability of adversarial examples, namely Nesterov Iterative Fast Gradient Sign Method (NI-FGSM) and Scale-Invariant attack Method (SIM). NI-FGSM aims to adapt Nesterov accelerated gradient into the iterative attacks so as to effectively look ahead and improve the transferability of adversarial examples. While SIM is based on our discovery on the scale-invariant property of deep learning models, for which we leverage to optimize the adversarial perturbations over the scale copies of the input images so as to avoid "overfitting" on the white-box model being attacked and generate more transferable adversarial examples. NI-FGSM and SIM can be naturally integrated to build a robust gradient-based attack to generate more transferable adversarial examples against the defense models. Empirical results on ImageNet dataset demonstrate that our attack methods exhibit higher transferability and achieve higher attack success rates than state-of-the-art gradient-based attacks.

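Motivation and Objectives

Motivate the study of transferable adversarial examples in the black-box setting.
Introduce two new attack strategies, NI-FGSM and SIM, to improve transferability.
Show that combining NI-FGSM and SIM yields a strong attack (SI-NI-FGSM).
Demonstrate strong attack performance on ImageNet against both normally trained and adversarially trained models.
Evaluate against advanced defenses to highlight the strength of the proposed attacks.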

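Proposed Method

Recast adversarial example generation as an optimization problem constrained by an L-infinity perturbation bound.
Propose NI-FGSM by incorporating the Nesterov accelerated gradient into iterative gradient-based attacks.
Develop SIM by exploiting the scale invariance of deep neural networks and maximizing the loss over scaled copies of the input.
Combine NI-FGSM and SIM into SI-NI-FGSM, and extend it with DIM, TIM, and TI-DIM variants for further gains.
Present the SI-NI-FGSM algorithm, whose update step sums the gradients over the scaled copies and applies the Nesterov look-ahead correction.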
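
The update described above can be sketched on a toy problem. The following is a minimal illustration, not the authors' implementation: it assumes a hypothetical binary logistic model (`w`, `b`) in place of an ImageNet classifier, scale copies of the form x / 2^i, a gradient taken with respect to the unscaled input, and illustrative default hyperparameters. All function names are made up for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def loss(x, y, w, b):
    """Binary cross-entropy of the toy logistic model at input x."""
    p = sigmoid(logit(x, w, b))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def loss_grad(x, y, w, b):
    """Gradient of the loss with respect to the model input."""
    p = sigmoid(logit(x, w, b))
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def si_ni_fgsm(x, y, w, b, eps=0.3, steps=10, mu=1.0, m=5):
    """Sketch of SI-NI-FGSM on a toy model (assumed form, see lead-in)."""
    alpha = eps / steps              # per-iteration step size
    x_adv = list(x)
    g = [0.0] * len(x)               # accumulated (momentum) gradient
    for _ in range(steps):
        # Nesterov look-ahead: evaluate the gradient at the anticipated point.
        x_nes = [xa + alpha * mu * gi for xa, gi in zip(x_adv, g)]
        grad = [0.0] * len(x)
        for i in range(m):
            scale = 1.0 / (2 ** i)   # assumed scale copy: x / 2^i
            scaled = [scale * v for v in x_nes]
            gi = loss_grad(scaled, y, w, b)
            # Chain rule: gradient w.r.t. the unscaled input picks up `scale`.
            grad = [a + scale * c for a, c in zip(grad, gi)]
        grad = [v / m for v in grad]                 # average over scale copies
        l1 = sum(abs(v) for v in grad) or 1.0        # avoid division by zero
        g = [mu * gp + v / l1 for gp, v in zip(g, grad)]
        # Sign step, then clip back into the L-infinity ball around x.
        x_adv = [min(max(xa + alpha * sign(gi), x0 - eps), x0 + eps)
                 for xa, gi, x0 in zip(x_adv, g, x)]
    return x_adv

x, y = [0.2, 0.8, 0.5], 1
w, b = [1.0, -2.0, 0.5], 0.1
x_adv = si_ni_fgsm(x, y, w, b)
print("loss before:", loss(x, y, w, b), "loss after:", loss(x_adv, y, w, b))
```

A real attack would replace `loss_grad` with automatic differentiation through the white-box network and would typically also clip the result to the valid pixel range, which this sketch omits.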

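Experimental Results

Research Questions

RQ1: Can the Nesterov accelerated gradient improve the transferability of gradient-based adversarial attacks?
RQ2: Does exploiting scale invariance through scaled inputs (SIM) improve transferability across models?
RQ3: Does combining NI-FGSM and SIM (SI-NI-FGSM) outperform existing gradient-based attacks across both standard and robust defenses?
RQ4: How do the SI-NI-FGSM variants perform against advanced defenses and adversarially trained models on ImageNet?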

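Key Findings

SI-NI-FGSM and its DIM/TIM extensions consistently outperform the baselines in black-box transferability on ImageNet.
SI-NI-TI-DIM achieves high transferability, with an average success rate of about 93.5% across the evaluated defenses.
SI-NI-FGSM generally reaches an equal or higher attack success rate than MI-FGSM in fewer iterations, giving faster generation and better transferability.
Scale invariance keeps the loss stable across input scales, which enables effective model augmentation without training additional models.
Attacks optimized over an ensemble of scale-augmented inputs transfer better to unseen defenses and outperform state-of-the-art gradient-based methods.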
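
The success rates quoted above can be read as the fraction of adversarial examples that a target model misclassifies. The sketch below illustrates that reading, assuming a non-targeted attack. The `toy_predict` model and the sample data are hypothetical and are not taken from the paper.

```python
def attack_success_rate(predict, adv_examples, true_labels):
    """Fraction of adversarial examples the target model misclassifies."""
    if not true_labels:
        raise ValueError("need at least one example")
    fooled = sum(1 for x, y in zip(adv_examples, true_labels) if predict(x) != y)
    return fooled / len(true_labels)

# Hypothetical black-box model: predicts class 1 when the feature sum exceeds 1.0.
def toy_predict(x):
    return int(sum(x) > 1.0)

adv_examples = [[0.2, 0.3], [0.9, 0.4], [0.1, 0.2], [0.8, 0.9]]
true_labels = [1, 1, 0, 0]
rate = attack_success_rate(toy_predict, adv_examples, true_labels)
print(f"attack success rate: {rate:.1%}")
```

In a transfer setting, the adversarial examples would be crafted on one white-box model and `predict` would be a different model that the attacker never queried for gradients.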


This review was generated by AI and reviewed by a human editor.