QUICK REVIEW

[Paper Review] AEPecker: L0 Adversarial Examples are not Strong Enough

Fei Zuo, Bokai Yang | arXiv (Cornell University) | December 23, 2018

Adversarial Robustness in Machine Learning | Citations: 2

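One-Line Summary

This paper proposes AEPECKER, a new defense system that detects and rectifies L0 adversarial examples by exploiting their inherent limitation: large-amplitude perturbations concentrated on a small number of pixels. A Siamese network compares the input image with a pre-processed version of it, which enables accurate detection and inpainting-based rectification, yielding high detection accuracy and effective recovery of the classification result.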

ABSTRACT

Despite the great achievements made by neural networks on tasks such as image classification, they are brittle and vulnerable to adversarial example (AE) attacks, which are crafted by adding human-imperceptible perturbations to inputs in order that a neural-network-based classifier incorrectly labels them. In particular, L0 AEs are a category of widely discussed threats where adversaries are restricted in the number of pixels that they can corrupt. However, our observation is that, while L0 attacks modify as few pixels as possible, they tend to cause large-amplitude perturbations to the modified pixels. We consider this as an inherent limitation of L0 AEs, and thwart such attacks by both detecting and rectifying them. The main novelty of the proposed detector is that we convert the AE detection problem into a comparison problem by exploiting the inherent limitation of L0 attacks. More concretely, given an image I, it is pre-processed to obtain another image I' . A Siamese network, which is known to be effective in comparison, takes I and I' as the input pair to determine whether I is an AE. A trained Siamese network automatically and precisely captures the discrepancies between I and I' to detect L0 perturbations. In addition, we show that the pre-processing technique, inpainting, used for detection can also work as an effective defense, which has a high probability of removing the adversarial influence of L0 perturbations. Thus, our system, called AEPECKER, demonstrates not only high AE detection accuracies, but also a notable capability to correct the classification results.

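Motivation and Objectives

To address the vulnerability of neural networks to L0 adversarial examples, which apply large-amplitude perturbations to a small number of pixels.
To identify the fundamental limitation of L0 attacks, namely large-amplitude pixel changes, as a detectable signature.
To develop a detection approach that exploits this limitation through image comparison.
To design a defense that not only detects adversarial inputs but also rectifies them through inpainting.
To achieve high detection accuracy and reliable classification recovery on adversarial examples.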
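
The trade-off behind these objectives can be made concrete by measuring a perturbation in two ways: how many pixels it touches (L0) and how far it moves any single pixel (L-infinity). The sketch below is illustrative only; the toy 4x4 images, the pixel values, and the helper names l0_norm and linf_norm are assumptions made for this example and do not come from the paper.

```python
def l0_norm(original, perturbed):
    """Count the pixels whose value differs between the two images."""
    return sum(
        p != q
        for row_o, row_p in zip(original, perturbed)
        for p, q in zip(row_o, row_p)
    )


def linf_norm(original, perturbed):
    """Largest absolute change applied to any single pixel."""
    return max(
        abs(p - q)
        for row_o, row_p in zip(original, perturbed)
        for p, q in zip(row_o, row_p)
    )


# Toy 4x4 grayscale image with every pixel at 0.5 (assumed values).
original = [[0.5] * 4 for _ in range(4)]

# Sparse, L0-style perturbation: two pixels pushed to the extremes.
sparse = [row[:] for row in original]
sparse[0][1] = 1.0
sparse[3][2] = 0.0

# Dense perturbation for contrast: every pixel nudged slightly.
dense = [[value + 0.0625 for value in row] for row in original]

print("sparse:", l0_norm(original, sparse), linf_norm(original, sparse))
print("dense: ", l0_norm(original, dense), linf_norm(original, dense))
```

In this toy case the sparse perturbation changes only 2 of 16 pixels but moves each of them by 0.5, while the dense one changes all 16 pixels by 0.0625. The large per-pixel jump is the property the review describes as a detectable signature.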

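Proposed Method

The input image I is pre-processed with an inpainting technique, which preserves the structural naturalness of the image, to produce I'.
I and I' are fed to a Siamese neural network as a pair; the network compares them to detect adversarial perturbations.
The Siamese network learns discriminative features for this comparison, so that it can identify the discrepancies caused by L0 attacks.
The same inpainting pre-processing step doubles as a defense: restoring the corrupted pixels removes the adversarial perturbation.
The Siamese network is trained end to end on (I, I') pairs to separate benign images from L0 adversarial examples.
Detection and rectification are combined into a single system, AEPECKER, which provides high detection accuracy together with the ability to recover the classification result.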
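
The overall flow of the steps above (pre-process I into I', compare the pair, and fall back to I' when the input is flagged) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: a 3x3 median filter stands in for inpainting, a fixed threshold on the largest pixel difference stands in for the trained Siamese network, and toy_classifier, the threshold of 0.5, and the 5x5 images are all hypothetical.

```python
from statistics import median


def median_filter(image):
    """3x3 median filter with edge clamping (stand-in for inpainting)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = median(window)
    return out


def max_abs_diff(a, b):
    """Largest per-pixel discrepancy between I and I'."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def looks_like_l0_ae(image, threshold=0.5):
    """Assumed rule replacing the Siamese network: flag large I vs I' gaps."""
    return max_abs_diff(image, median_filter(image)) > threshold


def rectify(image, classifier, threshold=0.5):
    """Classify I' instead of I when flagged; returns (label, flagged)."""
    cleaned = median_filter(image)
    if max_abs_diff(image, cleaned) > threshold:
        return classifier(cleaned), True
    return classifier(image), False


def toy_classifier(image):
    """Hypothetical brittle classifier: label 1 if any pixel is near 1.0."""
    return int(max(max(row) for row in image) >= 0.9)


# Benign toy image, and a copy with one pixel pushed to full intensity.
clean = [[0.2] * 5 for _ in range(5)]
adversarial = [row[:] for row in clean]
adversarial[2][2] = 1.0

print(looks_like_l0_ae(clean), looks_like_l0_ae(adversarial))
print(toy_classifier(adversarial), rectify(adversarial, toy_classifier))
```

A fixed threshold only works here because the toy image is flat. According to the review, the actual system learns the comparison from data with a Siamese network, which is what lets it separate natural image detail from L0 perturbations.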

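Experimental Results

Research Questions

RQ1: Can the large-amplitude perturbations inherent to L0 adversarial examples be used as a detectable signature?
RQ2: Can a Siamese network that compares the original image with its pre-processed version effectively detect L0 adversarial examples?
RQ3: Does the inpainting-based pre-processing step used for detection also work effectively as a defense?
RQ4: Can the system reliably correct the wrong predictions caused by L0 adversarial examples?
RQ5: How does AEPECKER compare with existing defenses in detection and rectification performance?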

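Key Findings

The Siamese-network-based detector identifies L0 adversarial examples with high accuracy by exploiting their large-amplitude perturbations.
Image inpainting, the pre-processing step used for detection, effectively removes adversarial perturbations and therefore also serves as a strong defense.
AEPECKER shows a notable ability to recover classification results, restoring the model's prediction to the correct label in many cases.
The method turns the structural limitation of L0 attacks (large-amplitude changes to a small number of pixels) into an effective detection signal.
It achieves high detection accuracy and robustness against L0 attacks without retraining the target classifier.
Integrating detection and rectification in one framework improves its practical usefulness as a defense.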


This review was generated by AI and reviewed by a human editor.