QUICK REVIEW

[Paper Review] DARTS: Deceiving Autonomous Cars with Toxic Signs

Chawin Sitawarin, Arjun Nitin Bhagoji|arXiv (Cornell University)|2018. 02. 18.

Adversarial Robustness in Machine Learning | References: 55 | Citations: 179

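One-Line Summary

This paper presents two physically realizable attacks on traffic sign recognition in autonomous cars, signembedding and lenticular printing, and demonstrates both their high real-world success rates and the limits of defenses such as adversarial training.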

ABSTRACT

Sign recognition is an integral part of autonomous cars. Any misclassification of traffic signs can potentially lead to a multitude of disastrous consequences, ranging from a life-threatening accident to even a large-scale interruption of transportation services relying on autonomous cars. In this paper, we propose and examine security attacks against sign recognition systems for Deceiving Autonomous caRs with Toxic Signs (we call the proposed attacks DARTS). In particular, we introduce two novel methods to create these toxic signs. First, we propose Out-of-Distribution attacks, which expand the scope of adversarial examples by enabling the adversary to generate these starting from an arbitrary point in the image space compared to prior attacks which are restricted to existing training/test data (In-Distribution). Second, we present the Lenticular Printing attack, which relies on an optical phenomenon to deceive the traffic sign recognition system. We extensively evaluate the effectiveness of the proposed attacks in both virtual and real-world settings and consider both white-box and black-box threat models. Our results demonstrate that the proposed attacks are successful under both settings and threat models. We further show that Out-of-Distribution attacks can outperform In-Distribution attacks on classifiers defended using the adversarial training defense, exposing a new attack vector for these defenses.

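Motivation and Objectives

Assess how vulnerable the traffic sign recognition systems of autonomous cars are to adversarial manipulation.
Introduce new attack vectors that go beyond the conventional in-distribution adversary.
Demonstrate the physical robustness of the attacks under real-world transformations and conditions.
Evaluate attack effectiveness under white-box and black-box threat models, including real drive-by tests.
Examine the limits of existing defenses against these attacks, in particular adversarial training.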

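Proposed Method

Proposes the signembedding attack, which generates adversarial signs starting from arbitrary out-of-distribution images.
Proposes the lenticular printing attack, which exploits an optical phenomenon to produce viewing-angle-dependent misclassification.
Develops a robust optimization framework that uses a mask and a set of differentiable transformations to generate physically robust perturbations.
Uses an attack pipeline with masking, resizing, and random brightness, perspective, and scale transformations to simulate real-world conditions.
Evaluates the attacks in white-box and black-box settings, including transferability studies and drive-by tests.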
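
The robust optimization step described above (a mask plus a set of differentiable transformations) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the "sign recognizer" is a hypothetical 3-class linear softmax model over an 8-pixel image, the only transformation is a random brightness factor drawn from an assumed range of 0.7 to 1.3, and the names `robust_masked_attack`, `weights`, `x0`, and `mask` are invented for this example. The perturbation is updated with the gradient of the targeted cross-entropy loss averaged over sampled transformations, and it is applied only where the mask is 1.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    # Toy linear classifier, one weight row per class (hypothetical stand-in
    # for a real traffic sign recognizer).
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


def robust_masked_attack(weights, x0, mask, target, steps=200, lr=0.05,
                         samples=8, seed=0):
    """Targeted attack averaged over random brightness changes (assumed setup)."""
    rng = random.Random(seed)
    n = len(x0)
    delta = [0.0] * n
    for _ in range(steps):
        grad = [0.0] * n
        for _ in range(samples):
            b = rng.uniform(0.7, 1.3)  # random brightness factor (assumed range)
            x_adv = [x + m * d for x, m, d in zip(x0, mask, delta)]
            probs = predict(weights, [b * v for v in x_adv])
            # Gradient of the cross-entropy toward `target` w.r.t. each logit.
            err = [p - (1.0 if k == target else 0.0)
                   for k, p in enumerate(probs)]
            for i in range(n):
                dlogit = sum(err[k] * weights[k][i]
                             for k in range(len(weights)))
                # Chain rule through brightness scaling and the mask.
                grad[i] += b * mask[i] * dlogit / samples
        # Gradient step, then project so perturbed pixels stay in [0, 1].
        delta = [min(1.0, max(0.0, x + d - lr * g)) - x
                 for x, d, g in zip(x0, delta, grad)]
    return [x + m * d for x, m, d in zip(x0, mask, delta)]


# Hypothetical toy data: class 0 is the true label, class 1 the target.
weights = [
    [2, 2, 0, 0, 0, 0, 0, 0],
    [0, 0, 3, 3, 0, 0, 0, 0],
    [0, 0, 0, 0, 2, 2, 2, 2],
]
x0 = [0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
mask = [0, 0, 1, 1, 1, 1, 0, 0]  # only pixels 2..5 may be modified
x_adv = robust_masked_attack(weights, x0, mask, target=1)
print(argmax(predict(weights, x0)), argmax(predict(weights, x_adv)))
```

Averaging the gradient over sampled transformations is what pushes the perturbation to work across conditions rather than for a single view. A pipeline closer to the one summarized above would add perspective and scale transformations and a real neural network, which this sketch omits.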

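Experimental Results

Research Questions

RQ1: Can an adversary generate physically robust, realizable adversarial signs starting from arbitrary (out-of-distribution) input images?
RQ2: How effective are the signembedding and advtraffic attacks under real-world transformations and conditions?
RQ3: Do adversarial examples defeat state-of-the-art defenses for traffic sign recognition, such as adversarial training?
RQ4: To what extent can new physical attacks such as lenticular printing deceive sign recognition systems?

Key Results

Adversarial signs generated with signembedding and lenticular printing achieve high-confidence misclassification under a variety of real-world conditions.
In real-world drive-by tests, both the signembedding and advtraffic attacks achieved success rates above 90%.
signembedding outperforms the conventional advtraffic attack against the adversarial training defense, exposing a new attack vector.
lenticular printing introduces a distinct physical attack vector that exploits viewing-angle-dependent appearance to deceive sign recognition.
Black-box and transfer-based attacks remain effective even without direct knowledge of the target model's details.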
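
The viewing-angle dependence behind lenticular printing can be illustrated with a small sketch. A lenticular print interleaves strips from two images under an array of lenses, so each viewing angle reveals a different set of strips. The code below is an idealized illustration only, not the procedure used in the paper: images are lists of rows, the "S" and "L" pixel labels are placeholders, and the helper names `interlace` and `view` are hypothetical. Real prints must also account for lens pitch, print resolution, and viewing distance, which are ignored here.

```python
def interlace(image_a, image_b, strip_width=1):
    """Interleave vertical strips of two equally sized images (lists of rows)."""
    if len(image_a) != len(image_b) or any(
            len(ra) != len(rb) for ra, rb in zip(image_a, image_b)):
        raise ValueError("images must have the same shape")
    out = []
    for row_a, row_b in zip(image_a, image_b):
        out.append([
            (row_a if (x // strip_width) % 2 == 0 else row_b)[x]
            for x in range(len(row_a))
        ])
    return out


def view(interlaced, which, strip_width=1):
    """Idealized lens: angle "a" shows only even strips, angle "b" only odd."""
    parity = 0 if which == "a" else 1
    return [
        [px for x, px in enumerate(row) if (x // strip_width) % 2 == parity]
        for row in interlaced
    ]


# Placeholder images: "S" pixels for one sign, "L" pixels for another.
image_a = [["S"] * 4 for _ in range(2)]
image_b = [["L"] * 4 for _ in range(2)]
interlaced = interlace(image_a, image_b)
print(view(interlaced, "a"))
print(view(interlaced, "b"))
```

In this idealized model, two observers at different angles see entirely different images from the same physical print, which is the property the attack relies on.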

This review was generated by AI and reviewed by a human editor.