QUICK REVIEW

[Paper Review] Learning perturbation sets for robust machine learning

Eric Wong, J. Zico Kolter | arXiv (Cornell University) | 2020-07-16

Adversarial Robustness in Machine Learning | References: 60 | Citations: 40

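One-line summary

This paper uses a conditional VAE to learn data-driven perturbation sets that capture real-world perturbations, shows that these sets satisfy the paper's theoretical quality criteria, and demonstrates improved adversarial and certified robustness across several datasets.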

ABSTRACT

Although much progress has been made towards robust deep learning, a significant gap in robustness remains between real-world perturbations and more narrowly defined sets typically studied in adversarial defenses. In this paper, we aim to bridge this gap by learning perturbation sets from data, in order to characterize real-world effects for robust training and evaluation. Specifically, we use a conditional generator that defines the perturbation set over a constrained region of the latent space. We formulate desirable properties that measure the quality of a learned perturbation set, and theoretically prove that a conditional variational autoencoder naturally satisfies these criteria. Using this framework, our approach can generate a variety of perturbations at different complexities and scales, ranging from baseline spatial transformations, through common image corruptions, to lighting variations. We measure the quality of our learned perturbation sets both quantitatively and qualitatively, finding that our models are capable of producing a diverse set of meaningful perturbations beyond the limited data seen during training. Finally, we leverage our learned perturbation sets to train models which are empirically and certifiably robust to adversarial image corruptions and adversarial lighting variations, while improving generalization on non-adversarial data. All code and configuration files for reproducing the experiments as well as pretrained model weights can be found at https://github.com/locuslab/perturbation_learning.

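Motivation and objectives

Motivate robustness to real-world perturbations that fall outside traditional, mathematically defined threat models.
Define desirable deterministic and probabilistic properties of a learned perturbation set.
Show that CVAE-based perturbation sets satisfy these properties in theory.
Enable robust training and evaluation with learned perturbation sets on MNIST, CIFAR-10, and multi-illumination tasks.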

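Proposed method

Model perturbations as latent-space transformations through a generator g(z, x), with z constrained to a norm ball.
Formalize the necessary subset and sufficient likelihood properties used to evaluate a perturbation set.
Learn perturbations with a conditional variational autoencoder (CVAE) whose latent space is constrained by the prior.
Show theoretically that, under assumptions on training, the CVAE satisfies both key properties.
Evaluate the perturbation sets together with downstream robustness methods on the MNIST-RTS, CIFAR10-C, and multi-illumination datasets.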
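The construction and the two properties listed above can be written compactly. This is a sketch of the definitions as they are usually stated for this framework, not a quotation from the paper: the symbols \epsilon (latent radius), \delta (error tolerance), \tilde{x} (an observed perturbed version of x), and p_\epsilon (a latent prior truncated to the ball of radius \epsilon) are notation assumed here and do not appear in this review.

```latex
% Perturbation set induced by the generator g over a latent norm ball
\mathcal{S}(x) = \{\, g(z, x) \;:\; \|z\| \le \epsilon \,\}

% Necessary subset property (deterministic): some latent point in the
% ball reproduces the observed perturbation up to error \delta
\exists\, z \ \text{with}\ \|z\| \le \epsilon : \quad
  \|g(z, x) - \tilde{x}\| \le \delta

% Sufficient likelihood property (probabilistic): latent points drawn
% from the truncated prior are close to the observed perturbation on average
\mathbb{E}_{z \sim p_\epsilon}\!\left[\, \|g(z, x) - \tilde{x}\| \,\right] \le \delta
```

Read this way, the first property asks that real perturbations are contained in the set, and the second asks that the set does not consist mostly of implausible points.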

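Experimental results

Research questions

RQ1: Can a perturbation set learned by a CVAE from paired data faithfully cover real-world perturbations?
RQ2: Do CVAE-based perturbation sets satisfy the containment and high-likelihood properties, and thereby support robust training and evaluation?
RQ3: How do learned perturbation sets affect adversarial robustness and generalization across different robustness tasks?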

Key results

Setting	Approximation error	Expected approximation error	CVAE reconstruction error	KL
MNIST-RTS	0.11	0.54	0.04	22.2
CIFAR10-C	0.005	0.029	0.001	69.3
Multi-Illumination (MI)	0.006	0.049	0.004	65.8
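To make the first two columns concrete, the sketch below shows one way such quantities could be estimated for a single pair (x, x_tilde). It is an illustration under stated assumptions, not the authors' evaluation code: the linear `generator`, its weights `W`, the radius `eps`, and all function names are hypothetical, and finite-difference gradients stand in for automatic differentiation.

```python
import math
import random

# Hypothetical toy decoder weights: 3 "pixels", 2 latent dimensions.
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

def generator(z, x):
    """Toy stand-in for g(z, x): adds a fixed linear function of z to x."""
    return [xi + sum(w * zj for w, zj in zip(row, z)) for xi, row in zip(x, W)]

def l2_distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def project_to_ball(z, eps):
    """Project z onto the l2 ball of radius eps."""
    norm = math.sqrt(sum(v * v for v in z))
    return list(z) if norm <= eps else [v * eps / norm for v in z]

def approximation_error(x, x_tilde, eps, latent_dim=2, steps=200, lr=0.1, h=1e-5):
    """Estimate min over ||z|| <= eps of ||g(z, x) - x_tilde||
    with projected gradient descent on the squared distance."""
    def sq_dist(z):
        return l2_distance(generator(z, x), x_tilde) ** 2

    z = [0.0] * latent_dim
    for _ in range(steps):
        grad = []
        for i in range(latent_dim):
            # Central finite difference along latent coordinate i.
            z_plus, z_minus = list(z), list(z)
            z_plus[i] += h
            z_minus[i] -= h
            grad.append((sq_dist(z_plus) - sq_dist(z_minus)) / (2 * h))
        z = project_to_ball([zi - lr * gi for zi, gi in zip(z, grad)], eps)
    return l2_distance(generator(z, x), x_tilde)

def expected_approximation_error(x, x_tilde, eps, latent_dim=2, n_samples=500, seed=0):
    """Monte Carlo mean of ||g(z, x) - x_tilde|| with z drawn from a standard
    normal truncated to ||z|| <= eps (rejection sampling; slow for tiny eps)."""
    rng = random.Random(seed)
    total, accepted = 0.0, 0
    while accepted < n_samples:
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        if math.sqrt(sum(v * v for v in z)) <= eps:
            total += l2_distance(generator(z, x), x_tilde)
            accepted += 1
    return total / n_samples

x = [0.0, 0.0, 0.0]
x_tilde = generator([0.3, -0.2], x)  # a perturbation the toy set can reach
print(approximation_error(x, x_tilde, eps=1.0))
print(expected_approximation_error(x, x_tilde, eps=1.0))
```

The best-case error can never exceed the sampled average, which is consistent with the first column being smaller than the second in every row of the table.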

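CVAE-based perturbation sets achieve low approximation error, which suggests that they cover the perturbed data well.
The ECM (Expected CVAE Reconstruction) and KL metrics indicate that the learned perturbations have meaningful likelihood.
Perturbation sets improve adversarial robustness and generalization relative to standard data augmentation and hand-designed perturbations.
Adversarial training with CVAE perturbations yields higher robust accuracy than the other baseline methods.
Perturbation learning extends to complex perturbations such as common corruptions and lighting changes.
Quantitative results on CIFAR10-C show that CVAE augmentation and adversarial training both improve robust accuracy.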
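Adversarial training against a learned perturbation set needs an inner step that searches the latent ball for a perturbation that hurts the classifier. The sketch below shows one plausible form of that step, projected gradient ascent over z, under loud assumptions: the linear `generator`, its weights `W`, the toy `classifier_loss`, and the function `latent_pgd` are all hypothetical and are not taken from the paper or its repository.

```python
import math

# Hypothetical toy decoder weights: 3 "pixels", 2 latent dimensions.
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

def generator(z, x):
    """Toy stand-in for a learned decoder g(z, x)."""
    return [xi + sum(w * zj for w, zj in zip(row, z)) for xi, row in zip(x, W)]

def project_to_ball(z, eps):
    """Project z onto the l2 ball of radius eps."""
    norm = math.sqrt(sum(v * v for v in z))
    return list(z) if norm <= eps else [v * eps / norm for v in z]

def classifier_loss(image):
    """Toy loss: rises as the pixel sum falls (stand-in for a real classifier loss)."""
    return -sum(image)

def latent_pgd(x, loss_fn, eps, latent_dim=2, steps=50, lr=0.1, h=1e-5):
    """Projected gradient ascent on loss_fn(g(z, x)) over ||z|| <= eps."""
    z = [0.0] * latent_dim
    for _ in range(steps):
        grad = []
        for i in range(latent_dim):
            # Central finite difference; a real model would use autograd.
            z_plus, z_minus = list(z), list(z)
            z_plus[i] += h
            z_minus[i] -= h
            diff = loss_fn(generator(z_plus, x)) - loss_fn(generator(z_minus, x))
            grad.append(diff / (2 * h))
        # Ascent step, then projection back into the latent ball.
        z = project_to_ball([zi + lr * gi for zi, gi in zip(z, grad)], eps)
    return z, generator(z, x)

x = [1.0, 1.0, 1.0]
z_adv, x_adv = latent_pgd(x, classifier_loss, eps=0.5)
print(classifier_loss(x), classifier_loss(x_adv))
```

In an adversarial training loop, x_adv would be fed to the classifier in place of, or alongside, the clean input x. Because the search runs over z rather than over pixels, every candidate stays inside the learned perturbation set by construction.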

This review was generated by AI and reviewed by a human editor.