QUICK REVIEW

[Paper Review] Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning

Ahmed Salem, Apratim Bhattacharyya | arXiv (Cornell University) | April 1, 2019

Adversarial Robustness in Machine Learning | Citations: 89

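One-Line Summary

The paper shows that the difference between a black-box model's outputs before and after an update can leak information about the updating data, and it introduces four encoder-decoder attacks that infer properties of the updating set or reconstruct it.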

ABSTRACT

Machine learning (ML) has progressed rapidly during the past decade and the major factor that drives such development is the unprecedented large-scale data. As data generation is a continuous process, this leads to ML model owners updating their models frequently with newly-collected data in an online learning scenario. In consequence, if an ML model is queried with the same set of data samples at two different points in time, it will provide different results. In this paper, we investigate whether the change in the output of a black-box ML model before and after being updated can leak information of the dataset used to perform the update, namely the updating set. This constitutes a new attack surface against black-box ML models and such information leakage may compromise the intellectual property and data privacy of the ML model owner. We propose four attacks following an encoder-decoder formulation, which allows inferring diverse information of the updating set. Our new attacks are facilitated by state-of-the-art deep learning techniques. In particular, we propose a hybrid generative model (CBM-GAN) that is based on generative adversarial networks (GANs) but includes a reconstructive loss that allows reconstructing accurate samples. Our experiments show that the proposed attacks achieve strong performance.

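Motivation and Goals

Motivate and formalize the risk of updating-set leakage in online learning under black-box access.
Propose four attacks that infer properties of the updating data, or reconstruct it, from the posterior difference.
Develop an encoder-decoder architecture that exploits the posterior difference to extract different kinds of information about the updating set.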

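Proposed Method

Formalizes a general encoder-decoder attack pipeline that takes the posterior difference as input.
Uses a shadow-model approach to generate ground-truth data for training the attack model.
The single-sample attacks are label inference and sample reconstruction.
The multi-sample attacks are label distribution estimation and reconstruction of the updating set.
Introduces CBM-GAN, a conditional best-of-many GAN, to reconstruct multiple updating samples.
Evaluates the attacks on MNIST, CIFAR-10, and Insta-NY using a probing set of 100 samples.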
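
The first two steps above (posterior difference as attack input, shadow models as the source of training data) can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's implementation: the target and shadow models are replaced by a toy softmax-regression model in NumPy, the "update" is a handful of gradient steps on a single sample, and the sign convention (after minus before) as well as all names (`update_model`, `posterior_difference`, the dimensions) are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def posteriors(weights, probing_set):
    # Black-box query: one class-probability vector per probing sample.
    return softmax(probing_set @ weights)

def update_model(weights, x_update, y_update, lr=0.5, steps=20):
    # Toy stand-in for an online update (assumption): a few gradient
    # steps of softmax regression on the updating set only.
    w = weights.copy()
    onehot = np.eye(w.shape[1])[y_update]
    for _ in range(steps):
        grad = x_update.T @ (softmax(x_update @ w) - onehot) / len(x_update)
        w -= lr * grad
    return w

def posterior_difference(w_before, w_after, probing_set):
    # Flattened (after - before) posteriors over the probing set;
    # this vector is what the attack model's encoder would consume.
    delta = posteriors(w_after, probing_set) - posteriors(w_before, probing_set)
    return delta.ravel()

rng = np.random.default_rng(0)
n_features, n_classes, n_probe = 8, 3, 100
w_shadow = rng.normal(size=(n_features, n_classes))    # shadow model before update
probing_set = rng.normal(size=(n_probe, n_features))   # fixed 100-sample probing set

# Shadow-model loop: the adversary controls the updating sample, so it
# knows the ground-truth label paired with each posterior difference.
features, labels = [], []
for _ in range(50):
    x = rng.normal(size=(1, n_features))
    y = rng.integers(0, n_classes, size=1)
    w_updated = update_model(w_shadow, x, y)
    features.append(posterior_difference(w_shadow, w_updated, probing_set))
    labels.append(int(y[0]))

features = np.stack(features)   # shape (50, n_probe * n_classes)
labels = np.array(labels)       # shape (50,)
```

The resulting `(features, labels)` pairs are the kind of data a single-sample label inference attack could be trained on; for the reconstruction attacks, the decoder's target would be the updating sample itself rather than its label.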

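Experimental Results

Research Questions

RQ1: Can the difference in the target model's outputs before and after an update leak information about the updating set?
RQ2: How effectively can a black-box adversary use the encoder-decoder setup to infer the labels of the updating set or reconstruct its data?
RQ3: How does the amount of leakage differ between single-sample and multi-sample updating sets?
RQ4: Can shadow models make it realistic to train the attack model under black-box constraints?
RQ5: How well can advanced generative models reconstruct the updating set from the posterior difference?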

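Key Results

The single-sample label inference attack achieves an accuracy of 0.97 on Insta-NY, 0.96 on CIFAR-10, and 0.68 on MNIST.
The single-sample reconstruction attack outperforms the random baseline and comes close to the autoencoder's performance on MNIST and CIFAR-10.
The multi-sample label distribution estimation attack lowers KL-divergence and improves accuracy relative to the random baseline across datasets.
CBM-GAN can generate multiple updating-set samples conditioned on the posterior difference, and it outperforms the baselines on MNIST, CIFAR-10, and Insta-NY.
The attacks remain effective when trained on shadow models and when probing with only 100 samples; some relaxations of the adversary's assumptions are also explored.
Overall, the framework shows that differences in model outputs can leak substantial information about the updating set.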
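
The label distribution estimation result is reported in terms of KL-divergence against a random baseline. A minimal sketch of that comparison follows, assuming the divergence is taken from the true label distribution to the estimate and that the baseline is a uniform guess; the label values and the "attack output" below are made up for illustration and are not results from the paper.

```python
import numpy as np

def label_distribution(labels, n_classes):
    # Empirical label distribution of an updating set.
    counts = np.bincount(labels, minlength=n_classes)
    return counts / counts.sum()

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) with natural log; terms where p == 0 contribute zero.
    p = np.asarray(p, dtype=float)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Hypothetical 10-sample updating set over 3 classes.
true_labels = np.array([0, 0, 1, 2, 2, 2, 2, 1, 0, 2])
p_true = label_distribution(true_labels, n_classes=3)   # [0.3, 0.2, 0.5]

p_estimated = np.array([0.25, 0.25, 0.5])   # hypothetical attack output
p_uniform = np.full(3, 1.0 / 3.0)           # random-guess baseline

kl_attack = kl_divergence(p_true, p_estimated)
kl_baseline = kl_divergence(p_true, p_uniform)
print(f"attack: {kl_attack:.4f}  baseline: {kl_baseline:.4f}")
```

A lower value means the estimated distribution is closer to the true one, so an attack is doing better than chance when `kl_attack` falls below `kl_baseline`.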

This review was generated by AI and checked by a human editor.