QUICK REVIEW

[Paper Review] Beyond Inferring Class Representatives: User-Level Privacy Leakage From Federated Learning

Zhibo Wang, Mengkai Song | arXiv (Cornell University) | 2018-12-03

Privacy-Preserving Technologies in Data | References: 23 | Citations: 44

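One-Line Summary

This paper introduces mGAN-AI, a multi-task GAN attack that lets a malicious server recover user-specific data in federated learning without affecting training. The attack is demonstrated on the MNIST and AT&T datasets.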

ABSTRACT

Federated learning, i.e., a mobile edge computing framework for deep learning, is a recent advance in privacy-preserving machine learning, where the model is trained in a decentralized manner by the clients, i.e., data curators, preventing the server from directly accessing those private data from the clients. This learning mechanism significantly challenges the attack from the server side. Although the state-of-the-art attacking techniques that incorporated the advance of Generative adversarial networks (GANs) could construct class representatives of the global data distribution among all clients, it is still challenging to distinguishably attack a specific client (i.e., user-level privacy leakage), which is a stronger privacy threat to precisely recover the private data from a specific client. This paper gives the first attempt to explore user-level privacy leakage against the federated learning by the attack from a malicious server. We propose a framework incorporating GAN with a multi-task discriminator, which simultaneously discriminates category, reality, and client identity of input samples. The novel discrimination on client identity enables the generator to recover user specified private data. Unlike existing works that tend to interfere the training process of the federated learning, the proposed method works "invisibly" on the server side. The experimental results demonstrate the effectiveness of the proposed attacking approach and the superior to the state-of-the-art.

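Motivation and Objectives

Motivates the study of user-level privacy leakage in federated learning from the perspective of a malicious server.
Proposes mGAN-AI, a general and invisible attack framework that targets an individual client rather than global class representatives.
Discriminates client identity during GAN training so that a specific client's data can be recovered.
Preserves training utility by leaving the federated learning mechanism and the shared model unmodified during the attack.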

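Proposed Method

Introduces mGAN-AI, a multi-task GAN whose discriminator combines a real/fake discriminator, a category classifier, and a client-identity discriminator.
Trains a generator conditioned on category and client identity to produce victim-specific samples.
Estimates client data representatives from the updates (backpropagated gradients) to supervise the identity discrimination task.
Operates as a passive (invisible) attack that uses the updates and the shared model without interfering with training; an active variant that isolates the victim is also provided.
Formulates objective functions for the real/fake, category, and identity tasks and derives update rules for D and G.
Describes how to compute the client representative X_k from the updates through optimization with total variation regularization.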
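The method list names two numeric ingredients: a discriminator loss that covers three tasks, and a total variation penalty used when estimating X_k. The plain-Python sketch below illustrates both under stated assumptions; it is not the paper's formulation, which this review does not reproduce. All function names are hypothetical. The equal task weights, the binary "victim vs. not victim" identity head, the anisotropic form of total variation, and the scalar `update_mismatch` placeholder (standing in for how far a candidate image's induced update is from the observed one) are assumptions made for illustration.

```python
import math


def binary_cross_entropy(p, target):
    """Binary cross-entropy for one probability p and a 0/1 target."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))


def categorical_cross_entropy(probs, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(max(probs[label], 1e-12))


def multitask_discriminator_loss(real_prob, class_probs, victim_prob,
                                 is_real, label, is_victim):
    """Sum of the three task losses for one sample.

    Assumption: equal weights, and a binary identity head that scores
    "belongs to the victim" rather than a full per-client classifier.
    """
    return (binary_cross_entropy(real_prob, is_real)
            + categorical_cross_entropy(class_probs, label)
            + binary_cross_entropy(victim_prob, is_victim))


def total_variation(image):
    """Anisotropic total variation of a 2D image (list of rows).

    Sums absolute differences between vertically and horizontally
    adjacent pixels; smooth images score low, noisy ones score high.
    """
    rows, cols = len(image), len(image[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                tv += abs(image[r + 1][c] - image[r][c])
            if c + 1 < cols:
                tv += abs(image[r][c + 1] - image[r][c])
    return tv


def representative_objective(update_mismatch, image, tv_weight=0.1):
    """Data-fit term plus a total variation penalty (hypothetical form).

    update_mismatch is a placeholder scalar supplied by the caller.
    """
    return update_mismatch + tv_weight * total_variation(image)
```

In this sketch, a candidate representative is preferred when it both explains the observed update (small `update_mismatch`) and is spatially smooth (small `total_variation`); `tv_weight` trades the two off and its value here is arbitrary.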

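Experimental Results

Research Questions

RQ1: Can a malicious server in federated learning recover the data of a specific individual client (user-level privacy) without altering the training process?
RQ2: Does extending GAN training with a client-identity discrimination task enable accurate reconstruction of the target client's data?
RQ3: How effective is the proposed mGAN-AI framework at obtaining client-specific data compared with existing GAN-based attacks and model inversion attacks?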

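Key Findings

mGAN-AI enables victim-conditioned sample generation and recovers a specific client's data on the MNIST and AT&T datasets.
The passive (invisible) attack works without modifying the shared model or the training procedure.
The active variant further strengthens the attack by isolating the victim and training on a dedicated shared model.
The method supervises identity discrimination with client data representatives inferred from client updates.
Compared with prior attacks, mGAN-AI shows better reconstruction ability in a realistic federated learning setting.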
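To make the "invisible" property concrete, the sketch below shows one way a passive server-side observer could copy a victim's update while the aggregated model stays identical to an honest round. It is a toy illustration, not the paper's implementation: the function names, the dictionary-of-lists update format, and the equal-weight averaging are all assumptions.

```python
import copy


def federated_average(client_updates):
    """Element-wise mean of per-client weight vectors.

    Assumption: equal weighting across clients; real deployments often
    weight by local dataset size.
    """
    n = len(client_updates)
    dim = len(next(iter(client_updates.values())))
    return [sum(u[i] for u in client_updates.values()) / n
            for i in range(dim)]


def honest_round(client_updates):
    """What a well-behaved server computes in one round."""
    return federated_average(client_updates)


def passive_attack_round(client_updates, victim_id, attacker_log):
    """Same aggregation, plus a read-only copy of the victim's update.

    The copy goes to attacker_log for offline use (for example, to
    estimate data representatives); nothing sent back to the clients
    changes.
    """
    attacker_log.append(copy.deepcopy(client_updates[victim_id]))
    return federated_average(client_updates)
```

Calling both functions on the same updates returns the same global weights, which is why clients cannot detect this kind of observation from the model they receive.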


This review was generated by AI and reviewed by a human editor.