QUICK REVIEW

[Paper Review] LDP-FL: Practical Private Aggregation in Federated Learning with Local Differential Privacy

Lichao Sun, Jianwei Qian | arXiv (Cornell University) | 2020-07-31

Privacy-Preserving Technologies in Data | References: 29 | Citations: 32

One-Line Summary

This paper introduces LDP-FL, a federated learning framework based on local differential privacy. It perturbs, splits, and shuffles model updates to achieve strong privacy with low accuracy loss on MNIST, Fashion-MNIST, and CIFAR-10.

ABSTRACT

Training machine learning models on sensitive user data has raised increasing privacy concerns in many areas. Federated learning is a popular approach for privacy protection that collects local gradient information instead of real data. One way to achieve a strict privacy guarantee is to apply local differential privacy to federated learning. However, previous works do not give a practical solution due to three issues. First, the noisy data is close to its original value with high probability, increasing the risk of information exposure. Second, a large variance is introduced to the estimated average, causing poor accuracy. Last, the privacy budget explodes due to the high dimensionality of weights in deep learning models. In this paper, we propose a novel design of a local differential privacy mechanism for federated learning to address the above-mentioned issues. It is capable of making the data more distinct from its original value and introducing lower variance. Moreover, the proposed mechanism bypasses the curse of dimensionality by splitting and shuffling model updates. A series of empirical evaluations on three commonly used datasets, MNIST, Fashion-MNIST and CIFAR-10, demonstrates that our solution can not only achieve superior deep learning performance but also provide a strong privacy guarantee at the same time.

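Motivation and Objectives

Addresses the privacy concerns raised by deploying deep learning on sensitive data.
Provides a practical LDP mechanism suited to federated learning.
Reduces information exposure and variance while avoiding the curse of dimensionality in deep models.
Demonstrates a strong privacy-utility trade-off on standard datasets (MNIST, Fashion-MNIST, CIFAR-10).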

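Proposed Method

Proposes a new local differential privacy mechanism that perturbs each weight within a bounded range.
Introduces splitting and shuffling to break the linkage between client updates and to limit the growth of the privacy budget.
Applies adaptive range setting, in which the perturbation range is set per layer by exchanging encrypted minimum and maximum values.
Perturbs each weight to one of two output values, which yields an unbiased mean estimate and guarantees ε-LDP.
Collects the perturbed, split, and shuffled weights in the cloud to update the global model.
Evaluates the privacy-utility trade-off on MNIST, Fashion-MNIST, and CIFAR-10 across a range of ε values and client counts.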
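
The steps above can be sketched in Python. This is a minimal illustration, not the authors' implementation. The two-point probabilities are reconstructed from the properties stated in this review (bounded range, unbiased mean, ε-LDP) and should be checked against the paper. The function names (`layer_range`, `perturb_weight`, `split_and_shuffle`, `aggregate`, `demo`) are hypothetical. Encryption of the min/max exchange and any network transport are omitted, and shuffling is modeled as a plain list shuffle.

```python
import math
import random


def layer_range(layer_weights):
    """Return (center c, radius r) so that every weight lies in [c - r, c + r]."""
    lo, hi = min(layer_weights), max(layer_weights)
    return (lo + hi) / 2, (hi - lo) / 2


def perturb_weight(w, c, r, eps, rng=random):
    """Two-point randomizer for one weight (assumed form, see lead-in)."""
    w = min(max(w, c - r), c + r)  # clip into the agreed range
    scale = r * (math.exp(eps) + 1) / (math.exp(eps) - 1)
    if scale == 0:  # degenerate range: nothing to perturb
        return c
    # Probability of reporting the upper point; 1/2 when w == c.
    p_upper = 0.5 + (w - c) / (2 * scale)
    return c + scale if rng.random() < p_upper else c - scale


def split_and_shuffle(client_updates, rng=random):
    """Split each update into (weight_index, value) pieces and shuffle them.

    The pieces carry no client identifier, so the aggregator in this
    sketch cannot tell which pieces came from the same client.
    """
    pieces = [(i, v) for update in client_updates for i, v in enumerate(update)]
    rng.shuffle(pieces)
    return pieces


def aggregate(pieces, num_weights):
    """Average the reported values for each weight index."""
    sums = [0.0] * num_weights
    counts = [0] * num_weights
    for i, v in pieces:
        sums[i] += v
        counts[i] += 1
    return [s / n for s, n in zip(sums, counts)]


def demo(num_clients=20000, eps=1.0, seed=0):
    """Simplified setting: every client holds the same three weights."""
    rng = random.Random(seed)
    true_weights = [0.03, -0.05, 0.0]
    c, r = layer_range(true_weights)
    updates = [
        [perturb_weight(w, c, r, eps, rng) for w in true_weights]
        for _ in range(num_clients)
    ]
    return aggregate(split_and_shuffle(updates, rng), len(true_weights))


if __name__ == "__main__":
    print(demo())
```

Each individual report is one of only two values and so reveals little about the true weight, yet the per-index averages returned by `demo()` should land close to the true weights when the number of clients is large.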

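Experimental Results

Research Questions

RQ1: Can LDP-FL provide a meaningful privacy guarantee in federated learning without degrading performance?
RQ2: How does weight-wise perturbation affect the bias and variance of the aggregated model?
RQ3: Do splitting and shuffling updates mitigate the high-dimensionality problem of deep learning under LDP?
RQ4: Which practical ε values maintain competitive accuracy on standard benchmarks?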

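Key Findings

LDP-FL incurs an accuracy loss of 0.97% on MNIST at ε = 1.
LDP-FL incurs an accuracy loss of 1.32% on Fashion-MNIST at ε = 4.
LDP-FL incurs an accuracy loss of 1.09% on CIFAR-10 at ε = 10.
The method reaches accuracy competitive with existing LDP approaches in fewer communication rounds (for example, 10 for MNIST).
In theory, the estimate of the average weight is unbiased, and its variance bound tightens as the number of clients grows.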
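
The zero-bias and variance claims can be checked with a short derivation. It assumes a two-point mechanism of the following form, which is consistent with the description in this review but is not quoted from the paper: a weight w in [c - r, c + r] is reported as one of two values placed symmetrically around c, and the n clients perturb independently.

```latex
% Assumption: w \in [c - r, c + r], \quad \alpha = \frac{e^{\varepsilon}+1}{e^{\varepsilon}-1}
\begin{align*}
\Pr[w^* = c + r\alpha] &= \tfrac{1}{2} + \frac{w - c}{2 r \alpha}, \qquad
\Pr[w^* = c - r\alpha] = \tfrac{1}{2} - \frac{w - c}{2 r \alpha} \\
\mathbb{E}[w^*] &= c + r\alpha \cdot \frac{w - c}{r\alpha} = w \\
\operatorname{Var}[w^*] &= r^2\alpha^2 - (w - c)^2 \;\le\; r^2\alpha^2 \\
\operatorname{Var}\Big[\tfrac{1}{n}\sum_{u=1}^{n} w_u^*\Big] &\le \frac{r^2\alpha^2}{n}
\end{align*}
```

Under this assumed form, each output probability lies between 1/(e^ε + 1) and e^ε/(e^ε + 1), so the ratio between any two inputs is at most e^ε, which is the ε-LDP condition. The 1/n factor in the last line is what makes the bound tighten as more clients participate.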


This review was generated by AI and reviewed by a human editor.