QUICK REVIEW

[Paper Review] Can You Really Backdoor Federated Learning?

Ziteng Sun, Peter Kairouz | arXiv (Cornell University) | November 18, 2019

Privacy-Preserving Technologies in Data | References: 31 | Citations: 368

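One-Line Summary

This paper studies backdoor (targeted) model update poisoning attacks in federated learning, analyzes the attack models on EMNIST with realistic non-IID data, and evaluates defenses such as norm clipping and weak differential privacy.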

ABSTRACT

The decentralized nature of federated learning makes detecting and defending against adversarial attacks a challenging task. This paper focuses on backdoor attacks in the federated learning setting, where the goal of the adversary is to reduce the performance of the model on targeted tasks while maintaining good performance on the main task. Unlike existing works, we allow non-malicious clients to have correctly labeled samples from the targeted tasks. We conduct a comprehensive study of backdoor attacks and defenses for the EMNIST dataset, a real-life, user-partitioned, and non-iid dataset. We observe that in the absence of defenses, the performance of the attack largely depends on the fraction of adversaries present and the "complexity" of the targeted task. Moreover, we show that norm clipping and "weak" differential privacy mitigate the attacks without hurting the overall performance. We have implemented the attacks and defenses in TensorFlow Federated (TFF), a TensorFlow framework for federated learning. In open-sourcing our code, our goal is to encourage researchers to contribute new attacks and defenses and evaluate them on standard federated datasets.

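Research Motivation and Goals

Motivates and formalizes backdoor attacks in federated learning in which non-malicious clients also hold correctly labeled samples from the targeted tasks.
Quantifies how attack success depends on the fraction of adversaries and the complexity of the targeted task.
Evaluates defenses such as norm clipping and weak differential privacy in a realistic federated setting.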

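Proposed Method

Performs a model update poisoning attack in which attackers submit boosted updates to replace the global model with a backdoored version.
The attack models include random sampling of compromised clients and fixed-frequency attackers.
Defines a backdoor task as mislabeling a target class across multiple targeted clients (e.g., labeling 7 as 1).
Includes norm-based update clipping and Gaussian noise addition (weak DP) as defense mechanisms.
Runs experiments on EMNIST with a 5-layer CNN in TensorFlow Federated, varying the number of backdoor tasks and the fraction of attackers.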
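
The boosted-update idea in the first bullet can be illustrated with a small sketch. This is not the paper's TensorFlow Federated code. It assumes unweighted averaging of client updates, a server learning rate of 1, and honest updates that are exactly zero (a model near convergence); the function names `fedavg_step` and `boosted_update` are hypothetical.

```python
def fedavg_step(global_w, updates, server_lr=1.0):
    """Apply the unweighted mean of client updates to the global model."""
    n = len(updates)
    mean = [sum(u[i] for u in updates) / n for i in range(len(global_w))]
    return [w + server_lr * m for w, m in zip(global_w, mean)]


def boosted_update(global_w, backdoor_w, boost):
    """Attacker's update: the gap to the backdoored model, scaled by `boost`."""
    return [boost * (b - w) for b, w in zip(backdoor_w, global_w)]


# Toy 3-parameter "model" (values are illustrative only).
global_w = [0.5, -1.0, 2.0]
backdoor_w = [0.7, -1.2, 2.5]

num_clients = 10
# Assumption: the 9 honest clients contribute zero updates this round.
honest = [[0.0, 0.0, 0.0] for _ in range(num_clients - 1)]
# With unweighted averaging, a boost equal to the number of sampled
# clients cancels the 1/num_clients factor applied by the server.
malicious = boosted_update(global_w, backdoor_w, boost=num_clients)

new_w = fedavg_step(global_w, honest + [malicious])
```

Under these assumptions `new_w` matches `backdoor_w` up to floating-point error, so a single attacker replaces the global model in one round. The boosted update is also `num_clients` times larger than the unboosted one, which is the property that norm-based clipping exploits.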

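Experimental Results

Research Questions

RQ1: How effective are backdoor (targeted) model update poisoning attacks in federated learning under a realistic EMNIST distribution?
RQ2: How do the fraction of attackers and the complexity of the backdoor task affect the attack success rate?
RQ3: Can norm clipping and weak differential privacy mitigate backdoor attacks without seriously hurting main task performance?
RQ4: How does the number of backdoor tasks affect the ability to backdoor a model in federated learning?
RQ5: How do the random sampling and fixed-frequency attacker models compare in effectiveness?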

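Key Findings

The success of backdoor attacks depends heavily on the fraction of adversaries in the system.
The attack is effective only when the fraction of compromised clients is non-negligible (e.g., effectiveness drops when it is below 1%).
Norm clipping bounds the norm of each update and substantially lowers the backdoor success rate.
Adding a small amount of Gaussian noise (weak DP) further mitigates the attack with limited impact on main task performance.
Increasing the number of backdoor tasks makes it harder to fit the malicious model while maintaining main task accuracy.
In the experiments, fixed-frequency attacks are slightly more effective than random sampling.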
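
The two defenses named in these findings can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes unweighted averaging, L2-norm clipping of each client update to a threshold `max_norm`, and independent Gaussian noise added to each coordinate of the aggregate; the names `clip_update` and `aggregate_with_defense` and all numeric values are hypothetical.

```python
import math
import random


def l2_norm(v):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in v))


def clip_update(update, max_norm):
    """Scale the update down so its L2 norm is at most `max_norm`."""
    norm = l2_norm(update)
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def aggregate_with_defense(updates, max_norm, noise_std, rng):
    """Clip every update, average them, then add Gaussian noise per coordinate."""
    clipped = [clip_update(u, max_norm) for u in updates]
    n = len(clipped)
    dim = len(clipped[0])
    return [
        sum(u[i] for u in clipped) / n + rng.gauss(0.0, noise_std)
        for i in range(dim)
    ]


# Nine small honest updates and one boosted malicious update (toy values).
honest = [[0.01, -0.02, 0.01] for _ in range(9)]
malicious = [2.0, -2.0, 5.0]

rng = random.Random(0)  # seeded only to make the sketch reproducible
defended = aggregate_with_defense(
    honest + [malicious], max_norm=0.1, noise_std=0.001, rng=rng
)
```

Honest updates with a norm below `max_norm` pass through unchanged, while the boosted update is scaled down to norm `max_norm`, so its influence on the average is bounded no matter how large the boost factor is.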


This review was generated by AI and reviewed by a human editor.