QUICK REVIEW

[Paper Review] Learning to Detect Malicious Clients for Robust Federated Learning

Suyi Li, Yong Cheng | arXiv (Cornell University) | 2020-02-01

Privacy-Preserving Technologies in Data | References: 24 | Citations: 185

One-Line Summary

This paper introduces a server-side spectral anomaly detection framework that identifies and removes malicious client updates in federated learning, defending against both untargeted (Byzantine) and targeted (backdoor) attacks.

ABSTRACT

Federated learning systems are vulnerable to attacks from malicious clients. As the central server in the system cannot govern the behaviors of the clients, a rogue client may initiate an attack by sending malicious model updates to the server, so as to degrade the learning performance or enforce targeted model poisoning attacks (a.k.a. backdoor attacks). Therefore, timely detecting these malicious model updates and the underlying attackers becomes critically important. In this work, we propose a new framework for robust federated learning where the central server learns to detect and remove the malicious model updates using a powerful detection model, leading to targeted defense. We evaluate our solution in both image classification and sentiment analysis tasks with a variety of machine learning models. Experimental results show that our solution ensures robust federated learning that is resilient to both the Byzantine attacks and the targeted model poisoning attacks.

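Motivation and Objectives

Promote robust federated learning by addressing malicious client updates that can degrade or poison the model.
Propose a spectral anomaly detection framework that detects malicious updates in a low-dimensional embedding.
Show that detecting and removing malicious updates yields robust performance across a variety of tasks and models.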

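Proposed Method

Embed model updates into a low-dimensional latent space with a variational autoencoder (VAE).
Train the VAE on benign (unbiased) updates obtained from public data, so that anomalies can be detected from reconstruction error.
Apply a dynamic threshold at the server in each round to decide which updates to discard.
Exclude only the anomalous updates from aggregation, and combine the remaining updates with FedAvg-style weights based on local data size.
Provide theoretical intuition for the impact of malicious updates through a linear-model analysis (Theorem 1).
Benchmark against standard defenses (GeoMed, Krum) on image classification and sentiment analysis tasks.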
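
The detect-then-aggregate steps listed above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: a PCA projection stands in for the VAE, the per-round threshold is assumed to be the mean reconstruction error of that round, and the function names (`fit_linear_encoder`, `reconstruction_errors`, `robust_aggregate`) are hypothetical.

```python
import numpy as np

def fit_linear_encoder(benign_updates, latent_dim):
    """Fit a PCA projection on benign updates (a stand-in for the VAE)."""
    mean = benign_updates.mean(axis=0)
    # Rows of vt are the principal directions of the centered benign updates.
    _, _, vt = np.linalg.svd(benign_updates - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def reconstruction_errors(updates, mean, components):
    """Squared error after projecting each update to the latent space and back."""
    centered = updates - mean
    reconstructed = centered @ components.T @ components
    return ((centered - reconstructed) ** 2).sum(axis=1)

def robust_aggregate(updates, data_sizes, errors):
    """Drop high-error updates, then average the rest weighted by data size."""
    # Assumed threshold rule: the mean reconstruction error of this round.
    threshold = errors.mean()
    # At least one error is <= the mean, so `keep` is never empty.
    keep = errors <= threshold
    weights = data_sizes[keep] / data_sizes[keep].sum()
    return weights @ updates[keep], keep
```

All inputs are expected to be NumPy arrays, with one flattened model update per row. Because the threshold is recomputed from each round's errors, nothing needs to be tuned by hand, which matches the dynamic-threshold idea described in the list.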

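Experimental Results

Research Questions

RQ1: Can spectral anomaly detection separate malicious updates from benign ones in FL under heterogeneous data?
RQ2: Does detecting and removing malicious updates improve robustness against untargeted and backdoor attacks across architectures and datasets?
RQ3: How does the proposed method compare in practice with existing Byzantine-robust aggregators?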

Key Results

Dataset	Additive Noise Attack (F1)	Sign-flipping Attack (F1)	Backdoor Attack (F1)
FEMNIST	1.00	0.97	0.87
MNIST	1.00	0.99	1.00
Sentiment140	1.00	1.00	0.93
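
For orientation, the two untargeted attacks in the table can be sketched as simple transformations of a client's update. This follows a common formulation of these attacks and is an assumption rather than the paper's exact setup; the `scale` and `std` parameters and the function names are hypothetical. The backdoor attack is left out because it depends on poisoned training data rather than on a closed-form change to the update.

```python
import numpy as np

def sign_flipping(update, scale=1.0):
    """Return the update with its sign reversed (optionally scaled)."""
    return -scale * np.asarray(update, dtype=float)

def additive_noise(update, std, rng):
    """Return the update perturbed by zero-mean Gaussian noise."""
    update = np.asarray(update, dtype=float)
    return update + rng.normal(0.0, std, size=update.shape)
```

A malicious client would send the transformed vector to the server in place of its honest update.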

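The proposed method is more robust than the baselines against sign-flipping, additive noise, and backdoor attacks on MNIST, FEMNIST, and Sentiment140.
The detection model achieves high F1 scores in separating benign from malicious clients (F1 for additive noise / sign-flipping / backdoor: FEMNIST 1.00/0.97/0.87; MNIST 1.00/0.99/1.00; Sentiment140 1.00/1.00/0.93).
Dynamic thresholding on reconstruction error effectively removes malicious updates in each round without manual tuning.
The approach retains convergence behavior similar to FedAvg while providing a targeted defense against adversaries.
Compared with Krum and GeoMed, the proposed method provides a stronger defense against targeted backdoor attacks on several datasets.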
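
As a reminder of how a detection F1 score such as those reported above is computed, here is a small sketch. It assumes that malicious clients are treated as the positive class, which the review does not state; the function name `detection_f1` is hypothetical.

```python
def detection_f1(is_malicious, flagged):
    """F1 score of the detector, with malicious clients as the positive class."""
    pairs = list(zip(is_malicious, flagged))
    tp = sum(1 for truth, pred in pairs if truth and pred)
    fp = sum(1 for truth, pred in pairs if not truth and pred)
    fn = sum(1 for truth, pred in pairs if truth and not pred)
    if tp == 0:
        # No malicious client was correctly flagged.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For example, with three malicious clients of which two are flagged, plus one benign client flagged by mistake, precision and recall are both 2/3, so the F1 score is also 2/3.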

This review was generated by AI and reviewed by a human editor.