QUICK REVIEW

[Paper Review] Salvaging Federated Learning by Local Adaptation

Changyuan Yu, Eugene Bagdasaryan | arXiv (Cornell University) | 2020-02-12

Privacy-Preserving Technologies in Data | References: 43 | Citations: 68

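One-Line Summary

This paper shows that differential privacy and robust aggregation can reduce per-user accuracy in federated learning, but that local adaptation techniques (fine-tuning, multi-task learning, knowledge distillation) can recover and even improve accuracy for individual participants without changing the global FL framework.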

ABSTRACT

Federated learning (FL) is a heavily promoted approach for training ML models on sensitive data, e.g., text typed by users on their smartphones. FL is expressly designed for training on data that are unbalanced and non-iid across the participants. To ensure privacy and integrity of the federated model, latest FL approaches use differential privacy or robust aggregation. We look at FL from the local viewpoint of an individual participant and ask: (1) do participants have an incentive to participate in FL? (2) how can participants individually improve the quality of their local models, without re-designing the FL framework and/or involving other participants? First, we show that on standard tasks such as next-word prediction, many participants gain no benefit from FL because the federated model is less accurate on their data than the models they can train locally on their own. Second, we show that differential privacy and robust aggregation make this problem worse by further destroying the accuracy of the federated model for many participants. Then, we evaluate three techniques for local adaptation of federated models: fine-tuning, multi-task learning, and knowledge distillation. We analyze where each is applicable and demonstrate that all participants benefit from local adaptation. Participants whose local models are poor obtain big accuracy improvements over conventional FL. Participants whose local models are better than the federated model (and who have no incentive to participate in FL today) improve less, but sufficiently to make the adapted federated model better than their local models.

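Research Motivation and Goals

Assess whether standard federated learning benefits individual participants under non-iid data and under privacy and robustness protections.
Evaluate how much local adaptation techniques can improve participants' models without changing the FL aggregation framework.
Identify which adaptation methods work best across different participant data characteristics and privacy regimes.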

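Proposed Method

Evaluate BASIC-FED, DP-FED, and ROBUST-FED on next-word prediction (Reddit) and on CIFAR-10 image classification (non-iid split drawn from a Dirichlet distribution).
Test three local adaptation methods: fine-tuning (FT) of all parameters, with a freeze-base (FB) variant; multi-task learning (MTL) via elastic weight consolidation; and knowledge distillation (KD) from the federated model as teacher to a student model.
Compare the adapted models, on each participant's own data, against that participant's locally trained model and against the unadapted federated model.
Use standard NLP and vision tasks and neural network architectures: a two-layer LSTM with 200 hidden units for word prediction, and ResNet-18 for CIFAR-10.
Report per-participant accuracy and aggregate trends to understand what drives changes in the incentive to participate.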
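
The three adaptation methods listed above differ mainly in the objective each participant optimizes locally. The following is a minimal NumPy sketch of those objectives, not the paper's implementation. The function names, the mixing weight `alpha`, the temperature, the penalty weight `lam`, the use of a diagonal Fisher estimate as the importance weight, and the choice of "last parameter tensor" as the unfrozen top layer are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax with temperature scaling."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels under softmax(logits)."""
    p = softmax(logits)
    return float(-np.log(p[np.arange(len(labels)), labels]).mean())

def kd_loss(student_logits, teacher_logits, labels, alpha=0.5, temperature=2.0):
    """KD (simplified): mix the task loss with the cross-entropy between the
    softened teacher (federated model) outputs and the student outputs."""
    soft_teacher = softmax(teacher_logits, temperature)
    log_soft_student = np.log(softmax(student_logits, temperature))
    distill = float(-(soft_teacher * log_soft_student).sum(axis=-1).mean())
    return alpha * cross_entropy(student_logits, labels) + (1 - alpha) * distill

def ewc_loss(task_loss, params, fed_params, fisher, lam=1.0):
    """MTL via EWC: penalize drift from the federated weights, weighted by a
    per-parameter importance estimate (assumed here: diagonal Fisher)."""
    penalty = sum(float((f * (p - p0) ** 2).sum())
                  for p, p0, f in zip(params, fed_params, fisher))
    return task_loss + 0.5 * lam * penalty

def fine_tune_step(params, grads, lr=0.1, freeze_base=False):
    """One SGD step. FT updates every tensor; FB updates only the last tensor
    (assumed to be the top layer) and leaves the base unchanged."""
    first_trainable = len(params) - 1 if freeze_base else 0
    return [p - lr * g if i >= first_trainable else p.copy()
            for i, (p, g) in enumerate(zip(params, grads))]
```

In this sketch, each participant would start from the federated model's weights and run steps on its own data only, so nothing in the global aggregation changes. Setting `alpha=1.0` reduces `kd_loss` to the plain task loss, and `ewc_loss` equals the task loss when the local weights have not moved from the federated ones.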

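Experimental Results

Research Questions

RQ1: Does a federated model with privacy or robustness protections outperform each participant's local model on that participant's own data?
RQ2: Can participants adapt the federated model locally to raise accuracy without changing FL aggregation?
RQ3: Which local adaptation technique best recovers or improves accuracy across different participant data distributions and privacy settings?
RQ4: How do data characteristics (vocabulary size, total word count) affect the effectiveness of local adaptation?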

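Key Findings

Privacy and robustness protections reduce the accuracy of the federated model for many individual participants.
Adaptation techniques generally recover the federated model's accuracy for individual participants and often improve it beyond that of their local models.
For word prediction, the mean accuracy improvement from adaptation is 2.32% (BASIC-FED), 2.12% (DP-FED), and 2.12% (ROBUST-FED).
For image classification, the mean accuracy improvement from adaptation is 2.98% (BASIC-FED), 6.83% (DP-FED), and 6.34% (ROBUST-FED).
For most participants the adapted model outperforms the local model, with the largest gains for participants whose initial local models were poor.
Adaptation also improves the federated model for participants with good local models; in many cases the adapted federated model is competitive with or better than their local models.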

This review was generated by AI and reviewed by a human editor.