QUICK REVIEW

[Paper Review] Mitigating Byzantine Attacks in Federated Learning

Saurav Prakash, Amir Salman Avestimehr|arXiv (Cornell University)|Oct 15, 2020

Privacy-Preserving Technologies in Data|References: 24|Citations: 28

One-Line Summary

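DiverseFL is a new Byzantine-resilient federated learning framework that handles non-IID data distributions, a variable Byzantine fault model, and non-convex optimization. It relies on per-client guiding gradients that the server computes from a small data sample supplied by each client. The server identifies Byzantine clients by comparing each client's update against that client's own guiding gradient, then updates the global model using only the gradients of non-flagged clients, achieving performance close to Oracle SGD on benchmark datasets.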

ABSTRACT

Prior solutions for mitigating Byzantine failures in federated learning, such as element-wise median of the stochastic gradient descent (SGD) based updates from the clients, tend to leverage the similarity of updates from the non-Byzantine clients. However, when data is non-IID, as is typical in mobile networks, the updates received from non-Byzantine clients are quite diverse, resulting in poor convergence performance of such approaches. On the other hand, current algorithms that address heterogeneous data distribution across clients are limited in scope and do not perform well when there is variability in the number and identities of the Byzantine clients, or when general non-convex loss functions are considered. We propose 'DiverseFL' that jointly addresses three key challenges of Byzantine resilient federated learning -- (i) non-IID data distribution across clients, (ii) variable Byzantine fault model, and (iii) generalization to non-convex and non-smooth optimization. DiverseFL leverages computing capability of the federated learning server that for each iteration, computes a 'guiding' gradient for each client over a tiny sample of data received only once from the client before start of the training. The server uses 'per client' criteria for flagging Byzantine clients, by comparing the corresponding guiding gradient with the client's gradient update. The server then updates the model using the gradients received from the non-flagged clients. As we demonstrate in our experiments with benchmark datasets and popular Byzantine attacks, our proposed approach performs better than the prior algorithms, almost matching the performance of the 'Oracle SGD', where the server knows the identities of the Byzantine clients.

Motivation and Objectives

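Address the failure of conventional median-based aggregation when data is non-IID across clients.
Overcome the limitations of existing methods, which break down under a variable Byzantine fault model or with general non-convex loss functions.
Enable resilient model training in realistic federated learning settings where the number and identities of Byzantine clients are unpredictable.
Improve convergence and generalization without prior knowledge of which clients are Byzantine and without assuming that honest clients hold similar data.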
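
The first point above can be illustrated with a toy example. The numbers below are invented for illustration and are not from the paper: three honest clients each hold data that moves a different coordinate, a deliberately extreme stand-in for non-IID data. Even with no Byzantine client present, the element-wise median discards every honest signal, while the plain mean keeps it.

```python
import numpy as np

# Hypothetical honest updates: each client only moves "its own" coordinate,
# an extreme caricature of non-IID data. No client here is Byzantine.
honest_updates = np.array([
    [3.0, 0.0, 0.0],
    [0.0, 3.0, 0.0],
    [0.0, 0.0, 3.0],
])

# Plain averaging keeps the contribution of every client.
mean_update = honest_updates.mean(axis=0)

# The element-wise median sees two zeros per coordinate and returns zero,
# so the aggregate carries no descent direction at all.
median_update = np.median(honest_updates, axis=0)

print("mean:  ", mean_update)
print("median:", median_update)
```

Real gradients are far less cleanly separated, so this only shows the direction of the effect, not its size.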

Proposed Method

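Before training starts, each client sends the server a small data sample once; at every iteration the server uses that sample to compute a "guiding" gradient for the client.
For each client, the server compares the received gradient update with that client's guiding gradient, applying a per-client criterion to detect anomalies.
Clients whose gradients deviate markedly from their guiding gradients are flagged as potentially Byzantine.
The global model is updated using only the gradients from non-flagged clients, which keeps flagged updates out of the aggregate.
The method is designed for general non-convex and non-smooth loss functions, so its scope extends beyond convex settings.
Because each sample is sent only once before training, detection requires no additional client communication during training and instead relies on server-side computation.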
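
The steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the review does not state the exact flagging criterion, so the sketch assumes a direction check (positive inner product with the guiding gradient) combined with a norm-ratio check. The function names flag_client and aggregate_non_flagged and the thresholds eps_low and eps_high are hypothetical.

```python
import numpy as np

def flag_client(client_grad, guide_grad, eps_low=0.5, eps_high=2.0):
    """Return True if the update looks Byzantine under the ASSUMED criteria.

    Assumption (not taken from the review): a client passes only if its
    update points in roughly the same direction as its guiding gradient
    and has a comparable norm. Thresholds are illustrative.
    """
    guide_norm = np.linalg.norm(guide_grad)
    if guide_norm == 0.0:
        return True  # assumption: no usable reference, so flag conservatively
    same_direction = float(np.dot(client_grad, guide_grad)) > 0.0
    ratio = np.linalg.norm(client_grad) / guide_norm
    return not (same_direction and eps_low <= ratio <= eps_high)

def aggregate_non_flagged(client_grads, guide_grads):
    """Average the updates of clients that were not flagged this round."""
    kept = [g for g, h in zip(client_grads, guide_grads)
            if not flag_client(g, h)]
    if not kept:
        # Every client was flagged: return a zero update (skip the round).
        return np.zeros_like(client_grads[0]), 0
    return np.mean(kept, axis=0), len(kept)

# Toy round: clients 0 and 1 are honest but differ from each other
# (non-IID); client 2 sends a large update in the opposite direction.
client_grads = [np.array([1.0, 0.2]), np.array([0.1, 1.0]), np.array([-5.0, -5.0])]
guide_grads = [np.array([0.9, 0.3]), np.array([0.2, 0.8]), np.array([1.0, 1.0])]

update, n_kept = aggregate_non_flagged(client_grads, guide_grads)
print(update, n_kept)
```

Each client is compared only with its own guiding gradient, so the two honest clients are kept even though their updates differ from each other. Flags are recomputed every round, which is what lets the set of Byzantine clients change over time.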

Experimental Results

Research Questions

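RQ1: Can a Byzantine-resilient federated learning method maintain high performance under non-IID data distributions, where conventional median-based methods fail?
RQ2: How effective is the per-client guiding-gradient mechanism at detecting Byzantine clients when their number and identities change from one training round to the next?
RQ3: How close can the server-side detection mechanism come to the performance of Oracle SGD, which knows the Byzantine clients in advance?
RQ4: Does the proposed method generalize to the non-convex and non-smooth optimization problems common in deep learning?
RQ5: In realistic Byzantine attack scenarios on standard benchmark datasets, how much does the method improve on prior state-of-the-art approaches?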

Key Findings

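On multiple benchmark datasets, DiverseFL achieves convergence performance nearly matching Oracle SGD, in which the server knows the identities of the Byzantine clients.
In non-IID settings, where conventional median-based and other robust aggregation methods degrade because honest client gradients are diverse, DiverseFL performs markedly better.
DiverseFL maintains resilient, consistent performance even when the number and identities of Byzantine clients vary across training rounds.
Using per-client guiding gradients allows Byzantine clients to be detected accurately without assuming gradient similarity among honest clients.
The method generalizes to non-convex and non-smooth loss functions, which makes it suitable for practical deep learning applications.
Experiments on standard federated learning benchmarks confirm that, compared with existing baselines, DiverseFL substantially reduces the loss of model accuracy under a range of Byzantine attacks.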


This review was generated by AI and checked by a human editor.