QUICK REVIEW

[Paper Review] Optimal Client Sampling for Federated Learning

Wenlin Chen, Samuel Horváth|arXiv (Cornell University)|Oct 26, 2020

Privacy-Preserving Technologies in Data | References: 47 | Citations: 95

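ONE-SENTENCE SUMMARY

Proposes an adaptive, privacy-aware optimal client sampling scheme that reduces communication in federated learning, provides convergence guarantees for DSGD and FedAvg, and achieves performance close to full participation.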

ABSTRACT

It is well understood that client-master communication can be a primary bottleneck in Federated Learning. In this work, we address this issue with a novel client subsampling scheme, where we restrict the number of clients allowed to communicate their updates back to the master node. In each communication round, all participating clients compute their updates, but only the ones with "important" updates communicate back to the master. We show that importance can be measured using only the norm of the update and give a formula for optimal client participation. This formula minimizes the distance between the full update, where all clients participate, and our limited update, where the number of participating clients is restricted. In addition, we provide a simple algorithm that approximates the optimal formula for client participation, which only requires secure aggregation and thus does not compromise client privacy. We show both theoretically and empirically that for Distributed SGD (DSGD) and Federated Averaging (FedAvg), the performance of our approach can be close to full participation and superior to the baseline where participating clients are sampled uniformly. Moreover, our approach is orthogonal to and compatible with existing methods for reducing communication overhead, such as local methods and communication compression methods.
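
The objective described in the abstract can be written out as a worked equation. This is a sketch under stated assumptions, not a quotation of the paper's notation: w_i denotes the aggregation weight of client i, U_i its update, S the random set of communicating clients, and p_i the probability that client i is included, with clients included independently and at most m expected to communicate.

```latex
\mathbb{E}\left\| \sum_{i \in S} \frac{w_i}{p_i}\, U_i - \sum_{i=1}^{n} w_i U_i \right\|^2
  = \sum_{i=1}^{n} \frac{1 - p_i}{p_i}\, w_i^2 \,\| U_i \|^2 ,
\qquad \text{subject to } \sum_{i=1}^{n} p_i \le m, \quad 0 < p_i \le 1 .
```

The right-hand side depends on each client only through w_i‖U_i‖, which is consistent with the abstract's claim that importance can be measured by the norm of the update alone. Minimizing it under the budget yields probabilities proportional to w_i‖U_i‖, capped at 1.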

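MOTIVATION AND OBJECTIVES

Address the communication bottleneck in cross-device federated learning
Develop an adaptive client sampling strategy that selects informative updates
Ensure compatibility with secure aggregation and stateless clients
Provide convergence guarantees for DSGD and FedAvg in convex and non-convex settings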

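PROPOSED METHOD

Model partial participation as a random set S drawn by independent sampling, where client i is included with probability p_i
Derive closed-form optimal sampling probabilities p_i^k that minimize the variance of the gradient estimator under a budget m (Equation 7)
Provide an approximate, privacy-preserving algorithm (Algorithm 2) that remains near-optimal without disclosing individual update norms
Show compatibility with secure aggregation and stateless clients through an aggregation-only approach
Present convergence analyses of DSGD and FedAvg under convexity and non-convexity assumptions (Theorems 13–18)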
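
The two sampling rules listed above can be illustrated with a minimal Python sketch. It is not a reproduction of the paper's Equation 7 or Algorithm 2: it assumes the standard variance-minimizing solution for independent sampling (probabilities proportional to the update norm, capped at 1, summing to the budget m), and the function names `optimal_probabilities` and `approximate_probabilities` are hypothetical. In the second function, plain NumPy sums stand in for the quantities a master would obtain through secure aggregation.

```python
import numpy as np

def optimal_probabilities(norms, m):
    """Exact rule (assumed form): p_i = min(1, c * norm_i) with sum(p) == m."""
    norms = np.asarray(norms, dtype=float)
    n = norms.size
    if not 0 < m <= n:
        raise ValueError("budget m must satisfy 0 < m <= n")
    p = np.zeros(n)
    capped = np.zeros(n, dtype=bool)
    while True:
        free = ~capped
        total = norms[free].sum()
        if total == 0:                      # remaining updates are all zero
            break
        # Spread the remaining budget proportionally to the norms.
        p_free = (m - capped.sum()) * norms[free] / total
        over = p_free > 1.0 + 1e-12
        if not over.any():
            p[free] = np.minimum(p_free, 1.0)
            break
        # Clients whose share exceeds 1 always communicate; redo the rest.
        capped[np.flatnonzero(free)[over]] = True
    p[capped] = 1.0
    return p

def approximate_probabilities(norms, m, rounds=4):
    """Aggregate-only approximation: the master sees sums, never single norms."""
    norms = np.asarray(norms, dtype=float)
    n = norms.size
    total = norms.sum()                     # aggregate 1: sum of all norms
    if total == 0:
        return np.full(n, m / n)
    p = np.minimum(m * norms / total, 1.0)  # each client computes this locally
    for _ in range(rounds):
        below = p < 1.0
        count = below.sum()                 # aggregate 2: number of uncapped clients
        mass = p[below].sum()               # aggregate 3: their total probability
        if count == 0 or mass == 0:
            break
        scale = (m - (n - count)) / mass    # budget left for uncapped clients
        if scale <= 1.0:
            break
        p[below] = np.minimum(scale * p[below], 1.0)
    return p

if __name__ == "__main__":
    norms = [1.0, 1.0, 1.0, 1.0, 10.0]      # hypothetical update norms
    print(optimal_probabilities(norms, m=2))
    print(approximate_probabilities(norms, m=2))
```

In this toy input, the client with the dominant norm is always selected and the remaining budget is split evenly among the others. The iterative version only exchanges a count and two sums per round, which is the kind of information an aggregation-only protocol can provide without revealing any individual client's norm.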

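EXPERIMENTAL RESULTS

Research Questions

RQ1: How can the variance of the master update be minimized under a fixed participation budget m?
RQ2: Can the proposed sampling be implemented with secure aggregation and stateless clients without leaking private information?
RQ3: What convergence guarantees hold for DSGD and FedAvg in both convex and non-convex settings when optimal client sampling is used?
RQ4: In practice, how does the proposed method compare with full participation and uniform sampling?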

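Key Findings

The optimal sampling scheme minimizes the variance of the gradient estimator under the participation budget via independent sampling.
A closed-form solution for p_i^k is provided (Equation 7), along with an efficient approximation (Algorithm 2) suited to privacy constraints.
The method achieves performance close to full participation and outperforms uniform sampling both theoretically and empirically.
The sampling is orthogonal to and compatible with local updates, gradient compression methods, secure aggregation, and stateless clients.
The proposed method admits larger learning rates than the uniform sampling baseline, improving communication efficiency and convergence.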
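
The gap between norm-based and uniform sampling reported above can be illustrated numerically. The sketch below is not taken from the paper's experiments: the data are synthetic, the helper names are hypothetical, equal weights 1/n are assumed, and the variance expression is the one that holds for independent sampling with an unbiased (1/p_i-rescaled) estimator.

```python
import numpy as np

def capped_proportional(norms, m):
    """p_i = min(1, c * norm_i) with sum(p) == m; assumes all norms are positive."""
    norms = np.asarray(norms, dtype=float)
    capped = np.zeros(norms.size, dtype=bool)
    while True:
        free = ~capped
        p_free = (m - capped.sum()) * norms[free] / norms[free].sum()
        over = p_free > 1.0 + 1e-12
        if not over.any():
            break
        capped[np.flatnonzero(free)[over]] = True
    p = np.ones(norms.size)
    p[free] = np.minimum(p_free, 1.0)
    return p

def sampling_variance(norms, p):
    """Variance of the rescaled partial average with weights 1/n:
    sum_i (1 - p_i) / p_i * (norm_i / n)^2."""
    norms = np.asarray(norms, dtype=float)
    n = norms.size
    return float(np.sum((1.0 - p) / p * (norms / n) ** 2))

rng = np.random.default_rng(0)
n, d, m = 100, 20, 10
# Synthetic heavy-tailed updates: a few clients carry most of the norm.
scales = rng.lognormal(mean=0.0, sigma=1.5, size=n)
updates = rng.normal(size=(n, d)) * scales[:, None]
norms = np.linalg.norm(updates, axis=1)

p_uniform = np.full(n, m / n)
p_optimal = capped_proportional(norms, m)
var_uniform = sampling_variance(norms, p_uniform)
var_optimal = sampling_variance(norms, p_optimal)
print(f"uniform: {var_uniform:.4f}  norm-based: {var_optimal:.4f}")
```

Because uniform probabilities are one feasible choice under the same budget, the norm-based variance can never exceed the uniform one; the more unequal the update norms, the larger the gap tends to be.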

This review was generated by AI and checked by a human editor.