QUICK REVIEW

[Paper Review] Papaya: Practical, Private, and Scalable Federated Learning

Dzmitry Huba, John Nguyen|arXiv (Cornell University)|Nov 8, 2021

Privacy-Preserving Technologies in Data | References: 30 | Citations: 29

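One-Line Summary

Papaya presents AsyncFL, a production-grade federated learning system that supports asynchronous secure aggregation and outperforms synchronous FL at scale in both speed and communication efficiency. Its server update frequency grows nearly linearly with concurrency, and it reduces bias relative to SyncFL with over-selection.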

ABSTRACT

Cross-device Federated Learning (FL) is a distributed learning paradigm with several challenges that differentiate it from traditional distributed learning, variability in the system characteristics on each device, and millions of clients coordinating with a central server being primary ones. Most FL systems described in the literature are synchronous - they perform a synchronized aggregation of model updates from individual clients. Scaling synchronous FL is challenging since increasing the number of clients training in parallel leads to diminishing returns in training speed, analogous to large-batch training. Moreover, stragglers hinder synchronous FL training. In this work, we outline a production asynchronous FL system design. Our work tackles the aforementioned issues, sketches of some of the system design challenges and their solutions, and touches upon principles that emerged from building a production FL system for millions of clients. Empirically, we demonstrate that asynchronous FL converges faster than synchronous FL when training across nearly one hundred million devices. In particular, in high concurrency settings, asynchronous FL is 5x faster and has nearly 8x less communication overhead than synchronous FL.

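Motivation and Objectives

Motivates the need for scalable cross-device FL that copes with heterogeneity and stragglers (slow devices).
Proposes an asynchronous FL design (AsyncFL) with buffered secure aggregation, so that client updates are accepted without waiting.
Demonstrates a production-scale evaluation across millions of devices, measuring convergence, throughput, and fairness.
Addresses design challenges such as client independence, high utilization, and fast model aggregation.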

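Proposed Method

Describes the AsyncFL algorithm (FedBuff), which has no rounds and updates the server model whenever an aggregation goal is reached.
Introduces asynchronous secure aggregation that uses a trusted execution environment to mask and unmask updates.
Presents a two-tier system design (Coordinator, Selector, Aggregator) that lets clients participate independently.
Describes a fast parallel aggregation pipeline with persistent aggregators and in-memory queues.
Details the client selection and replacement mechanism that keeps utilization near 100%.
Provides a benchmarking methodology that accounts for traffic and scale in a near-production setting.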
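To make the idea of round-free, aggregation-goal-driven updates more concrete, here is a minimal Python sketch of a buffered asynchronous server. It is an illustration under stated assumptions, not the paper's implementation: the class name `BufferedAggregator`, the fields `goal` and `server_lr`, and the staleness weight `1 / sqrt(1 + staleness)` are all hypothetical choices made for this example, and secure aggregation is omitted entirely.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BufferedAggregator:
    """Toy buffered async server: one server step per `goal` client deltas."""
    model: List[float]
    goal: int                 # aggregation goal (buffer size); hypothetical name
    server_lr: float = 1.0    # server learning rate; hypothetical name
    version: int = 0          # number of server updates applied so far
    _buffer: List[List[float]] = field(default_factory=list)

    def receive(self, delta: List[float], client_version: int) -> bool:
        """Accept one client delta; return True if a server update was applied."""
        # Staleness = server updates applied since the client fetched the model.
        staleness = self.version - client_version
        # Assumed down-weighting of stale deltas (an illustrative choice).
        weight = 1.0 / (1.0 + staleness) ** 0.5
        self._buffer.append([weight * d for d in delta])
        if len(self._buffer) < self.goal:
            return False  # keep buffering; the client never waits for a round
        # Goal reached: average the buffered deltas and take one server step.
        avg = [sum(col) / len(self._buffer) for col in zip(*self._buffer)]
        self.model = [w + self.server_lr * a for w, a in zip(self.model, avg)]
        self._buffer.clear()
        self.version += 1
        return True


server = BufferedAggregator(model=[0.0, 0.0], goal=2)
server.receive([1.0, 1.0], client_version=0)   # buffered, no update yet
server.receive([3.0, -1.0], client_version=0)  # goal reached, model updated
print(server.model, server.version)            # prints: [2.0, 0.0] 1
```

Because the server steps whenever the buffer fills rather than when a fixed cohort finishes, a slow client only delays its own contribution, which then arrives with higher staleness instead of blocking everyone else.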

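Experimental Results

Research Questions

RQ1: How does asynchronous federated learning (AsyncFL) compare with synchronous FL (SyncFL) in convergence speed at scale?
RQ2: Can asynchronous secure aggregation preserve privacy while enabling high utilization and low bias under heterogeneity?
RQ3: Which system design choices enable production-scale AsyncFL across millions of devices?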

Key Findings

Method	Overall	75%	99%	Time (hours)
SyncFL w/o OS	68.38	66.64	47.82	130.60
SyncFL with OS	72.97	73.10	73.24	18.63
AsyncFL	57.32	55.71	38.51	18.28

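AsyncFL converges faster than SyncFL, achieving up to a 5x wall-clock speedup in high-concurrency settings.
AsyncFL reduces communication overhead by up to 8x compared with SyncFL.
AsyncFL produces up to 30x more server model updates per unit time than SyncFL.
Over-selection in SyncFL introduces sampling bias against slow devices and clients with large amounts of data, which reduces model fairness.
AsyncFL keeps bias low, close to that of SyncFL without over-selection, while delivering faster training and higher throughput.
AsyncFL shows improved fairness: slow devices are less often unfairly excluded.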
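The sampling bias attributed to over-selection can be illustrated with a small simulation. This is a toy sketch, not the paper's experiment: the log-normal training times, the population of 10,000 clients, the aggregation goal of 100, and the 30% over-selection factor are all hypothetical values chosen for illustration. The server selects more clients than it needs, keeps the first `goal` to finish, and discards the rest.

```python
import random


def slow_client_share(num_rounds: int, goal: int, over_selection: float,
                      seed: int = 0) -> float:
    """Fraction of aggregated updates that come from the slowest 25% of clients."""
    rng = random.Random(seed)
    # Hypothetical population: each client has a fixed training time.
    times = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]
    cutoff = sorted(times)[int(0.75 * len(times))]  # slowest-quartile threshold
    used = slow = 0
    for _ in range(num_rounds):
        selected = rng.sample(times, round(goal * over_selection))
        # The server keeps the first `goal` finishers and drops the stragglers.
        finished = sorted(selected)[:goal]
        used += len(finished)
        slow += sum(t >= cutoff for t in finished)
    return slow / used


without_os = slow_client_share(num_rounds=200, goal=100, over_selection=1.0)
with_os = slow_client_share(num_rounds=200, goal=100, over_selection=1.3)
print(f"slow-client share without over-selection: {without_os:.3f}")
print(f"slow-client share with 30% over-selection: {with_os:.3f}")
```

Without over-selection the slowest quartile contributes roughly a quarter of the updates, as expected from uniform sampling. With over-selection the discarded updates come almost entirely from that quartile, so slow clients are heavily under-represented in what the server actually aggregates.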

This review was generated by AI and checked by a human editor.