QUICK REVIEW

[Paper Review] Federated Optimization: Distributed Machine Learning for On-Device Intelligence

Jakub Konečný, H. Brendan McMahan|arXiv (Cornell University)|Oct 8, 2016

Stochastic Gradient Optimization Techniques | References: 68 | Citations: 1,652

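One-Line Summary

This paper introduces Federated Optimization for massively distributed, non-IID, unbalanced data spread across a large number of devices, proposes a new algorithm suited to sparse convex problems, and reports encouraging experimental results toward minimizing the number of communication rounds.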

ABSTRACT

We introduce a new and increasingly relevant setting for distributed optimization in machine learning, where the data defining the optimization are unevenly distributed over an extremely large number of nodes. The goal is to train a high-quality centralized model. We refer to this setting as Federated Optimization. In this setting, communication efficiency is of the utmost importance and minimizing the number of rounds of communication is the principal goal. A motivating example arises when we keep the training data locally on users' mobile devices instead of logging it to a data center for training. In federated optimization, the devices are used as compute nodes performing computation on their local data in order to update a global model. We suppose that we have an extremely large number of devices in the network, as many as the number of users of a given service, each of which has only a tiny fraction of the total data available. In particular, we expect the number of data points available locally to be much smaller than the number of devices. Additionally, since different users generate data with different patterns, it is reasonable to assume that no device has a representative sample of the overall distribution. We show that existing algorithms are not suitable for this setting, and propose a new algorithm which shows encouraging experimental results for sparse convex problems. This work also sets a path for future research needed in the context of federated optimization.

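Motivation and Objectives

Introduce the Federated Optimization setting, in which data are massively distributed across many nodes and are non-IID.
Identify the limitations of existing distributed optimization methods in this setting.
Propose a new algorithm suited to sparse, distributed data and evaluate its communication efficiency.
Show that a centralized model can be trained in a small number of communication rounds in the federated setting.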

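Proposed Method

Formulate the federated optimization problem, in which data are spread across a very large number of nodes that perform local computation.
Develop a new distributed optimization approach that does not rely on IID samples or on heavily centralized data.
Exploit sparsity structure to design an effective algorithm for federated optimization.
Focus on minimizing the number of communication rounds while allowing substantial local computation on each device.
Frame each update as a small delta vector sent to the central server, reducing payload size and privacy concerns.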
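
To make the "local computation plus delta update" idea concrete, here is a minimal sketch of one communication round. It is not the algorithm proposed in the paper: it swaps in plain local SGD on a squared-error linear model, with the server averaging deltas weighted by each device's data count. The function names (`local_update`, `federated_round`, `make_devices`, `mse`), the learning rate, the loss, and the synthetic data are all assumptions made for illustration.

```python
import random

def local_update(w, data, lr=0.1, epochs=5):
    """Run a few passes of plain SGD on one device's data and return the delta.

    Assumption: squared-error loss on a linear model; the review does not
    specify the loss or the local solver.
    """
    w_local = list(w)
    for _ in range(epochs):
        for x, y in data:
            # Gradient of 0.5 * (w.x - y)^2 with respect to w is (w.x - y) * x
            err = sum(wi * xi for wi, xi in zip(w_local, x)) - y
            w_local = [wi - lr * err * xi for wi, xi in zip(w_local, x)]
    # Only the difference from the global model is sent back, never the raw data
    return [wl - wg for wl, wg in zip(w_local, w)]

def federated_round(w, devices):
    """One communication round: collect deltas and average them by data count."""
    n_total = sum(len(data) for data in devices)
    aggregate = [0.0] * len(w)
    for data in devices:
        delta = local_update(w, data)
        weight = len(data) / n_total  # devices with more data count for more
        aggregate = [a + weight * d for a, d in zip(aggregate, delta)]
    return [wi + ai for wi, ai in zip(w, aggregate)]

def make_devices(true_w, sizes, seed=0):
    """Hypothetical synthetic data: unbalanced sizes, device-specific inputs."""
    rng = random.Random(seed)
    devices = []
    for k, n_k in enumerate(sizes):
        shift = (k % 3) - 1  # each device sees inputs from its own region (non-IID)
        data = []
        for _ in range(n_k):
            x = [shift + rng.uniform(-0.5, 0.5) for _ in true_w]
            y = sum(wi * xi for wi, xi in zip(true_w, x))
            data.append((x, y))
        devices.append(data)
    return devices

def mse(w, devices):
    """Mean squared error of the global model over all devices' data."""
    errors = [
        (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2
        for data in devices
        for x, y in data
    ]
    return sum(errors) / len(errors)

if __name__ == "__main__":
    devices = make_devices([2.0, -3.0], sizes=[3, 40, 8, 1, 25])
    w = [0.0, 0.0]
    for round_id in range(10):
        w = federated_round(w, devices)
        print(round_id, mse(w, devices))
```

Each round costs one model broadcast and one delta per device, so doing more work inside `local_update` is the lever this sketch offers for trading local computation against the number of rounds.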

Experimental Results

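Research Questions

RQ1: Can Federated Optimization converge to a high-quality centralized model when the data are massively distributed, non-IID, and unbalanced?
RQ2: Focusing on sparse data, what algorithmic changes are needed to achieve communication efficiency in the federated setting?
RQ3: How does sparsity in on-device learning environments affect the design and performance of distributed optimization?
RQ4: What are the practical privacy and communication implications of training a model from updates alone?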

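Key Findings

The paper defines a new Federated Optimization setting in which data are massively distributed, non-IID, and unbalanced across many devices.
Existing algorithms are often ill-suited to Federated Optimization, which motivates the proposed method.
The proposed algorithm shows encouraging experimental results on sparse convex problems, suggesting that convergence with little communication is feasible.
Combining on-device local computation with small delta updates can substantially reduce the number of communication rounds.
The framework supports on-device learning and centralized model aggregation while preserving data locality and privacy considerations.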
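
As a rough illustration of why sparsity helps keep delta updates small, the sketch below encodes a delta as (index, value) pairs. It assumes an unregularized squared-error step on a sparse example, so only the features present in that example change; with a regularizer the delta would generally be dense. The helper names (`sparse_gradient_step`, `encode_sparse`, `decode_sparse`) are hypothetical, and the encoding is an assumption of this sketch rather than something the review or the paper specifies.

```python
def sparse_gradient_step(w, x_sparse, y, lr=0.1):
    """Delta from one unregularized SGD step on a sparse example.

    x_sparse maps feature index -> value; absent features are zero, so the
    corresponding delta coordinates stay exactly zero.
    """
    err = sum(w[i] * v for i, v in x_sparse.items()) - y
    delta = [0.0] * len(w)
    for i, v in x_sparse.items():
        delta[i] = -lr * err * v
    return delta

def encode_sparse(delta, tol=0.0):
    """Keep only coordinates whose magnitude exceeds tol, as (index, value) pairs."""
    return [(i, v) for i, v in enumerate(delta) if abs(v) > tol]

def decode_sparse(pairs, dim):
    """Server side: rebuild the dense delta from the (index, value) pairs."""
    dense = [0.0] * dim
    for i, v in pairs:
        dense[i] = v
    return dense

if __name__ == "__main__":
    w = [0.0] * 1000
    delta = sparse_gradient_step(w, {3: 1.0, 500: 2.0}, y=1.0)
    payload = encode_sparse(delta)
    print(len(payload), "pairs sent instead of", len(delta), "floats")
```

In this toy case the device touches two of 1,000 coordinates, so the payload is two pairs rather than a full dense vector.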

This review was generated by AI and reviewed by a human editor.