[Paper Review] Think Locally, Act Globally: Federated Learning with Local and Global Representations
LG-FedAvg jointly learns compact local representations on each device and a global model that operates on them, reducing communication while maintaining performance. The paper supports this with a theoretical bias-variance analysis and shows robustness to heterogeneous data.
Federated learning is a method of training models on private data distributed over multiple devices. To keep device data private, the global model is trained by only communicating parameters and updates, which poses scalability challenges for large models. To this end, we propose a new federated learning algorithm that jointly learns compact local representations on each device and a global model across all devices. As a result, the global model can be smaller since it only operates on local representations, reducing the number of communicated parameters. Theoretically, we provide a generalization analysis which shows that a combination of local and global models reduces both variance in the data as well as variance across device distributions. Empirically, we demonstrate that local models enable communication-efficient training while retaining performance. We also evaluate on the task of personalized mood prediction from real-world mobile data where privacy is key. Finally, local models handle heterogeneous data from new devices, and learn fair representations that obfuscate protected attributes such as race, age, and gender.
Research Motivation and Objectives
- Motivate federated learning when data is private and distributed across devices with non-i.i.d. distributions.
- Propose LG-FedAvg to learn compact local representations and a global model operating on those representations.
- Provide a theoretical bias-variance analysis showing benefits of combining local and global components.
- Demonstrate empirically that local representations reduce communication while preserving accuracy across tasks.
- Explore applications including personalized mood prediction and fairness-aware representations.
Proposed Method
- Introduce Local Global Federated Averaging (LG-FedAvg), which jointly trains local encoders on devices and a global model operating on local representations.
- Define a local encoder ℓ_m that maps an input x to a compact representation h, and a global model g that maps h to a prediction y.
- Formulate a joint loss L_m^g that depends on both local and global parameters (θ_m^ℓ, θ_m^g) and enables end-to-end updates.
- Aggregate updated global parameters across devices via a weighted average by data size N_m (FedAvg-style).
- Provide a theoretical bias-variance decomposition for the federated setting and derive the optimal alpha for mixing local and global models.
- Describe inference strategies for local test and new test scenarios using ensembles over local models.
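The aggregation and ensemble-inference steps listed above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `Params` container, the function names `aggregate_global` and `ensemble_predict`, and the uniform averaging of per-device predictions are all assumptions made for the example. Only the global parameters are averaged (weighted by N_m); local encoder parameters would stay on each device.

```python
from typing import Dict, List

# Hypothetical container: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def aggregate_global(device_params: List[Params], num_samples: List[int]) -> Params:
    """FedAvg-style weighted average of global parameters, weights N_m / sum(N)."""
    if not device_params or len(device_params) != len(num_samples):
        raise ValueError("need one sample count per device")
    total = sum(num_samples)
    averaged: Params = {}
    for name in device_params[0]:
        size = len(device_params[0][name])
        averaged[name] = [
            sum(n * p[name][i] for p, n in zip(device_params, num_samples)) / total
            for i in range(size)
        ]
    return averaged


def ensemble_predict(per_device_probs: List[List[float]]) -> List[float]:
    """Uniformly average class probabilities obtained through each local model.

    Uniform weighting is an assumption of this sketch.
    """
    count = len(per_device_probs)
    return [sum(column) / count for column in zip(*per_device_probs)]


# Toy values: two devices holding 100 and 300 samples.
theta_g = aggregate_global(
    [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}],
    [100, 300],
)
# Toy values: class probabilities for one new-device input via two local models.
new_device_probs = ensemble_predict([[0.75, 0.25], [0.25, 0.75]])
print(theta_g, new_device_probs)
```

Because the device with 300 samples carries three times the weight, the averaged parameters sit closer to its values than to those of the smaller device.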
Experimental Results
Research Questions
- RQ1: How does combining local representations with a global model affect generalization under device and data variance?
- RQ2: Can LG-FedAvg achieve communication efficiency while maintaining or improving accuracy on heterogeneous and non-i.i.d. data?
- RQ3: Does the approach enable fair representations by obfuscating protected attributes?
- RQ4: How does LG-FedAvg perform on real-world personalized tasks such as mood prediction from private mobile data?
- RQ5: How does the alpha parameter balance local vs. global contributions in practice across domains?
Key Findings
| Data | Method | Local Test Acc. (↑) | New Test Acc. (↑) | FedAvg Rounds | LG Rounds | Params Communicated (↓) |
|---|---|---|---|---|---|---|
| CIFAR-10 | FedAvg [38] | 58.99±1.50 | 58.99±1.50 | 1800 | 0 | 12.7×10^9 |
| CIFAR-10 | Local only [50] | 87.93±2.14 | 10.03±0.06 | 0 | 0 | 0 |
| CIFAR-10 | MTL [50] | 89.68±0.75 | 10.06±0.11 | 1800 | 0 | 12.0×10^9 |
| CIFAR-10 | LG-FedAvg (ours) | 91.07±0.50 | 57.95±1.48 | 1200 | 100 | 8.5×10^9 |
| CIFAR-10 | LG-FedAvg (ours) | 91.77±0.56 | 60.79±1.45 | 1800 | 100 | 12.7×10^9 |
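To make the trade-off in the table concrete, the short calculation below compares the FedAvg row with the first LG-FedAvg row (1200 FedAvg rounds plus 100 LG rounds). The numbers are copied from the table; the variable names are illustrative only.

```python
# Values copied from the CIFAR-10 rows of the table above.
fedavg_params = 12.7e9     # FedAvg, 1800 FedAvg rounds
lg_fedavg_params = 8.5e9   # LG-FedAvg, 1200 FedAvg rounds + 100 LG rounds

# Relative reduction in communicated parameters.
reduction_pct = 100 * (1 - lg_fedavg_params / fedavg_params)

# Accuracy differences (percentage points), LG-FedAvg minus FedAvg.
local_gain = 91.07 - 58.99
new_gap = 57.95 - 58.99

print(f"parameters communicated: {reduction_pct:.1f}% fewer")
print(f"local test accuracy: {local_gain:+.2f} points")
print(f"new test accuracy: {new_gap:+.2f} points")
```

In this configuration LG-FedAvg communicates about a third fewer parameters, gains roughly 32 points of local test accuracy, and gives up about one point of new test accuracy, which is within the reported standard deviations.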
- LG-FedAvg matches or exceeds the FedAvg and local-only baselines on non-i.i.d. CIFAR-10 and VQA while communicating fewer parameters.
- Interpolating between local and global models with a mixing weight alpha generalizes better than either extreme; the optimal alpha* reduces both variance in the data and variance across device distributions.
- On CIFAR-10 with non-i.i.d. splits, LG-FedAvg reaches 91.07% local-test accuracy (FedAvg: 58.99%) and 57.95% new-test accuracy (FedAvg: 58.99%) while communicating 8.5×10^9 rather than 12.7×10^9 parameters.
- For VQA, LG-FedAvg reaches competitive local-test accuracy with substantially lower parameter communication.
- In mood prediction from private mobile data, intermediate alpha values that mix local and global models outperform either extreme (purely local or purely global), illustrating the benefit of combining personalization with shared learning.
- LG-FedAvg improves robustness to heterogeneity and reduces catastrophic forgetting when new devices appear.
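The alpha-interpolation finding can be illustrated with a toy example. The review states that the paper derives the optimal alpha analytically; the sketch below instead picks alpha by a simple grid search on held-out data, which is a substitute technique chosen for illustration. The function names and all numeric values are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def mix(local_pred: Sequence[float], global_pred: Sequence[float], alpha: float) -> List[float]:
    """Convex combination: alpha * local + (1 - alpha) * global."""
    return [alpha * l + (1 - alpha) * g for l, g in zip(local_pred, global_pred)]


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def best_alpha(
    local_pred: Sequence[float],
    global_pred: Sequence[float],
    target: Sequence[float],
    steps: int = 10,
) -> Tuple[float, float]:
    """Grid-search alpha in [0, 1]; return (alpha, validation error)."""
    candidates = [i / steps for i in range(steps + 1)]
    scored = [(mse(mix(local_pred, global_pred, a), target), a) for a in candidates]
    err, alpha = min(scored)
    return alpha, err


# Toy validation data (illustrative only): the local model overshoots
# and the global model undershoots by the same amount.
target = [1.0, 2.0, 3.0]
local_pred = [1.5, 2.5, 3.5]
global_pred = [0.5, 1.5, 2.5]

alpha, err = best_alpha(local_pred, global_pred, target)
local_only_err = mse(mix(local_pred, global_pred, 1.0), target)
global_only_err = mse(mix(local_pred, global_pred, 0.0), target)
print(alpha, err, local_only_err, global_only_err)
```

In this constructed case the two models err in opposite directions, so an intermediate alpha beats both extremes. Real data would rarely be this symmetric, and the best alpha would vary by device and task.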
This review was generated by AI and checked by a human editor.