QUICK REVIEW

[Paper Review] Green Federated Learning

Ashkan Yousefpour, Shen Guo|arXiv (Cornell University)|Mar 26, 2023

Privacy-Preserving Technologies in Data | Cited by 15

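One-Sentence Summary

The paper presents a production-scale, data-driven analysis of the carbon footprint of federated learning (FL) and proposes Green FL, a framework for minimizing emissions while maintaining competitive performance, based on measurements from the real-world Papaya FL system.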

ABSTRACT

The rapid progress of AI is fueled by increasingly large and computationally intensive machine learning models and datasets. As a consequence, the amount of compute used in training state-of-the-art models is exponentially increasing (doubling every 10 months between 2015 and 2022), resulting in a large carbon footprint. Federated Learning (FL) - a collaborative machine learning technique for training a centralized model using data of decentralized entities - can also be resource-intensive and have a significant carbon footprint, particularly when deployed at scale. Unlike centralized AI that can reliably tap into renewables at strategically placed data centers, cross-device FL may leverage as many as hundreds of millions of globally distributed end-user devices with diverse energy sources. Green AI is a novel and important research area where carbon footprint is regarded as an evaluation criterion for AI, alongside accuracy, convergence speed, and other metrics. In this paper, we propose the concept of Green FL, which involves optimizing FL parameters and making design choices to minimize carbon emissions consistent with competitive performance and training time. The contributions of this work are two-fold. First, we adopt a data-driven approach to quantify the carbon emissions of FL by directly measuring real-world at-scale FL tasks running on millions of phones. Second, we present challenges, guidelines, and lessons learned from studying the trade-off between energy efficiency, performance, and time-to-train in a production FL system. Our findings offer valuable insights into how FL can reduce its carbon footprint, and they provide a foundation for future research in the area of Green AI.

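Research Motivation and Goals

Quantify the carbon emissions of real-world federated learning tasks across clients, servers, and the network.
Identify the main contributors to the FL carbon footprint in a production system.
Understand the trade-offs among energy efficiency, training time, and model quality in FL.
Provide actionable guidelines and a predictive model for reducing FL emissions without sacrificing performance.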

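Proposed Method

Instrument and profile all major components of the production FL stack (Papaya) to measure emissions from clients, servers, and the network.
Evaluate FL in both synchronous (FedAvg) and asynchronous (FedBuff) settings in production, using a character-level language modeling task.
Measure energy consumption from client power profiles (covering CPU and Wi-Fi) and server power under an assumed utilization (1%), weighted by PUE (power usage effectiveness) and carbon intensity.
Analyze how concurrency, number of rounds, and aggregation contribute to carbon footprint and model performance.
Explore hyperparameter tuning (learning rate, batch size, local epochs, concurrency, aggregation goal) to reach a target perplexity while minimizing emissions.
Develop a predictive model that estimates the carbon footprint of an FL task before deployment.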
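
The energy-to-emissions accounting in the measurement step can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `ClientSession`, `client_energy_kwh`, `server_energy_kwh`, and `emissions_kg` are hypothetical, and every wattage, duration, PUE, and carbon-intensity value passed to them is a placeholder chosen by the caller rather than a figure from the paper. The sketch assumes energy is power multiplied by time, that PUE scales only data-center energy, and that emissions are energy multiplied by the carbon intensity of the local grid.

```python
from dataclasses import dataclass

JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


@dataclass
class ClientSession:
    """Time one device spends on a single FL participation (hypothetical schema)."""
    compute_seconds: float
    upload_seconds: float
    download_seconds: float


def client_energy_kwh(session: ClientSession, cpu_watts: float, wifi_watts: float) -> float:
    """Energy for one client session: CPU power during training plus Wi-Fi power during transfer."""
    transfer_seconds = session.upload_seconds + session.download_seconds
    joules = cpu_watts * session.compute_seconds + wifi_watts * transfer_seconds
    return joules / JOULES_PER_KWH


def server_energy_kwh(server_watts: float, hours: float, pue: float) -> float:
    """Data-center energy: server draw over time, scaled by PUE for facility overhead."""
    return server_watts * hours * pue / 1000.0


def emissions_kg(energy_kwh: float, carbon_intensity_kg_per_kwh: float) -> float:
    """Convert energy to kg CO2e using the carbon intensity of the electricity source."""
    return energy_kwh * carbon_intensity_kg_per_kwh
```

Because client devices are spread across regions with different energy sources, a fuller accounting would apply a separate carbon intensity per client region before summing; the sketch leaves that aggregation to the caller.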

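Experimental Results

Research Questions

RQ1: How is the carbon footprint distributed across clients, servers, and network infrastructure in a production FL system?
RQ2: For a given target perplexity and training time, how do synchronous and asynchronous FL differ in emissions?
RQ3: Which hyperparameters most affect the carbon emissions and training efficiency of FL in production?
RQ4: Can the carbon footprint of an FL task be predicted before deployment to guide Green FL decisions?
RQ5: Which practical guidelines reduce FL emissions without sacrificing model quality or convergence speed?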

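Key Findings

Client compute and client-server communication dominate FL emissions, together accounting for roughly 97% of the total across the stack.
Server-side compute accounts for only a small share of emissions (about 1–2%), while network traffic is substantial: roughly 27–29% for upload and 22–24% for download.
Asynchronous FL converges faster but emits more carbon than synchronous FL when trained to similar accuracy.
Configurations that reach similar accuracy can differ in carbon impact by as much as 200×, underscoring the value of hyperparameter optimization.
Reducing training time through optimizer choice and lower concurrency lowers the carbon footprint while preserving model quality.
A large-scale language modeling FL task can emit roughly 5–20 kg CO2e within a few days, and the total can multiply during model exploration and retraining.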
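
Since configurations that reach similar quality can differ widely in emissions, one simple way to act on these findings is to filter completed runs by target perplexity and training-time budget, then pick the one with the lowest emissions. The sketch below is an assumed selection procedure, not a method taken from the paper; `RunResult`, `greenest_run`, and `emission_spread` are hypothetical names, and any run data supplied to them would come from the user's own measurements.

```python
from typing import NamedTuple, Optional, Sequence


class RunResult(NamedTuple):
    """Outcome of one FL training configuration (hypothetical schema)."""
    name: str
    perplexity: float  # lower is better
    hours: float       # wall-clock time to train
    kg_co2e: float     # measured or estimated emissions


def greenest_run(runs: Sequence[RunResult],
                 target_perplexity: float,
                 max_hours: float) -> Optional[RunResult]:
    """Return the lowest-emission run that meets both the quality and time constraints."""
    feasible = [r for r in runs
                if r.perplexity <= target_perplexity and r.hours <= max_hours]
    return min(feasible, key=lambda r: r.kg_co2e, default=None)


def emission_spread(runs: Sequence[RunResult],
                    target_perplexity: float) -> Optional[float]:
    """Ratio of highest to lowest emissions among runs that reach the target perplexity."""
    reached = [r.kg_co2e for r in runs if r.perplexity <= target_perplexity]
    if not reached:
        return None
    return max(reached) / min(reached)
```

`greenest_run` returns `None` when no configuration satisfies both constraints, which signals that the perplexity target or the time budget has to be relaxed before an emissions comparison is meaningful.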

This review was generated by AI and reviewed by a human editor.