QUICK REVIEW

[Paper Digest] Model Pruning Enables Efficient Federated Learning on Edge Devices

Yuang Jiang, Shiqiang Wang|arXiv (Cornell University)|Sep 26, 2019

Privacy-Preserving Technologies in Data|Cited by 55

ONE-SENTENCE SUMMARY

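PruneFL integrates adaptive, distributed model pruning into federated learning to reduce computation and communication on edge devices while keeping accuracy close to that of the original model.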

ABSTRACT

Federated learning (FL) allows model training from local data collected by edge/mobile devices while preserving data privacy, which has wide applicability to image and vision applications. A challenge is that client devices in FL usually have much more limited computation and communication resources compared to servers in a datacenter. To overcome this challenge, we propose PruneFL -- a novel FL approach with adaptive and distributed parameter pruning, which adapts the model size during FL to reduce both communication and computation overhead and minimize the overall training time, while maintaining a similar accuracy as the original model. PruneFL includes initial pruning at a selected client and further pruning as part of the FL process. The model size is adapted during this process, which includes maximizing the approximate empirical risk reduction divided by the time of one FL round. Our experiments with various datasets on edge devices (e.g., Raspberry Pi) show that: (i) we significantly reduce the training time compared to conventional FL and various other pruning-based methods; (ii) the pruned model with automatically determined size converges to an accuracy that is very similar to the original model, and it is also a lottery ticket of the original model.

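MOTIVATION AND OBJECTIVES

Advance federated learning on resource-constrained edge devices, where data privacy and communication cost are limiting factors.
Develop a pruning-based FL method that jointly reduces computation and communication overhead.
Realize adaptive, distributed pruning that tracks and adjusts the model size during the FL process to optimize training time and accuracy.
Demonstrate practicality by implementing PruneFL with sparse matrices on real edge devices and evaluating it on multiple datasets.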

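PROPOSED METHOD

Proposes PruneFL, which uses two-stage distributed pruning: initial pruning at a selected client, followed by further pruning during the FL process.
Uses adaptive pruning that updates the set of retained parameters by maximizing the approximate empirical risk reduction per unit of training time.
Formulates pruning as selecting a subset of parameters that maximizes the ratio (Γ) of risk reduction to round time.
Computes parameter importance from the squared gradient g_j^2 and estimates a per-parameter training time t_j to guide pruning decisions.
Implements a sparse matrix representation of the pruned model to reduce storage, memory, and communication overhead.
Provides a convergence analysis under standard FL assumptions and shows a linear speedup in the number of clients.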
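
The selection rule described above can be illustrated with a small sketch. It assumes the objective takes the form Γ(A) = Σ_{j∈A} g_j^2 / (c + Σ_{j∈A} t_j), where c is a fixed per-round time that does not depend on the model size. The constant c, the function name select_parameters, and the prefix-scan strategy are assumptions made for illustration; this digest does not spell out the paper's exact algorithm. The sketch sorts parameters by g_j^2 / t_j and keeps the prefix with the highest ratio.

```python
from typing import List, Sequence, Tuple


def select_parameters(
    sq_grads: Sequence[float],
    times: Sequence[float],
    fixed_time: float,
) -> Tuple[List[bool], float]:
    """Pick the parameters to keep so that importance per unit time is maximal.

    sq_grads[j] is the squared gradient g_j^2 of parameter j (its importance).
    times[j] is the estimated per-round time t_j contributed by parameter j.
    fixed_time is an assumed constant per-round cost c (hypothetical name).
    Returns a keep-mask and the achieved ratio.
    """
    if len(sq_grads) != len(times):
        raise ValueError("sq_grads and times must have the same length")
    if fixed_time <= 0 or any(t <= 0 for t in times):
        raise ValueError("all time values must be positive")

    # Visit parameters from the highest importance per unit time to the lowest.
    order = sorted(
        range(len(sq_grads)),
        key=lambda j: sq_grads[j] / times[j],
        reverse=True,
    )

    best_ratio, best_k = 0.0, 0
    gain, cost = 0.0, fixed_time
    for k, j in enumerate(order, start=1):
        gain += sq_grads[j]
        cost += times[j]
        ratio = gain / cost
        if ratio > best_ratio:  # strict: ties favor the smaller model
            best_ratio, best_k = ratio, k

    keep = [False] * len(sq_grads)
    for j in order[:best_k]:
        keep[j] = True
    return keep, best_ratio
```

In this sketch, a larger fixed cost c makes it worthwhile to keep more parameters, because each additional parameter then adds relatively little to the total round time. Parameters outside the returned mask are the ones that would be pruned, and a previously pruned parameter can return in a later round if its squared gradient grows.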

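EXPERIMENTAL RESULTS

Research Questions

RQ1: Can adaptive, distributed pruning reduce both communication and computation cost in FL without sacrificing accuracy?
RQ2: How should the model size be adjusted during FL to best balance training speed and final performance?
RQ3: How does initial pruning affect subsequent federated training with heterogeneous data and devices?
RQ4: Does pruning yield a lottery-ticket subnetwork that maintains near-original performance on edge devices?
RQ5: What are the practical implementation considerations for deploying PruneFL on real edge hardware?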

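Key Findings

PruneFL significantly reduces training time compared with conventional FL and other pruning-based methods.
The pruned model, whose size is determined automatically, converges to an accuracy very close to that of the original model and is a lottery ticket of the original model.
Two-stage distributed pruning, combined with adaptive pruning, handles data and device heterogeneity more effectively than pruning at a single client.
Adaptive pruning uses a gradient-based importance measure and time estimates to decide which parameters to keep, prune, or add back.
The implementation on edge devices shows that sparse matrix computation is feasible for the convolutional and fully connected layers of the pruned model.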
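
To make the sparse-matrix point concrete, here is a minimal sketch of how a pruned fully connected layer could be stored and applied. It uses the compressed sparse row (CSR) layout, which is an assumption chosen for illustration; the digest says only that a sparse matrix representation is used and does not name the format. The helper names dense_to_csr and csr_matvec are hypothetical, and convolutional layers are not covered.

```python
from typing import List, Sequence, Tuple

# (values, column indices, row pointers) of a CSR matrix
CSR = Tuple[List[float], List[int], List[int]]


def dense_to_csr(weights: Sequence[Sequence[float]]) -> CSR:
    """Store only the non-zero (unpruned) weights of a dense weight matrix."""
    values: List[float] = []
    col_idx: List[int] = []
    row_ptr: List[int] = [0]
    for row in weights:
        for j, w in enumerate(row):
            if w != 0.0:
                values.append(w)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends inside `values`.
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(csr: CSR, x: Sequence[float]) -> List[float]:
    """Compute y = W x while touching only the stored non-zero weights."""
    values, col_idx, row_ptr = csr
    out: List[float] = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        out.append(acc)
    return out
```

With this layout, both the stored size and the number of multiplications scale with the number of retained weights rather than with the full layer shape. That is the kind of saving in storage, memory, and communication that the digest attributes to the sparse implementation.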

This digest was generated by AI and reviewed by a human editor.