QUICK REVIEW

[Paper Review] SplitFed: When Federated Learning Meets Split Learning

Chandra Thapa, M.A.P. Chamikara|arXiv (Cornell University)|Apr 25, 2020

Privacy-Preserving Technologies in Data|Cited by 41

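One-Sentence Summary

SplitFed combines federated learning and split learning: it reaches accuracy comparable to split learning while training clients in parallel, and it adds differential privacy and PixelDP to strengthen privacy protection.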

ABSTRACT

Federated learning (FL) and split learning (SL) are two popular distributed machine learning approaches. Both follow a model-to-data scenario; clients train and test machine learning models without sharing raw data. SL provides better model privacy than FL due to the machine learning model architecture split between clients and the server. Moreover, the split model makes SL a better option for resource-constrained environments. However, SL performs slower than FL due to the relay-based training across multiple clients. In this regard, this paper presents a novel approach, named splitfed learning (SFL), that amalgamates the two approaches eliminating their inherent drawbacks, along with a refined architectural configuration incorporating differential privacy and PixelDP to enhance data privacy and model robustness. Our analysis and empirical results demonstrate that (pure) SFL provides similar test accuracy and communication efficiency as SL while significantly decreasing its computation time per global epoch than in SL for multiple clients. Furthermore, as in SL, its communication efficiency over FL improves with the number of clients. Besides, the performance of SFL with privacy and robustness measures is further evaluated under extended experimental settings.

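Research Motivation and Objectives

Motivate research on distributed learning and data privacy for settings where raw data cannot be shared.
Explore combining FL and SL to gain parallelism and reduce client-side computation.
Improve privacy and model robustness through differential privacy and PixelDP.
Evaluate the trade-offs among FL, SL, and SFL in accuracy, communication, and computation.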

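Proposed Method

Proposes SplitFed Learning (SFL), which splits the model between the clients and the server (as in SL) while the clients compute in parallel (as in FL).
Uses an aggregation server that applies FedAvg to the client-side updates and synchronizes the global client-side model.
Has the server-side model process the smashed data (the cut-layer activations) sent by the clients, with gradients exchanged during backpropagation.
Applies differential privacy to client-side training and integrates PixelDP noise into the smashed data to protect privacy and improve robustness.
Describes two variants: SFLV1 (server-side aggregation, with the server-side models executed in parallel) and SFLV2 (sequential server-side processing without server-side FedAvg).
Provides a total-cost comparison with FL and SL, including an analysis of communication and training time.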
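
The training flow described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a toy model with one scalar weight on the client side (`a`) and one on the server side (`b`), a squared-error loss, and a single local gradient step per round. The names `fedavg`, `sflv1_round`, and `mse` are hypothetical. Averaging both the client-side and the server-side weights follows the SFLV1 description; in SFLV2 the server-side weight would instead be updated sequentially, without averaging.

```python
def fedavg(values, sizes):
    """Average of per-client values, weighted by local sample counts."""
    total = sum(sizes)
    return sum(v * n for v, n in zip(values, sizes)) / total


def sflv1_round(a, b, client_data, lr=0.05):
    """One simplified SFLV1-style round on a toy split model pred = b * (a * x).

    a: client-side weight (assumed identical on all clients at round start)
    b: server-side weight
    client_data: list of (xs, ys) pairs, one pair of lists per client
    """
    new_a, new_b, sizes = [], [], []
    for xs, ys in client_data:  # conceptually runs in parallel across clients
        n = len(xs)
        smashed = [a * x for x in xs]                     # client forward pass
        errs = [b * h - y for h, y in zip(smashed, ys)]   # server forward pass, residuals
        # Gradients of the loss 0.5 * mean(err^2)
        grad_b = sum(e * h for e, h in zip(errs, smashed)) / n
        grad_smashed = [e * b / n for e in errs]          # sent back to the client
        grad_a = sum(g * x for g, x in zip(grad_smashed, xs))
        new_a.append(a - lr * grad_a)                     # client-side update
        new_b.append(b - lr * grad_b)                     # server-side update (per-client copy)
        sizes.append(n)
    # FedAvg over client-side weights and over the server-side copies
    return fedavg(new_a, sizes), fedavg(new_b, sizes)


def mse(a, b, client_data):
    """Mean squared error of the full model over all clients' samples."""
    pairs = [(x, y) for xs, ys in client_data for x, y in zip(xs, ys)]
    return sum((a * b * x - y) ** 2 for x, y in pairs) / len(pairs)
```

With data generated by y = 2x and both weights started at 1.0, repeated calls to `sflv1_round` drive the product `a * b` toward 2. Only the smashed values and their gradients cross the client/server boundary in this sketch; the raw inputs `xs` never leave the client loop body.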

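Experimental Results

Research Questions

RQ1: Does SFL reach model accuracy close to that of SL across multiple datasets and architectures?
RQ2: While maintaining accuracy and communication efficiency, is SFL more efficient than SL in training time per global epoch?
RQ3: How do the privacy mechanisms (DP and PixelDP) affect the accuracy and robustness of SFL?
RQ4: Compared with FL and SL, how does SFL scale as the number of clients grows?
RQ5: Compared with FL and SL, what are the communication and computation trade-offs of the SFL variants?

Key Findings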

Method	Dataset	Architecture	Normal	FL	SL	SFLV1	SFLV2
Normal	HAM10000	ResNet18	79.3%	77.5%	79.1%	79%	79.2%
Normal	HAM10000	AlexNet	80.1%	75%	73.8%	70.5%	74.9%
Normal	FMNIST	LeNet	92.7%	91.9%	90.4%	89.6%	90.4%
Normal	FMNIST	AlexNet	90.5%	89.7%	84.7%	86%	81%
Normal	CIFAR10	LeNet	72.1%	69.4%	62.7%	62.6%	63.8%
Normal	MNIST	AlexNet	98.8%	98.7%	95.1%	96.9%	92%
Normal	MNIST	ResNet18	99.3%	99.2%	99.2%	99%	99.2%

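SFL reaches test accuracy close to that of SL and, because clients are processed in parallel, can be faster than SL per global epoch.
As the number of clients grows, SFL is more communication-efficient than FL and comparable to SL.
Both privacy mechanisms (client-side DP and PixelDP) can be integrated into SFL to improve privacy and robustness, with a noticeable effect on convergence and accuracy.
Across the datasets and architectures tested, FL generally performs strongly, while SFL matches SL in accuracy and additionally benefits from parallelism.
In some settings SFLV2 approaches or matches the accuracy of centralized training; SFLV1 achieves comparable performance while providing privacy enhancements.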
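
The two privacy mechanisms named in these findings both come down to adding calibrated noise. The sketch below is a generic illustration, not the paper's configuration: `clip_and_noise` shows a clip-then-Gaussian-noise step in the style of DP-SGD applied to a single gradient vector (full DP-SGD clips per-example gradients before summing), and `laplace_noise` perturbs smashed activations with Laplace noise, which is an assumption about how a PixelDP-style noise layer might look. All function names, parameters, and noise scales are hypothetical; this review does not give the paper's privacy budgets.

```python
import math
import random


def clip_and_noise(grad, clip_norm, noise_multiplier, rng):
    """Clip a gradient vector to L2 norm <= clip_norm, then add Gaussian noise.

    The noise standard deviation is noise_multiplier * clip_norm.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [g * scale + rng.gauss(0.0, sigma) for g in grad]


def laplace_noise(smashed, scale, rng):
    """Add zero-mean Laplace noise with the given scale to each activation."""
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    return [
        h + scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
        for h in smashed
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
noisy_grad = clip_and_noise([3.0, 4.0], clip_norm=1.0, noise_multiplier=1.1, rng=rng)
noisy_smashed = laplace_noise([0.2, -0.5, 1.3], scale=0.1, rng=rng)
```

Larger noise scales give stronger privacy at the cost of accuracy, which matches the finding above that these mechanisms have a visible effect on convergence and accuracy.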

This review was generated by AI and reviewed by a human editor.