QUICK REVIEW

[Paper Review] Performance Analysis and Optimization in Privacy-Preserving Federated Learning

Kang Wei, Jun Li|arXiv (Cornell University)|Feb 29, 2020

Privacy-Preserving Technologies in Data|References: 18|Cited by: 11

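One-Sentence Summary

This paper proposes a client-level differential privacy (CDP) framework for federated learning that strengthens privacy by adding controlled noise to model updates while preserving training efficiency; by deriving a theoretical convergence upper bound and introducing a communication rounds discounting (CRD) method, it achieves a better trade-off among privacy, model accuracy, and communication cost, and improves federated learning performance under a fixed privacy budget.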

ABSTRACT

As a means of decentralized machine learning, federated learning (FL) has recently drawn considerable attentions. One of the prominent advantages of FL is its capability of preventing clients' data from being directly exposed to external adversaries. Nevertheless, via a viewpoint of information theory, it is still possible for an attacker to steal private information from eavesdropping upon the shared models uploaded by FL clients. In order to address this problem, we develop a novel privacy preserving FL framework based on the concept of differential privacy (DP). To be specific, we first borrow the concept of local DP and introduce a client-level DP (CDP) by adding artificial noises to the shared models before uploading them to servers. Then, we prove that our proposed CDP algorithm can satisfy the DP guarantee with adjustable privacy protection levels by varying the variances of the artificial noises. More importantly, we derive a theoretical convergence upper-bound of the CDP algorithm. Our derived upper-bound reveals that there exists an optimal number of communication rounds to achieve the best convergence performance in terms of loss function values for a given privacy protection level. Furthermore, to obtain this optimal number of communication rounds, which cannot be derived in a closed-form expression, we propose a communication rounds discounting (CRD) method. Compared with the heuristic searching method, our proposed CRD can achieve a much better trade-off between the computational complexity of searching for the optimal number and the convergence performance. Extensive experiments indicate that our CDP algorithm with an optimization on the number of communication rounds using the proposed CRD can effectively improve both the FL training efficiency and FL model quality for a given privacy protection level.

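Motivation and Objectives

Address the risk that shared model updates in federated learning can leak private client data, for example through model inversion attacks.
Develop a client-level differential privacy (CDP) mechanism that adds artificial noise to model updates in order to provide a formal privacy guarantee.
Analyze theoretically how the CDP algorithm converges under different noise variances and privacy budgets.
Identify, for a given privacy level, the number of communication rounds that gives the best convergence performance.
Propose a communication rounds discounting (CRD) method that finds this number of rounds efficiently, without relying on heuristic search.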
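
For readers less familiar with the term, the formal guarantee mentioned in the objectives is usually stated as (ε, δ)-differential privacy. The block below gives the standard textbook definition, not a formula taken from the paper; the symbols \mathcal{M} (the randomized mechanism), D and D' (adjacent datasets), and S (a set of outputs) are notation introduced here for illustration.

```latex
% Standard (\epsilon, \delta)-differential privacy:
% for all adjacent datasets D, D' and all measurable output sets S,
\Pr\left[\mathcal{M}(D) \in S\right]
  \le e^{\epsilon} \, \Pr\left[\mathcal{M}(D') \in S\right] + \delta
```

A smaller ε means the output distributions on D and D' are harder to tell apart, and δ bounds the probability with which that guarantee may fail.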

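Proposed Method

Introduce client-level differential privacy (CDP) by injecting Laplace or Gaussian noise into model updates before they are uploaded to the server.
Prove that the CDP mechanism satisfies (ε, δ)-differential privacy, with privacy parameters that can be tuned through the noise variance.
Derive a theoretical convergence upper bound for the CDP algorithm, expressed in terms of the loss function, that exposes the trade-off between privacy and convergence speed.
Use the convergence bound to characterize how the optimal number of communication rounds depends on the privacy budget and model complexity.
Propose the communication rounds discounting (CRD) method to approximate the optimal number of rounds efficiently and avoid exhaustive search.
Integrate CRD into the training pipeline to adjust the communication frequency dynamically, improving training efficiency.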
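
The noise-injection step in the list above can be sketched roughly as follows. This is a minimal illustration rather than the paper's algorithm: it assumes Gaussian noise, norm clipping to bound each update, a sensitivity equal to the clipping norm, and the classic Gaussian-mechanism calibration (valid for 0 < ε < 1), whose constants may differ from those derived in the paper. All function names and parameter values are hypothetical.

```python
import numpy as np


def gaussian_sigma(epsilon, delta, sensitivity):
    """Noise std from the classic Gaussian-mechanism calibration (0 < epsilon < 1)."""
    return np.sqrt(2.0 * np.log(1.25 / delta)) * sensitivity / epsilon


def clip_by_norm(update, clip_norm):
    """Scale the update so that its L2 norm is at most clip_norm."""
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_norm / max(norm, 1e-12))


def privatize_update(update, clip_norm, sigma, rng):
    """Client side: clip the update, then add i.i.d. Gaussian noise before upload."""
    clipped = clip_by_norm(update, clip_norm)
    return clipped + rng.normal(0.0, sigma, size=clipped.shape)


def aggregate(noisy_updates):
    """Server side: plain average of the noisy client updates."""
    return np.mean(np.stack(noisy_updates), axis=0)


rng = np.random.default_rng(0)
clip_norm = 1.0
# Hypothetical per-round privacy parameters, chosen only for illustration.
# Assumption: sensitivity is taken to be the clipping norm.
sigma = gaussian_sigma(epsilon=0.5, delta=1e-5, sensitivity=clip_norm)

client_updates = [rng.normal(size=10) for _ in range(20)]
noisy = [privatize_update(u, clip_norm, sigma, rng) for u in client_updates]
global_update = aggregate(noisy)
```

Lowering ε raises sigma, which is the adjustable privacy level the method relies on. Averaging over more clients reduces the noise that survives in the aggregated update.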

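Experimental Results

Research Questions

RQ1: Can client-level differential privacy be applied effectively to federated learning to prevent model inversion attacks while maintaining model utility?
RQ2: How does adding noise to client model updates affect convergence speed and final model performance in federated learning?
RQ3: Is there an optimal number of communication rounds that maximizes convergence performance under a given privacy budget?
RQ4: Can a closed-form solution be derived for the optimal number of communication rounds, or is an approximation method required?
RQ5: Does the proposed CRD method strike a better balance between computational cost and convergence performance than heuristic search?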

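Key Findings

The proposed CDP framework satisfies (ε, δ)-differential privacy, and the level of protection can be adjusted through the variance of the injected noise.
The theoretical convergence upper bound shows that, at a fixed privacy level, there is an optimal number of communication rounds that minimizes the loss function value.
The CRD method achieves a much better trade-off between computational cost and convergence performance than the heuristic search method.
Extensive experiments show that, under the same privacy budget, the CDP algorithm with CRD-optimized rounds improves both training efficiency and final model accuracy.
The optimal number of communication rounds has no closed-form expression, so an approximation method such as CRD is required.
The method balances privacy, model quality, and communication efficiency, which suggests it is practical for real federated learning systems.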
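
To build intuition for why an interior optimum in the number of rounds can exist, the toy sketch below uses a hypothetical surrogate loss, not the paper's convergence bound. It assumes that the optimization error shrinks geometrically with the number of rounds T, while the noise penalty grows linearly in T because a fixed total privacy budget is assumed to be split across rounds. The discounting-style search is only loosely inspired by the name CRD and should not be read as the paper's procedure; all names and constants are made up for illustration.

```python
def surrogate_loss(rounds, gap0=10.0, rho=0.9, noise_coeff=0.05):
    """Hypothetical loss after `rounds` rounds: decaying error plus noise penalty."""
    return gap0 * rho ** rounds + noise_coeff * rounds


def exhaustive_search(max_rounds):
    """Evaluate every candidate from 1 to max_rounds and return the best one."""
    return min(range(1, max_rounds + 1), key=surrogate_loss)


def discounted_search(start_rounds, discount=0.8):
    """Shrink the round count by `discount` while the surrogate loss keeps improving."""
    rounds = start_rounds
    while True:
        candidate = max(1, int(rounds * discount))
        # Stop when the candidate no longer shrinks or no longer improves the loss.
        if candidate >= rounds or surrogate_loss(candidate) >= surrogate_loss(rounds):
            return rounds
        rounds = candidate


t_best = exhaustive_search(100)
t_disc = discounted_search(100)
print(t_best, surrogate_loss(t_best))
print(t_disc, surrogate_loss(t_disc))
```

In this toy setting the discounted search evaluates only a handful of candidates and lands near, but not necessarily on, the exhaustive optimum, which mirrors the cost-versus-accuracy trade-off described in the findings.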


This review was generated by AI and checked by human editors.