QUICK REVIEW

[Paper Review] Scalable and Differentially Private Distributed Aggregation in the Shuffled Model

Badih Ghazi, Rasmus Pagh|arXiv (Cornell University)|Jun 19, 2019

Privacy-Preserving Technologies in Data|References: 15|Cited by: 68

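One-Sentence Summary

This paper proposes a scalable, private distributed aggregation protocol for the shuffled model in which communication and error grow only polylogarithmically in n, and which achieves privacy through an "invisibility cloak" of zero-sum noise.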

ABSTRACT

Federated learning promises to make machine learning feasible on distributed, private datasets by implementing gradient descent using secure aggregation methods. The idea is to compute a global weight update without revealing the contributions of individual users. Current practical protocols for secure aggregation work in an "honest but curious" setting where a curious adversary observing all communication to and from the server cannot learn any private information assuming the server is honest and follows the protocol. A more scalable and robust primitive for privacy-preserving protocols is shuffling of user data, so as to hide the origin of each data item. Highly scalable and secure protocols for shuffling, so-called mixnets, have been proposed as a primitive for privacy-preserving analytics in the Encode-Shuffle-Analyze framework by Bittau et al., which was later analytically studied by Erlingsson et al. and Cheu et al. The recent papers by Cheu et al. and Balle et al. have given protocols for secure aggregation that achieve differential privacy guarantees in this "shuffled model". Their protocols come at a cost, though: either the expected aggregation error or the amount of communication per user scales as a polynomial $n^{Ω(1)}$ in the number of users $n$. In this paper we propose a simple and more efficient protocol for aggregation in the shuffled model, where communication as well as error increases only polylogarithmically in $n$. Our new technique is a conceptual "invisibility cloak" that makes users' data almost indistinguishable from random noise while introducing zero distortion on the sum.

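Motivation and Goals

Motivated by the need to sum distributed data privately, without revealing individual inputs.
Improve the scalability of shuffled-model aggregation beyond the polynomial dependence on n of earlier protocols.
Develop a protocol with low per-user communication and low aggregation error.
Provide insight into robustness against untrusted or colluding users in the shuffled model.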

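Proposed Method

Propose an invisibility cloak encoder that turns each x_i into a set of m random values whose sum, after scaling and modular reduction, equals x_i.
Use a shuffler to randomly permute all encoded messages, so that a change in the input data stays hidden in the sense of differential privacy.
The analyzer estimates the true sum by adding up the shuffled messages and applying a modular reduction.
Introduce a zero-sum noise technique that hides individual inputs while leaving the final sum intact, so privacy is protected without distorting the aggregate.
Provide two notions of privacy: with respect to single-user changes, and with respect to sum-preserving changes (changes that preserve the sum after discretization).
Present a discrete privacy analysis framework for the shuffled model and derive parameter settings that achieve (ε, δ)-DP.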
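The encode, shuffle, and analyze steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, m is read as the number of messages per user, k as the discretization scale, and N as the modulus, and the choice N = n·k + 1 is only a convenient way to avoid wrap-around. The extra noise needed for privacy under single-user changes is omitted.

```python
import secrets


def encode(x, m, k, N):
    """Split x in [0, 1] into m values in Z_N that sum to floor(x * k) mod N."""
    x_bar = int(x * k)  # discretize the input to an integer in [0, k]
    shares = [secrets.randbelow(N) for _ in range(m - 1)]  # m - 1 uniform values
    shares.append((x_bar - sum(shares)) % N)  # last value fixes the sum
    return shares


def shuffle(messages):
    """Return the messages in uniformly random order (stand-in for a mixnet)."""
    out = list(messages)
    secrets.SystemRandom().shuffle(out)
    return out


def analyze(shuffled, k, N):
    """Add everything modulo N and undo the scaling."""
    return (sum(shuffled) % N) / k


def run_protocol(xs, m=8, k=1000, N=None):
    n = len(xs)
    if N is None:
        N = n * k + 1  # assumption: any modulus larger than n * k avoids wrap-around
    messages = [s for x in xs for s in encode(x, m, k, N)]
    return analyze(shuffle(messages), k, N)


if __name__ == "__main__":
    print(run_protocol([0.25, 0.5, 0.75]))
```

In this sketch each individual message is uniform on its own, while the total is unchanged by the shuffle; the only error left is the rounding in `encode`, which is below n/k in total.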

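Experimental Results

Research Questions

RQ1: Can aggregation in the shuffled model achieve differential privacy with communication and error polylogarithmic in n, breaking the n^{Ω(1)} barrier of prior work?
RQ2: How can zero-sum noise hide individual inputs in the shuffled privacy regime without distorting the global sum?
RQ3: What privacy guarantees hold under sum-preserving changes and under single-user changes, respectively, and how do the encoder parameters affect them?
RQ4: Within the shuffled model, how robust are the privacy guarantees to colluding or untrusted users?
RQ5: Which parameter ranges (m, N, k, ε, δ) yield achievable bounds on privacy and accuracy?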

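Main Findings

There exists a shuffled-model protocol with expected error O(1/ε · sqrt(log(1/δ))), in which each user sends O(log(n/(εδ))) messages, each of length O(log(n/δ)).
Under sum-preserving changes, the worst-case error is 2^{-m}, with each user sending m messages, each of length O(m).
The invisibility cloak technique makes each user's data look almost random while preserving the overall sum, thereby achieving DP without distorting the final aggregate.
The method scales near-linearly in the number of users overall, with per-user communication and error avoiding the n^{Ω(1)} factors of prior work.
This work establishes DP guarantees under sum-preserving changes and discusses robustness to colluding or untrusted users.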
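The claim that the cloak adds no distortion to the sum can be checked with a short calculation. The notation here is an assumption made for illustration: $\bar{x}_i = \lfloor k x_i \rfloor$ is user $i$'s discretized input with $x_i \in [0, 1]$, $y_{i,1}, \dots, y_{i,m}$ are that user's messages in $\mathbb{Z}_N$ chosen so that they sum to $\bar{x}_i$ modulo $N$, and $N > nk$.

```latex
\sum_{i=1}^{n} \sum_{j=1}^{m} y_{i,j}
  \equiv \sum_{i=1}^{n} \bar{x}_i \pmod{N},
\qquad
0 \le \sum_{i=1}^{n} x_i - \frac{1}{k} \sum_{i=1}^{n} \bar{x}_i < \frac{n}{k}.
```

Under these assumptions the shuffle changes only the order of the messages, so the analyzer recovers the discretized sum exactly, and the remaining error comes only from rounding and shrinks as $k$ grows.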

This review was generated by AI and checked by a human editor.