QUICK REVIEW

[Paper Summary] Turbo-Aggregate: Breaking the Quadratic Aggregation Barrier in Secure Federated Learning

Jinhyun So, Başak Güler|arXiv (Cornell University)|Feb 11, 2020
Privacy-Preserving Technologies in Data | References: 57 | Cited by: 31
One-Sentence Summary

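Turbo-Aggregate reduces the secure aggregation overhead in federated learning from O(N^2) to O(N log N), is robust to a user dropout rate of up to 50%, and achieves up to a 40× speedup with up to 200 users.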

ABSTRACT

Federated learning is a distributed framework for training machine learning models over the data residing at mobile devices, while protecting the privacy of individual users. A major bottleneck in scaling federated learning to a large number of users is the overhead of secure model aggregation across many users. In particular, the overhead of the state-of-the-art protocols for secure model aggregation grows quadratically with the number of users. In this paper, we propose the first secure aggregation framework, named Turbo-Aggregate, that in a network with $N$ users achieves a secure aggregation overhead of $O(N\log{N})$, as opposed to $O(N^2)$, while tolerating up to a user dropout rate of $50\%$. Turbo-Aggregate employs a multi-group circular strategy for efficient model aggregation, and leverages additive secret sharing and novel coding techniques for injecting aggregation redundancy in order to handle user dropouts while guaranteeing user privacy. We experimentally demonstrate that Turbo-Aggregate achieves a total running time that grows almost linear in the number of users, and provides up to $40\times$ speedup over the state-of-the-art protocols with up to $N=200$ users. Our experiments also demonstrate the impact of model size and bandwidth on the performance of Turbo-Aggregate.
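
To make the gap between the two growth rates concrete, here is a minimal Python sketch that counts user-to-user messages under two simplified models. It assumes, beyond what the abstract states, that each group holds roughly log2(N) users and that every user sends one message to every user in the next group on the circle. The helper names `pairwise_messages` and `grouped_messages` are hypothetical. The counts illustrate asymptotic growth only and are not the paper's measured running times.

```python
import math


def pairwise_messages(n):
    """Toy model of a quadratic protocol: every user messages every other user."""
    return n * (n - 1)


def grouped_messages(n, group_size=None):
    """Toy model of a multi-group circular layout.

    Assumption (not from the source): groups have about log2(n) users and each
    user sends one message to every user in the next group on the circle.
    """
    if group_size is None:
        group_size = max(1, round(math.log2(n)))
    # Sizes of consecutive groups; the last group may be smaller.
    sizes = [min(group_size, n - start) for start in range(0, n, group_size)]
    num_groups = len(sizes)
    # Group l sends to group (l + 1) mod L, which closes the circle.
    return sum(sizes[l] * sizes[(l + 1) % num_groups] for l in range(num_groups))


if __name__ == "__main__":
    for n in (50, 100, 200):
        print(n, pairwise_messages(n), grouped_messages(n))
```

For N = 200 this toy model gives 39,800 pairwise messages against 1,600 grouped ones. That ratio is a property of the simplified count and should not be read as the reported 40× speedup, which is a measured running time.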

Research Motivation and Objectives

  • Motivate the need for scalable secure aggregation in federated learning with many users.
  • Propose an aggregation framework that reduces overhead from quadratic to near-linear in N.
  • Ensure privacy of individual updates under strong collusion and dropout scenarios.
  • Demonstrate robustness to up to 50% user dropout while preserving accuracy.
  • Empirically evaluate performance and show substantial speedups and bandwidth sensitivity.

Proposed Method

  • Introduce a multi-group circular aggregation structure to divide users into L groups and perform staged aggregation.
  • Use additive secret sharing to mask individual models and protect privacy against colluding servers/users.
  • Incorporate Lagrange coding to inject aggregation redundancy, enabling recovery under dropouts via polynomial interpolation.
  • Support both centralized and decentralized (peer-to-peer) communication architectures.
  • Provide a formal analysis of aggregation overhead, dropout robustness, and privacy guarantees.
  • Show experimental evaluation up to N=200 users with nearly linear running time and up to 40× speedup over the state-of-the-art.
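
The additive secret sharing idea in the list above can be sketched in a few lines of Python. This is a generic illustration, not the paper's protocol: it omits the multi-group chain and the coded redundancy, and the modulus `P`, the number of share holders `k`, and the function names are all assumptions made for the example. Model entries are assumed to be already quantized to integers in the range [0, P).

```python
import secrets

P = 2**31 - 1  # illustrative prime modulus; an assumption, not taken from the paper


def share(vec, k):
    """Split an integer vector into k additive shares that sum to vec mod P."""
    random_shares = [[secrets.randbelow(P) for _ in vec] for _ in range(k - 1)]
    # The last share is chosen so that all k shares add up to the original vector.
    last = [(v - sum(s[i] for s in random_shares)) % P for i, v in enumerate(vec)]
    return random_shares + [last]


def add(a, b):
    """Element-wise addition mod P."""
    return [(x + y) % P for x, y in zip(a, b)]


def secure_sum(models, k):
    """Each user sends one share to each of k holders; holders only see sums."""
    dim = len(models[0])
    holders = [[0] * dim for _ in range(k)]
    for model in models:
        for j, s in enumerate(share(model, k)):
            holders[j] = add(holders[j], s)
    # Combining the holders' partial sums reveals only the aggregate.
    total = [0] * dim
    for partial in holders:
        total = add(total, partial)
    return total
```

Any k - 1 shares of a vector are uniformly random, so a single holder learns nothing about an individual model; only the combined total is meaningful. For example, `secure_sum([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 3)` returns `[12, 15, 18]`.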

Experimental Results

Research Questions

  • RQ1: Can the aggregation overhead in secure federated learning be reduced from O(N^2) to O(N log N) without sacrificing privacy?
  • RQ2: What levels of dropout robustness and collusion resilience are achievable in a scalable secure aggregation protocol?
  • RQ3: How can additive secret sharing and coding techniques enable reliable aggregation in the presence of user dropouts?
  • RQ4: Do centralized and decentralized communication architectures support Turbo-Aggregate with the same privacy and performance guarantees?
  • RQ5: What is the practical performance impact (e.g., running time, bandwidth sensitivity) of Turbo-Aggregate for realistic N (up to hundreds of users)?

Key Findings

  • The aggregation overhead is O(N log N), significantly reducing communication and computation costs relative to O(N^2) protocols.
  • The protocol tolerates up to 50% user dropout with high probability across stages.
  • The protocol provides strong information-theoretic privacy for individual updates under collusion of up to T = N/2 users.
  • Experimental results on Amazon EC2 show almost linear running time with N, and up to 40× speedup over the prior state-of-the-art for N = 200.
  • Bandwidth constraints affect performance but Turbo-Aggregate maintains substantial gains under limited bandwidth.
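
The dropout-tolerance finding above rests on coded redundancy that is recovered by polynomial interpolation. The following is a minimal sketch of that general mechanism, assuming a plain polynomial-evaluation code over a prime field rather than the paper's exact encoding. The modulus `P`, the evaluation points 1..n, and the names `lagrange_eval`, `encode`, and `decode` are hypothetical choices for illustration.

```python
P = 2**31 - 1  # illustrative prime modulus; an assumption, not taken from the paper


def lagrange_eval(points, x, p=P):
    """Evaluate at x the unique lowest-degree polynomial through points mod p."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        # Fermat inverse of den; valid because p is prime and the xi are distinct.
        total = (total + yi * num * pow(den, p - 2, p)) % p
    return total


def encode(data, n):
    """Place k data symbols at x = 1..k and extend to n coded symbols at x = 1..n."""
    points = list(enumerate(data, start=1))
    return [(x, lagrange_eval(points, x)) for x in range(1, n + 1)]


def decode(survivors, k):
    """Recover the k data symbols from any k surviving coded symbols."""
    if len(survivors) < k:
        raise ValueError("too many dropouts to recover the data")
    points = survivors[:k]
    return [lagrange_eval(points, x) for x in range(1, k + 1)]
```

With n = 2k coded symbols, any k survivors suffice, which mirrors a 50% dropout budget in this toy setting. For instance, encoding `[5, 17, 42]` into six symbols and then discarding the first three still lets `decode` return `[5, 17, 42]`.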


This summary was generated by AI and reviewed by human editors.