QUICK REVIEW

[Paper Review] Robust Federated Learning: The Case of Affine Distribution Shifts

Amirhossein Reisizadeh, Farzan Farnia | arXiv (Cornell University) | Jun 16, 2020

Privacy-Preserving Technologies in Data | References: 41 | Cited by: 64

One-Sentence Summary

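Introduces FLRA, a federated learning framework that is robust to affine distribution shifts across devices, solved with a gradient descent-ascent method (FedRobust) and backed by convergence and generalization guarantees.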

ABSTRACT

Federated learning is a distributed paradigm that aims at training models using samples distributed across multiple users in a network while keeping the samples on users' devices with the aim of efficiency and protecting users' privacy. In such settings, the training data is often statistically heterogeneous and manifests various distribution shifts across users, which degrades the performance of the learnt model. The primary goal of this paper is to develop a robust federated learning algorithm that achieves satisfactory performance against distribution shifts in users' samples. To achieve this goal, we first consider a structured affine distribution shift in users' data that captures the device-dependent data heterogeneity in federated settings. This perturbation model is applicable to various federated learning problems such as image classification where the images undergo device-dependent imperfections, e.g., different intensity, contrast, and brightness. To address affine distribution shifts across users, we propose a Federated Learning framework Robust to Affine distribution shifts (FLRA) that is provably robust against affine Wasserstein shifts to the distribution of observed samples. To solve the FLRA's distributed minimax problem, we propose a fast and efficient optimization method and provide convergence guarantees via a Gradient Descent Ascent (GDA) method. We further prove generalization error bounds for the learnt classifier to show proper generalization from empirical distribution of samples to the true underlying distribution. We perform several numerical experiments to empirically support FLRA. We show that an affine distribution shift indeed suffices to significantly decrease the performance of the learnt classifier in a new test user, and our proposed algorithm achieves a significant gain in comparison to standard federated learning and adversarial training methods.

Motivation and Objectives

Motivate robust federated learning under device-dependent data heterogeneity and affine distribution shifts.
Model user-device data shifts as affine transformations to capture realistic perturbations.
Formulate a minimax robust learning problem and derive a scalable FedRobust solution.
Provide convergence guarantees and generalization bounds for the learned classifier under affine shifts.

Proposed Method

Model data at node i as affine-transformed from a universal distribution: x^i -> Lambda^i x^i + delta^i.
Formulate a minimax objective that minimizes over the model parameters w and maximizes over each node's affine parameters (Lambda^i, delta^i), with a penalty coefficient lambda that limits how far the affine shift can deviate.
Develop FedRobust: a gradient descent-ascent algorithm that updates affine parameters locally and the classifier w with periodic server averaging to reduce communication.
Prove convergence to stationary points under Polyak-Lojasiewicz (PL) conditions for two loss classes (PL-PL and nonconvex-PL).
Establish a PAC-Bayes style generalization bound dependent on Lipschitz/smoothness of the classifier and neural network spectral regularization.
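
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a 1-D linear model with squared loss (so Lambda^i is a scalar), a quadratic penalty of the form lambda*(Lambda^i - 1)^2 + lambda*(delta^i)^2, and synthetic data. The function names, step sizes, and toy data generator are all hypothetical.

```python
import random

def local_grads(w, lam_i, delta_i, data, penalty):
    """Gradients of one node's penalized objective (assumed form).

    Per-sample loss: 0.5 * (w * (Lambda*x + delta) - y)^2
    Node objective:  mean loss - penalty*(Lambda - 1)^2 - penalty*delta^2
    """
    g_w = g_lam = g_delta = 0.0
    for x, y in data:
        x_shift = lam_i * x + delta_i      # affine-perturbed sample
        r = w * x_shift - y                # residual of the linear model
        g_w += r * x_shift
        g_lam += r * w * x
        g_delta += r * w
    m = len(data)
    g_lam = g_lam / m - 2.0 * penalty * (lam_i - 1.0)
    g_delta = g_delta / m - 2.0 * penalty * delta_i
    return g_w / m, g_lam, g_delta

def fed_robust_sketch(node_data, steps=200, tau=5,
                      eta_w=0.05, eta_psi=0.01, penalty=10.0):
    """Local descent on w, local ascent on (Lambda^i, delta^i),
    and averaging of w across nodes every tau steps."""
    n = len(node_data)
    w = [0.0] * n        # each node's local copy of the model
    lam = [1.0] * n      # Lambda^i starts at the identity (no scaling)
    delta = [0.0] * n    # delta^i starts at zero (no shift)
    for t in range(1, steps + 1):
        for i in range(n):
            g_w, g_lam, g_delta = local_grads(
                w[i], lam[i], delta[i], node_data[i], penalty)
            w[i] -= eta_w * g_w            # descent step on the model
            lam[i] += eta_psi * g_lam      # ascent step on the affine scale
            delta[i] += eta_psi * g_delta  # ascent step on the affine shift
        if t % tau == 0:                   # periodic server averaging of w only
            w_avg = sum(w) / n
            w = [w_avg] * n
    return w, lam, delta

def make_toy_data(n_nodes=4, m=20, seed=0):
    """Synthetic nodes: each observes x under its own affine distortion."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_nodes):
        scale = rng.uniform(0.8, 1.2)
        shift = rng.uniform(-0.2, 0.2)
        node = []
        for _ in range(m):
            x = rng.uniform(-1.0, 1.0)
            node.append((scale * x + shift, 2.0 * x))
        data.append(node)
    return data

node_data = make_toy_data()
w, lam, delta = fed_robust_sketch(node_data)
```

Only w is communicated and averaged; the affine parameters stay on each node, which mirrors the local/global split described in the list above. The penalty value is chosen large enough that the inner maximization stays concave for this toy model.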

Experimental Results

Research Questions

RQ1: How does affine distribution shift impact federated learning performance across devices?
RQ2: Can a minimax formulation with device-specific affine perturbations yield robust classifiers in FL?
RQ3: What are the convergence properties of a federated gradient descent-ascent method under affine perturbations?
RQ4: What generalization guarantees can be derived for classifiers trained under affine-robust federated objectives?

Key Findings

FLRA improves robustness under affine distribution shifts versus standard FedAvg and adversarial training in image tasks.
FedRobust converges to a stationary point of the minimax objective under PL conditions.
The framework yields a generalization bound for multi-layer neural networks based on Lipschitzness and spectral regularization.
Robustness properties connect the minimax objective to distributionally robust optimization with Wasserstein-like transport costs.

This review was generated by AI and reviewed by a human editor.