QUICK REVIEW

[Paper Review] Federated Continual Learning with Weighted Inter-client Transfer

Jaehong Yoon, Wonyong Jeong | arXiv (Cornell University) | Mar 6, 2020

Domain Adaptation and Few-Shot Learning | References: 39 | Cited by: 31

One-Sentence Summary

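FedWeIT decomposes model weights into global, base, and task-adaptive parts, enabling selective knowledge transfer between clients and sparse communication to improve federated continual learning.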

ABSTRACT

There has been a surge of interest in continual learning and federated learning, both of which are important in deep neural networks in real-world scenarios. Yet little research has been done regarding the scenario where each client learns on a sequence of tasks from a private local data stream. This problem of federated continual learning poses new challenges to continual learning, such as utilizing knowledge from other clients, while preventing interference from irrelevant knowledge. To resolve these issues, we propose a novel federated continual learning framework, Federated Weighted Inter-client Transfer (FedWeIT), which decomposes the network weights into global federated parameters and sparse task-specific parameters, and each client receives selective knowledge from other clients by taking a weighted combination of their task-specific parameters. FedWeIT minimizes interference between incompatible tasks, and also allows positive knowledge transfer across clients during learning. We validate our FedWeIT against existing federated learning and continual learning methods under varying degrees of task similarity across clients, and our model significantly outperforms them with a large reduction in the communication cost. Code is available at https://github.com/wyjeong/FedWeIT

Research Motivation and Objectives

Motivate federated continual learning (FCL) and address interference from irrelevant knowledge across clients.
Propose a decomposed parameterization to separate global, base, and task-adaptive knowledge.
Enable selective inter-client knowledge transfer via attention over task-adaptive parameters.
Improve communication efficiency through sparsity while maintaining or improving task performance.
Demonstrate superior performance and faster adaptation across varied task similarity scenarios.

Proposed Method

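Decompose the local model of client c on task t as: θ_c^(t) = B_c^(t) ⊙ m_c^(t) + A_c^(t) + Σ_{i≠c} Σ_{j<|t|} α_{i,j}^(t) A_i^(j), where B is the base parameter, m is a sparse mask, A is the task-adaptive parameter, and α_{i,j} is the attention weight on the task-adaptive parameter of client i for task j.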
Use a dense global parameter θ_G derived from aggregating sparsified B_c^(t) ⊙ m_c^(t) across clients.
Represent knowledge transfer via sparse, attention-weighted aggregation of task-adaptive parameters A from other clients.
Regularize training with sparsity constraints on masks m_c^(t) and A^(t) and retroactive updates to maintain past task solutions.
Transmit only sparse, high-impact parameters to minimize communication costs.
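
The decomposition listed above can be illustrated with a minimal sketch. It is not the authors' implementation: parameters are flattened into plain Python lists for brevity, the function names `compose_local_weights` and `aggregate_global` are hypothetical, and the server-side step is assumed to be a simple mean over clients, which the summary does not specify.

```python
from typing import Dict, List, Tuple

Vector = List[float]  # a flattened parameter tensor (simplifying assumption)


def compose_local_weights(
    base: Vector,
    mask: Vector,
    own_adaptive: Vector,
    foreign_adaptive: Dict[Tuple[int, int], Vector],
    attention: Dict[Tuple[int, int], float],
) -> Vector:
    """Return theta = base * mask + own_adaptive + sum(alpha[i, j] * A[i, j]).

    Keys of `foreign_adaptive` and `attention` are (client i, task j) pairs
    referring to task-adaptive parameters received from other clients.
    """
    # Element-wise masked base parameters plus this client's own adaptive part.
    theta = [b * m + a for b, m, a in zip(base, mask, own_adaptive)]
    # Add each foreign task-adaptive vector, scaled by its attention weight.
    for key, adaptive in foreign_adaptive.items():
        alpha = attention[key]
        for k, value in enumerate(adaptive):
            theta[k] += alpha * value
    return theta


def aggregate_global(masked_bases: List[Vector]) -> Vector:
    """Average the clients' masked base parameters (plain mean; an assumption)."""
    n = len(masked_bases)
    return [sum(column) / n for column in zip(*masked_bases)]


theta = compose_local_weights(
    base=[1.0, 2.0, 3.0],
    mask=[1.0, 0.0, 1.0],
    own_adaptive=[0.5, 0.0, 0.0],
    foreign_adaptive={(1, 0): [0.0, 4.0, 0.0]},
    attention={(1, 0): 0.25},
)
print(theta)  # [1.5, 1.0, 3.0]
```

In this toy example the second coordinate is masked out of the base parameters, so its value comes entirely from the other client's task-adaptive vector, scaled by the attention weight. A small attention weight is how a client would suppress knowledge from an unrelated task.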

Experimental Results

Research Questions

RQ1: Can federated continual learning benefit from selectively transferring task-specific knowledge between clients?
RQ2: How should parameters be decomposed and transmitted to minimize inter-client interference and communication while preserving performance?
RQ3: Does attention-based inter-client transfer improve adaptation speed and final accuracy across diverse task similarities?
RQ4: What is the trade-off between communication cost and accuracy in FedWeIT compared to baselines across multiple datasets?
RQ5: Is the FedWeIT approach scalable to larger networks and more clients without sacrificing effectiveness?

Key Findings

FedWeIT substantially outperforms single-task, continual learning, and naive FCL baselines across Overlapped-CIFAR-100 and NonIID-50 tasks.
FedWeIT achieves faster adaptation to new tasks and lower forgetting due to selective cross-client transfer.
The attention mechanism effectively chooses beneficial task-adaptive parameters from other clients (e.g., matching similar datasets).
FedWeIT reduces communication costs by transmitting highly sparse task-adaptive and base parameters, while maintaining or improving accuracy.
Experiments with ResNet-18 show FedWeIT outperforms APD baselines with fewer parameters.
Across 100 clients, FedWeIT demonstrates strong performance gains and mitigates inter-client interference.
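
To make the communication-cost finding concrete, here is a rough sketch of why sending only the non-zero entries of a sparse parameter vector is cheaper than sending the dense vector. Magnitude thresholding and the (index, value) encoding are assumptions chosen for illustration, and `sparsify`, `densify`, and `payload_ratio` are hypothetical helper names. The summary states only that sparse parameters are transmitted, not how they are selected or encoded.

```python
from typing import List, Tuple

SparsePairs = List[Tuple[int, float]]


def sparsify(values: List[float], threshold: float) -> SparsePairs:
    """Keep entries whose magnitude exceeds `threshold`, as (index, value) pairs."""
    return [(i, v) for i, v in enumerate(values) if abs(v) > threshold]


def densify(pairs: SparsePairs, size: int) -> List[float]:
    """Rebuild the dense vector on the receiving side; dropped entries become 0."""
    dense = [0.0] * size
    for i, v in pairs:
        dense[i] = v
    return dense


def payload_ratio(pairs: SparsePairs, size: int) -> float:
    """Scalars sent in sparse form (index + value per entry) relative to dense size."""
    return 2 * len(pairs) / size


adaptive = [0.0, 0.9, 0.01, 0.0, -0.7, 0.0, 0.0, 0.0, 0.0, 0.0]
pairs = sparsify(adaptive, threshold=0.05)
print(pairs)                                # [(1, 0.9), (4, -0.7)]
print(payload_ratio(pairs, len(adaptive)))  # 0.4
```

The saving grows with sparsity: the sparse form only pays off when fewer than half of the entries survive, because each surviving entry costs an index as well as a value.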

This review was generated by AI and reviewed by a human editor.