QUICK REVIEW

[Paper Review] FedMD: Heterogenous Federated Learning via Model Distillation

Daliang Li, Junpu Wang|arXiv (Cornell University)|Oct 8, 2019

Privacy-Preserving Technologies in Data | References: 17 | Cited by: 480

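ONE-SENTENCE SUMMARY

FedMD lets participants that each use an independently designed model take part in federated learning by distilling knowledge through a shared public dataset, yielding gains over independent training and approaching the performance obtained with pooled data.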

ABSTRACT

Federated learning enables the creation of a powerful centralized model without compromising data privacy of multiple participants. While successful, it does not incorporate the case where each participant independently designs its own model. Due to intellectual property concerns and heterogeneous nature of tasks and data, this is a widespread requirement in applications of federated learning to areas such as health care and AI as a service. In this work, we use transfer learning and knowledge distillation to develop a universal framework that enables federated learning when each agent owns not only their private data, but also uniquely designed models. We test our framework on the MNIST/FEMNIST dataset and the CIFAR10/CIFAR100 dataset and observe fast improvement across all participating models. With 10 distinct participants, the final test accuracy of each model on average receives a 20% gain on top of what's possible without collaboration and is only a few percent lower than the performance each model would have obtained if all private datasets were pooled and made directly available for all participants.

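MOTIVATION AND OBJECTIVES

Motivate federated learning for settings in which each participant deploys its own model architecture.
Propose a framework that permits model heterogeneity without sharing private data or architectures.
Use transfer learning and knowledge distillation to enable collaboration across models.
Evaluate FedMD on standard datasets to show performance gains over independent training.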

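PROPOSED METHOD

A public dataset serves as the common basis for communication.
Each party pretrains its own model on the public data and then fine-tunes it on its private data (transfer learning).
Models share their class scores on the public data; a central server averages these scores to form a consensus.
Each participant updates its model to match the consensus on the public data (distillation).
The digest and revisit steps are repeated each round; for efficiency, the public data may be subsampled.
Optionally, participants can be weighted differently when forming the consensus.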
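
The steps listed above can be sketched as a single communication round. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: `LinearParty` is a hypothetical stand-in for a participant's private model (a fixed random feature map of party-specific width plus a trainable softmax layer), class scores are taken to be softmax probabilities, and "revisit" is taken to mean a few more training steps on the party's own private data. The names `fedmd_round`, `subset`, and `weights` are illustrative.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class LinearParty:
    """Hypothetical participant model (an assumption, not from the paper).

    Each party uses its own hidden width, mimicking heterogeneous
    architectures that the server never sees.
    """

    def __init__(self, in_dim, hidden, n_classes, rng):
        self.proj = rng.normal(size=(in_dim, hidden)) / np.sqrt(in_dim)
        self.W = np.zeros((hidden, n_classes))

    def features(self, x):
        return np.tanh(x @ self.proj)

    def class_scores(self, x):
        return softmax(self.features(x) @ self.W)

    def fit(self, x, targets, lr=0.5, steps=50):
        # Gradient descent on cross-entropy; targets may be one-hot labels
        # or soft consensus scores (each row sums to 1).
        h = self.features(x)
        for _ in range(steps):
            p = softmax(h @ self.W)
            self.W -= lr * h.T @ (p - targets) / len(x)

def fedmd_round(parties, private_sets, x_pub, rng, subset=None, weights=None):
    """Run one round: communicate, aggregate, digest, revisit."""
    # Optionally subsample the public data for this round.
    if subset is None:
        idx = np.arange(len(x_pub))
    else:
        idx = rng.choice(len(x_pub), size=subset, replace=False)
    x_batch = x_pub[idx]

    # Communicate: every party reports class scores on the public batch.
    scores = np.stack([p.class_scores(x_batch) for p in parties])

    # Aggregate: plain or weighted average across parties (axis 0).
    consensus = np.average(scores, axis=0, weights=weights)

    for party, (x_priv, y_priv) in zip(parties, private_sets):
        party.fit(x_batch, consensus)  # digest: align with the consensus
        party.fit(x_priv, y_priv)      # revisit: train on private data
    return consensus
```

Before the first round, each party would call `fit` on the labeled public data and then on its private data, which corresponds to the transfer-learning step. Only class scores on public inputs cross the party boundary here; model weights, hidden widths, and private samples stay local.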

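EXPERIMENTAL RESULTS

Research Questions

RQ1: Can heterogeneous models collaborate in federated learning without sharing data or architectures?
RQ2: How can knowledge be translated between different models to improve each one's performance?
RQ3: What performance gains are achievable relative to independent training and to the pooled-data upper bound?
RQ4: How does the framework perform when participants' data are independent and identically distributed (i.i.d.) versus non-i.i.d.?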

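Key Findings

In experiments on MNIST/FEMNIST and CIFAR10/CIFAR100, FedMD achieves substantial improvements over transfer learning in isolation.
With 10 participants, final test accuracy improves by about 20% on average over the non-collaborative baseline.
Performance comes close to the upper bound set by pooling all private data, falling short by only a few percentage points.
Initial results indicate that accuracy before collaboration is typically around 99% on MNIST and around 76% on CIFAR10.
FedMD remains effective in both i.i.d. and non-i.i.d. settings with heterogeneous model architectures.
In some cases, even simple models achieve competitive or higher performance within the FedMD framework.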


This review was generated by AI and reviewed by a human editor.