QUICK REVIEW

[Paper Review] Federated User Representation Learning

Duc Viet Bui, Kshitiz Malik|arXiv (Cornell University)|Sep 27, 2019

Privacy-Preserving Technologies in Data|References: 27|Cited by: 37

One-Sentence Summary

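FURL splits model parameters into a federated part and a private part, keeping user embeddings on-device to protect privacy while achieving performance close to centralized personalization in federated learning.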

ABSTRACT

Collaborative personalization, such as through learned user representations (embeddings), can improve the prediction accuracy of neural-network-based models significantly. We propose Federated User Representation Learning (FURL), a simple, scalable, privacy-preserving and resource-efficient way to utilize existing neural personalization techniques in the Federated Learning (FL) setting. FURL divides model parameters into federated and private parameters. Private parameters, such as private user embeddings, are trained locally, but unlike federated parameters, they are not transferred to or averaged on the server. We show theoretically that this parameter split does not affect training for most model personalization approaches. Storing user embeddings locally not only preserves user privacy, but also improves memory locality of personalization compared to on-server training. We evaluate FURL on two datasets, demonstrating a significant improvement in model quality with 8% and 51% performance increases, and approximately the same level of performance as centralized training with only 0% and 4% reductions. Furthermore, we show that user embeddings learned in FL and the centralized setting have a very similar structure, indicating that FURL can learn collaboratively through the shared parameters while preserving user privacy.

Motivation and Objectives

Motivate collaborative personalization and the privacy challenges in FL.
Introduce a parameter-splitting approach that keeps private user embeddings on-device.
Provide conditions under which split personalization does not hurt performance.
Empirically show FL can match centralized personalization with privacy preserved.

Proposed Method

Define federated vs private parameters in a neural personalization model.
Prove that the split-personalization constraints (independent local training and independent aggregation) guarantee no performance loss.
Describe the FURL training workflow with Federated Averaging and private parameter updates.
Show that private parameters can be locally trained and simply retained on-device during aggregation.
Evaluate using two document classification datasets with LSTM-based models and user embeddings.
Demonstrate a simple, scalable approach with minimal changes to standard FL.
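
The workflow listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: a toy linear model y ≈ w·x + e_u stands in for the LSTM, the scalar w plays the role of the federated parameters, and the per-user scalar e_u plays the role of a private user embedding. All names (local_update, furl_round, the three users, the synthetic data) are hypothetical.

```python
import random

def local_update(w, e_u, data, lr=0.05, epochs=5):
    """Local SGD on squared error for y ~ w*x + e_u on one client's data."""
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + e_u) - y
            w -= lr * err * x   # step on the federated weight
            e_u -= lr * err     # step on the private embedding
    return w, e_u

def furl_round(w_global, private, datasets):
    """One Federated Averaging round; private parameters never leave the client."""
    total = sum(len(d) for d in datasets.values())
    w_sum = 0.0
    for user, data in datasets.items():
        # Each client starts from the current global weight and its own embedding.
        w_local, private[user] = local_update(w_global, private[user], data)
        # Only the federated weight is "uploaded", weighted by local sample count.
        w_sum += len(data) * w_local
    return w_sum / total

def make_data(true_w, bias, n, rng):
    """Noise-free synthetic data: shared slope, user-specific offset."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, true_w * x + bias) for x in xs]

rng = random.Random(0)
true_biases = {"alice": 2.0, "bob": -1.0, "carol": 0.5}  # hypothetical users
datasets = {u: make_data(3.0, b, 20, rng) for u, b in true_biases.items()}

private = {u: 0.0 for u in datasets}  # on-device state, one entry per user
w = 0.0                               # server-side federated parameter
for _ in range(50):
    w = furl_round(w, private, datasets)

print("federated weight:", w)
print("private embeddings:", private)
```

In this toy setup the server only ever sees the averaged w, while each e_u is updated and stored on its own client, which mirrors the federated/private split described in the list above. The shared weight is still learned collaboratively because every client's local step on w is computed with its own embedding in place.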

Experimental Results

Research Questions

RQ1: Can personalization techniques that rely on private user embeddings be adapted to FL without sacrificing performance?
RQ2: Under what conditions does splitting parameters into federated and private parts preserve model quality?
RQ3: How does FURL compare to centralized training in accuracy and privacy preservation?
RQ4: Do user embeddings learned in FL resemble those learned centrally in structure?

Key Findings

Configuration	Sticker (AUC)	Subreddit (Accuracy)
Global Server	57.75%	28.93%
Personalized Server	65.60%	66.13%
Global FL	57.24%	11.90%
Personalized FL	65.63%	62.41%

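In the FL setting, personalization with FURL improves performance by about 8 percentage points on Sticker (AUC 57.24% to 65.63%) and about 51 percentage points on Subreddit (accuracy 11.90% to 62.41%).
Personalized FL stays close to personalized centralized training: 65.63% vs. 65.60% AUC on Sticker (no reduction) and 62.41% vs. 66.13% accuracy on Subreddit (about 4 percentage points lower).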
User embeddings learned in FL exhibit similar structures to those learned in centralized training, as shown by embedding visualizations where similar users cluster together.
Personalization via private embeddings significantly enhances performance in both server-based and FL settings.
Global aggregation time in FURL scales linearly with the number of users, which is favorable compared to quadratic scaling in some alternative approaches.
The training process preserves privacy by keeping private embeddings on-device while sharing only federated parameters.

This review was generated by AI and reviewed by a human editor.