QUICK REVIEW

[Paper Review] FedCLIP: Fast Generalization and Personalization for CLIP in Federated Learning

Lu Wang, Xixu Hu|arXiv (Cornell University)|Feb 27, 2023

Privacy-Preserving Technologies in Data | Cited by 19

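One-Sentence Summary

FedCLIP places an attention-based adapter on top of CLIP in federated learning, enabling fast generalization to unseen clients and personalized performance on participating clients while substantially reducing training and communication costs.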

ABSTRACT

Federated learning (FL) has emerged as a new paradigm for privacy-preserving computation in recent years. Unfortunately, FL faces two critical challenges that hinder its actual performance: data distribution heterogeneity and high resource costs brought by large foundation models. Specifically, the non-IID data in different clients make existing FL algorithms hard to converge while the high resource costs, including computational and communication costs that increase the deployment difficulty in real-world scenarios. In this paper, we propose an effective yet simple method, named FedCLIP, to achieve fast generalization and personalization for CLIP in federated learning. Concretely, we design an attention-based adapter for the large model, CLIP, and the rest operations merely depend on adapters. Lightweight adapters can make the most use of pretrained model information and ensure models be adaptive for clients in specific tasks. Simultaneously, small-scale operations can mitigate the computational burden and communication burden caused by large models. Extensive experiments are conducted on three datasets with distribution shifts. Qualitative and quantitative results demonstrate that FedCLIP significantly outperforms other baselines (9% overall improvements on PACS) and effectively reduces computational and communication costs (283x faster than FedAVG). Our code will be available at: https://github.com/microsoft/PersonalizedFL.

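Motivation and Objectives

Address data distribution heterogeneity in FL when using large pretrained models such as CLIP.
Develop a lightweight, adaptive mechanism that personalizes CLIP across clients without fine-tuning the full model.
Preserve the knowledge of the pretrained model, maintaining robust feature representations while still allowing task-specific adaptation.
Reduce the computational and communication load compared with federated learning over the full model.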

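Proposed Method

Introduce AttAI, an attention-based adapter for CLIP's image encoder.
On each client, extract fixed features I = f^I(x) and T = f^T(y) from the pretrained CLIP model.
Train the local adapter g_i on D_i^train to produce an attention vector att = g_i(I), then update the image feature as I* = att ∘ I (element-wise product).
Compute CLIP-style logits from the normalized I* and T with scale s, following the CLIP loss formulation.
On the server, aggregate the adapter parameters with the weighted average w^g* = sum_i (n_i / sum_j n_j) w_i^g and redistribute them to the clients.
FedCLIP optimization updates only the adapter, which reduces trainable parameters and communication compared with full-model FL.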
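
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: features and adapter parameters are flat lists of floats, the attention vector att is passed in directly rather than produced by a trained adapter network, and n_i is assumed to be the number of training samples on client i. All function names are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def apply_attention(image_feat, att):
    """I* = att ∘ I: element-wise product of attention vector and image feature."""
    return [a * x for a, x in zip(att, image_feat)]


def clip_logits(image_feat, text_feats, scale):
    """CLIP-style logits: scaled cosine similarity between one image
    feature and each text feature."""
    i = l2_normalize(image_feat)
    return [
        scale * sum(a * b for a, b in zip(i, l2_normalize(t)))
        for t in text_feats
    ]


def aggregate_adapters(client_weights, client_sizes):
    """w^g* = sum_i (n_i / sum_j n_j) w_i^g over flat parameter lists.

    client_sizes holds n_i, assumed here to be per-client sample counts.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(n / total * w[k] for w, n in zip(client_weights, client_sizes))
        for k in range(dim)
    ]


# Toy usage with made-up numbers (not from the paper).
masked = apply_attention([3.0, 4.0], [1.0, 1.0])
logits = clip_logits(masked, [[3.0, 4.0], [4.0, -3.0]], scale=100.0)
global_adapter = aggregate_adapters([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only the adapter parameters pass through aggregate_adapters; the CLIP encoders that produce I and T stay frozen on every client, which is why the communicated payload stays small.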

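Experimental Results

Research Questions

RQ1: Can a lightweight adapter on top of a pretrained CLIP model achieve both generalization to unseen clients and personalization for participating clients in FL?
RQ2: How much computation and communication can the adapter save compared with updating full CLIP backbones in a federated setting?
RQ3: How does using fixed CLIP features with task-specific attention affect cross-client generalization and local performance?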

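Key Findings

FedCLIP improves average generalization on the PACS and Office-Home datasets by about 9% over the baselines.
FedCLIP has about 283x fewer trainable parameters than FedAVG with CLIP, substantially lowering computational and communication costs.
Across tasks, FedCLIP shows strong personalization and generalization, achieving the best or near-best performance on individual clients and strong overall averages.
Experiments on three public image benchmarks demonstrate the method's robustness and effectiveness relative to baselines with CLIP-based backbones and AlexNet-based baselines.
The approach can be extended to other architectures such as BERT and ViT, and its adapter-based design makes it practical to implement.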
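
To make the link between trainable parameter count and communication cost concrete, here is a small back-of-the-envelope sketch. It assumes every client downloads and uploads the trainable parameters once per round and that each parameter is a 4-byte float. The parameter counts are hypothetical round numbers chosen for illustration; they are not figures from the paper.

```python
def round_traffic_bytes(num_params, num_clients, bytes_per_param=4):
    """Bytes moved in one round, assuming each client downloads and
    uploads all trainable parameters exactly once."""
    return 2 * num_params * num_clients * bytes_per_param


def reduction_factor(full_model_params, adapter_params):
    """How many times fewer parameters are trained and exchanged."""
    return full_model_params / adapter_params


# Hypothetical sizes for illustration only (not taken from the paper).
full_params = 100_000_000
adapter_params = 500_000
clients = 4

factor = reduction_factor(full_params, adapter_params)
full_traffic = round_traffic_bytes(full_params, clients)
adapter_traffic = round_traffic_bytes(adapter_params, clients)
```

Under these assumptions per-round traffic shrinks by the same factor as the trainable parameter count, since only the adapter is exchanged.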

This review was generated by AI and checked by a human editor.