QUICK REVIEW

[Paper Review] A Privacy-Preserving Hybrid Federated Learning Framework for Financial Crime Detection

Haobo Zhang, Junyuan Hong|arXiv (Cornell University)|Feb 7, 2023

Imbalanced Data Classification Techniques | Citations: 12

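One-Sentence Summary

This paper proposes HyFL, a privacy-aware hybrid (vertical + horizontal) federated learning framework for financial crime detection. It features secure feature extraction and noise-based protection, and is evaluated on synthetic SWIFT data.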

ABSTRACT

The recent decade witnessed a surge of increase in financial crimes across the public and private sectors, with an average cost of scams of $102m to financial institutions in 2022. Developing a mechanism for battling financial crimes is an impending task that requires in-depth collaboration from multiple institutions, and yet such collaboration imposed significant technical challenges due to the privacy and security requirements of distributed financial data. For example, consider the modern payment network systems, which can generate millions of transactions per day across a large number of global institutions. Training a detection model of fraudulent transactions requires not only secured transactions but also the private account activities of those involved in each transaction from corresponding bank systems. The distributed nature of both samples and features prevents most existing learning systems from being directly adopted to handle the data mining task. In this paper, we collectively address these challenges by proposing a hybrid federated learning system that offers secure and privacy-aware learning and inference for financial crime detection. We conduct extensive empirical studies to evaluate the proposed framework's detection performance and privacy-protection capability, evaluating its robustness against common malicious attacks of collaborative learning. We release our source code at https://github.com/illidanlab/HyFL .

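Motivation and Objectives

Motivate the need for collaborative, privacy-preserving financial crime detection across multiple institutions.
Propose HyFL, a new framework that combines vertical and horizontal FL to make use of both transaction data and account data.
Evaluate the privacy risks posed by model inversion, attribute inference, and membership inference, along with the corresponding defense mechanisms.
Demonstrate the framework's effectiveness and its privacy-utility trade-off on large-scale synthetic data.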

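Proposed Method

Introduces the HyFL architecture, which consists of three types of compute nodes: a transaction client, multiple account clients, and a central server.
Uses vertical FL between the account clients and the transaction client to fuse account-derived features with transaction features.
Trains an autoencoder on account data to produce feature embeddings, then concatenates the embeddings with transaction features for the final prediction.
Applies Gaussian noise to the concatenated features during both training and inference to defend against privacy attacks.
Encrypts model parameters and uses a feature extractor to protect against attribute leakage and model inversion.
Provides a three-stage training pipeline: (i) feature learning with an autoencoder, (ii) feature extraction, and (iii) classifier training with privacy protection built in.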
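
The three stages above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`train_linear_autoencoder`, `fuse_with_noise`), the toy data shapes, the noise scale `sigma`, and the use of a linear autoencoder are all hypothetical choices made for brevity. The classifier step is omitted; the noisy fused features would be passed to whatever classifier is used.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_linear_autoencoder(X, dim, lr=0.01, epochs=300):
    """Stage (i): fit a linear autoencoder, X ~ (X @ W_enc) @ W_dec, by gradient descent."""
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, dim))
    W_dec = rng.normal(scale=0.1, size=(dim, d))
    losses = []
    for _ in range(epochs):
        Z = X @ W_enc                       # encode
        err = Z @ W_dec - X                 # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        grad_dec = Z.T @ err / n            # gradient w.r.t. decoder weights
        grad_enc = X.T @ (err @ W_dec.T) / n  # gradient w.r.t. encoder weights
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc, losses

def fuse_with_noise(tx_features, account_ids, embeddings, sigma):
    """Stage (iii) input: concatenate transaction features with the
    account embedding of each transaction, then add Gaussian noise."""
    joined = np.hstack([tx_features, embeddings[account_ids]])
    return joined + rng.normal(scale=sigma, size=joined.shape)

# Toy data (assumed shapes): 50 accounts with 6 features, 200 transactions with 4 features.
accounts = rng.normal(size=(50, 6))
accounts = (accounts - accounts.mean(axis=0)) / accounts.std(axis=0)
tx_features = rng.normal(size=(200, 4))
tx_account_ids = rng.integers(0, 50, size=200)  # account involved in each transaction

W_enc, losses = train_linear_autoencoder(accounts, dim=2)   # stage (i)
embeddings = accounts @ W_enc                               # stage (ii)
fused = fuse_with_noise(tx_features, tx_account_ids, embeddings, sigma=0.1)
print(fused.shape, losses[0], losses[-1])
```

In this sketch only the low-dimensional embedding, never the raw account features, is joined with the transaction data, and the same noise step would be applied at inference time as well as during training.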

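Experimental Results

Research Questions

RQ1: How can the HyFL framework achieve effective financial crime detection while preserving data privacy?
RQ2: What privacy risks arise during HyFL training and inference, and how can encryption, differential privacy, and noise injection mitigate them?
RQ3: How does combining account-derived embeddings with transaction features affect detection performance under privacy constraints?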

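Key Findings

HyFL balances strong detection performance with privacy protection through Gaussian noise and encrypted parameter aggregation.
The combination of noise, an encoder, and encryption mitigates the risks of model inversion, membership inference, attribute inference, and feature leakage.
Experiments on the synthetic SWIFT dataset show that the framework scales to as many as 200 account clients and uses XGBoost as the classifier.
The method supports both vanilla HyFL and privacy-enhanced HyFL; the latter strengthens security at the cost of a slight drop in utility.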
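
To give a feel for how a server can aggregate parameters without seeing any single client's values, here is a small sketch. The review does not say which encryption scheme HyFL uses, so this example swaps in pairwise additive masking, a common secure-aggregation idea, purely as an illustrative stand-in. The function name `mask_updates` and the toy updates are hypothetical, and a single seeded generator simulates the pairwise secrets that real clients would agree on between themselves.

```python
import numpy as np

def mask_updates(updates, seed=0):
    """Add pairwise masks that cancel in the sum: for each pair (i, j),
    client i adds a random mask and client j subtracts the same mask."""
    rng = np.random.default_rng(seed)  # stands in for pairwise shared secrets
    masked = [u.astype(float).copy() for u in updates]
    for i in range(len(updates)):
        for j in range(i + 1, len(updates)):
            pair_mask = rng.normal(size=updates[i].shape)
            masked[i] += pair_mask
            masked[j] -= pair_mask
    return masked

# Toy parameter updates from three clients (made-up values).
client_updates = [
    np.array([0.1, -0.2, 0.3]),
    np.array([0.0, 0.5, -0.1]),
    np.array([-0.4, 0.2, 0.2]),
]

masked_updates = mask_updates(client_updates)
aggregate = np.sum(masked_updates, axis=0)  # server only sees masked updates

# The masks cancel, so the aggregate matches the sum of the raw updates.
print(np.allclose(aggregate, np.sum(client_updates, axis=0)))
```

Each masked update on its own looks like noise, while the sum across all clients recovers the true aggregate up to floating-point error.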

This review was generated by AI and checked by a human editor.