QUICK REVIEW

[Paper Review] A Privacy-Preserving Hybrid Federated Learning Framework for Financial Crime Detection

Haobo Zhang, Junyuan Hong|arXiv (Cornell University)|Feb 7, 2023

Imbalanced Data Classification Techniques|Cited by 12

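One-Sentence Summary

The paper proposes HyFL, a privacy-aware hybrid (vertical plus horizontal) federated learning framework for financial crime detection that uses secure feature extraction and noise-based protection, evaluated on synthetic SWIFT data.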

ABSTRACT

The recent decade witnessed a surge of increase in financial crimes across the public and private sectors, with an average cost of scams of $102m to financial institutions in 2022. Developing a mechanism for battling financial crimes is an impending task that requires in-depth collaboration from multiple institutions, and yet such collaboration imposed significant technical challenges due to the privacy and security requirements of distributed financial data. For example, consider the modern payment network systems, which can generate millions of transactions per day across a large number of global institutions. Training a detection model of fraudulent transactions requires not only secured transactions but also the private account activities of those involved in each transaction from corresponding bank systems. The distributed nature of both samples and features prevents most existing learning systems from being directly adopted to handle the data mining task. In this paper, we collectively address these challenges by proposing a hybrid federated learning system that offers secure and privacy-aware learning and inference for financial crime detection. We conduct extensive empirical studies to evaluate the proposed framework's detection performance and privacy-protection capability, evaluating its robustness against common malicious attacks of collaborative learning. We release our source code at https://github.com/illidanlab/HyFL .

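Motivation and Objectives

Motivation: financial crime detection requires collaboration across multiple institutions while keeping each institution's data private.
Propose a novel framework, HyFL, that combines vertical FL and horizontal FL to make use of both transaction data and account data.
Evaluate privacy risks from model inversion, attribute inference, and membership inference attacks, along with the corresponding defenses.
Demonstrate the framework's effectiveness and its privacy-utility trade-off on large-scale synthetic data.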

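Proposed Method

Introduces the HyFL architecture with three types of compute nodes: one transaction client, multiple account clients, and a central server.
Uses vertical FL between the account clients and the transaction client to fuse account-derived features with transaction features.
Trains an autoencoder on account data to produce feature embeddings, then concatenates the embeddings with transaction features for the final prediction.
Applies Gaussian noise to the concatenated features to defend against privacy attacks during both training and inference.
Encrypts model parameters and uses a feature extractor to prevent attribute leakage and model inversion.
Provides a three-stage training pipeline: (i) feature learning with an autoencoder, (ii) feature extraction, and (iii) classifier training under privacy protection.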
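The feature-learning, extraction, and noise steps listed above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the tiny linear autoencoder, the toy data shapes, the noise scale `sigma`, and every function name are assumptions made for the example. Classifier training (stage iii) is left out; the noisy fused features would simply be passed to whatever classifier is used.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_linear_autoencoder(x, dim, lr=0.05, epochs=500):
    """Fit a small linear autoencoder by gradient descent (illustrative stand-in)."""
    n, d = x.shape
    enc = rng.normal(scale=0.1, size=(d, dim))   # encoder weights
    dec = rng.normal(scale=0.1, size=(dim, d))   # decoder weights
    for _ in range(epochs):
        z = x @ enc                  # embeddings
        err = z @ dec - x            # reconstruction error
        grad_dec = z.T @ err / n
        grad_enc = x.T @ (err @ dec.T) / n
        enc -= lr * grad_enc
        dec -= lr * grad_dec
    return enc, dec

def add_gaussian_noise(features, sigma):
    """Perturb features with zero-mean Gaussian noise of standard deviation sigma."""
    return features + rng.normal(scale=sigma, size=features.shape)

# Hypothetical toy data: rows are aligned by transaction.
account_data = rng.normal(size=(64, 8))       # private to an account client
transaction_data = rng.normal(size=(64, 5))   # private to the transaction client

# Stage (i): feature learning on account data.
enc, dec = train_linear_autoencoder(account_data, dim=3)

# Stage (ii): feature extraction, so only embeddings leave the account client.
embeddings = account_data @ enc

# Fuse with transaction features, then add noise before stage (iii).
fused = np.concatenate([transaction_data, embeddings], axis=1)
protected = add_gaussian_noise(fused, sigma=0.1)
```

A larger `sigma` hides more of the underlying features but also degrades the signal the classifier sees, which is the privacy-utility trade-off the paper evaluates.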

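Experimental Results

Research Questions

RQ1: How can a hybrid vertical-horizontal FL framework achieve effective financial crime detection while protecting data privacy?
RQ2: What privacy risks arise during HyFL training and inference, and how can encryption, differential privacy, and noise injection mitigate them?
RQ3: Under privacy constraints, how much does combining account-derived embeddings with transaction features affect detection performance?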

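Key Findings

HyFL achieves strong detection performance while protecting privacy through Gaussian noise and encrypted parameter aggregation.
By combining noise, an encoder, and encryption, the framework reduces the risks of model inversion, membership inference, attribute inference, and feature leakage.
Experiments on the synthetic SWIFT dataset show that the framework scales to as many as 200 account clients, with XGBoost as the classifier.
The approach supports both a vanilla HyFL and a privacy-enhanced HyFL; the latter trades a slight loss in utility for stronger security.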
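This summary does not say which scheme HyFL uses to protect parameters during aggregation. As a stand-in, the sketch below uses pairwise additive masking, a common secure-aggregation idea, to show how a server can recover the average of client updates without seeing any individual update. It is not the paper's encryption method; the function name and the toy updates are hypothetical, and a real protocol would derive each mask from a secret shared between the two clients rather than from one shared generator.

```python
import numpy as np

def masked_updates(updates, seed=0):
    """Add pairwise masks that cancel in the sum: client i adds +m, client j adds -m."""
    rng = np.random.default_rng(seed)  # simulates pairwise shared secrets (assumption)
    masked = [u.astype(float).copy() for u in updates]
    n = len(updates)
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.normal(size=updates[i].shape)
            masked[i] += mask
            masked[j] -= mask
    return masked

# Hypothetical parameter updates from three clients.
updates = [
    np.array([0.1, 0.2]),
    np.array([0.3, -0.1]),
    np.array([-0.2, 0.4]),
]

masked = masked_updates(updates)

# The server only sees masked updates, yet their mean equals the true mean.
aggregate = np.sum(masked, axis=0) / len(masked)
```

Each masked update on its own looks like noise, while the masks cancel exactly in the sum, so the aggregate matches what plain averaging would give.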

This review was generated by AI and reviewed by human editors.