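[Paper Review] Towards Federated Graph Learning for Collaborative Financial Crimes Detection
This paper proposes a federated graph learning framework that lets multiple financial institutions collaborate on detecting global financial crime, such as money laundering, without sharing raw data. By combining graph learning with federated learning, the federated model outperforms local models by about 20% in F1 score on the UK Financial Conduct Authority (UK FCA) TechSprint data set, showing that detection accuracy can improve while data privacy is preserved.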
Financial crime is a large and growing problem, in some way touching almost every financial institution. Financial institutions are the front line in the war against financial crime and accordingly, must devote substantial human and technology resources to this effort. Current processes to detect financial misconduct have limitations in their ability to effectively differentiate between malicious behavior and ordinary financial activity. These limitations tend to result in gross over-reporting of suspicious activity that necessitates time-intensive and costly manual review. Advances in technology used in this domain, including machine learning based approaches, can improve upon the effectiveness of financial institutions' existing processes; however, a key challenge that most financial institutions continue to face is that they address financial crimes in isolation without any insight from other firms. Where financial institutions address financial crimes through the lens of their own firm, perpetrators may devise sophisticated strategies that may span across institutions and geographies. Financial institutions continue to work relentlessly to advance their capabilities, forming partnerships across institutions to share insights, patterns and capabilities. These public-private partnerships are subject to stringent regulatory and data privacy requirements, thereby making it difficult to rely on traditional technology solutions. In this paper, we propose a methodology to share key information across institutions by using a federated graph learning platform that enables us to build more accurate machine learning models by leveraging federated learning and also graph learning approaches. We demonstrated that our federated model outperforms the local model by 20% on the UK FCA TechSprint data set. This new platform opens up a door to efficiently detecting global money laundering activity.
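Motivation and Goals
- Overcome the limits of financial crime detection performed in isolation within a single institution by enabling cross-institution collaboration.
- Work around the data privacy and regulatory restrictions that prevent financial institutions from sharing transaction data directly.
- Develop a scalable, privacy-preserving machine learning framework that uses both graph-based features and federated learning.
- Demonstrate, on real-world benchmark data, that federated graph learning is effective at detecting global financial crime patterns.
- Support compliance with financial regulation through transparent, interpretable models.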
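Proposed Method
- The framework uses a federated learning architecture: each financial institution trains a local model on its own data, and only model parameters are aggregated centrally. No raw data is shared.
- Graph features are extracted from the transaction network, including topological measures such as PageRank and the number of suspicious entities within a node's local neighborhood (its egonet).
- A neural network with two fully connected sigmoid layers and a final sigmoid output layer is trained with binary cross-entropy loss; the majority class is undersampled to balance the training data.
- The system simulates a multi-institution setting with six banks and one central aggregator, using a federated learning protocol inspired by Truex et al. (2018).
- Model performance is evaluated on balanced local test sets and on an imbalanced global test set, the latter reflecting the real-world data distribution.
- Experiments compare local models, local models with graph features, and the globally aggregated federated model in order to isolate the effect of collaboration.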
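The steps above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation, and several parts are assumptions made for illustration: PageRank is computed by plain power iteration, the egonet is taken as the 1-hop undirected neighborhood, a single sigmoid unit stands in for the paper's network with two sigmoid layers, the aggregator uses a sample-size-weighted mean of parameters (FedAvg-style), and the training data is synthetic. All function names are hypothetical.

```python
import numpy as np

def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank; adj[i, j] = 1 means account i paid account j."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1, keepdims=True)
    # Accounts with no outgoing payments spread their rank uniformly.
    trans = np.where(out_deg > 0, adj / np.where(out_deg > 0, out_deg, 1.0), 1.0 / n)
    rank = np.full(n, 1.0 / n)
    for _ in range(iters):
        rank = (1.0 - damping) / n + damping * (trans.T @ rank)
    return rank

def egonet_suspicious_count(adj, suspicious):
    """Count flagged accounts among each node's direct neighbors (assumed 1-hop egonet)."""
    undirected = ((adj + adj.T) > 0).astype(float)
    return undirected @ suspicious

def local_train(params, X, y, lr=0.5, epochs=50):
    """Gradient descent on binary cross-entropy for one sigmoid unit (simplified model)."""
    w, b = params
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        err = p - y  # gradient of the loss with respect to the logit
        w = w - lr * (X.T @ err) / len(y)
        b = b - lr * err.mean()
    return w, b

def federated_average(local_params, sizes):
    """Sample-size-weighted mean of parameters (assumed aggregation rule)."""
    total = sum(sizes)
    w = sum(n * p[0] for p, n in zip(local_params, sizes)) / total
    b = sum(n * p[1] for p, n in zip(local_params, sizes)) / total
    return w, b

def federated_round(global_params, bank_data):
    """Each bank trains from the shared parameters; only parameters reach the aggregator."""
    updates = [local_train(global_params, X, y) for X, y in bank_data]
    sizes = [len(y) for _, y in bank_data]
    return federated_average(updates, sizes)

# Toy transaction graph with four accounts; account 3 is flagged as suspicious.
adj = np.array([[0, 1, 1, 0],
                [0, 0, 1, 0],
                [1, 0, 0, 1],
                [0, 0, 0, 0]], dtype=float)
suspicious = np.array([0.0, 0.0, 0.0, 1.0])
graph_features = np.column_stack([pagerank(adj),
                                  egonet_suspicious_count(adj, suspicious)])

# Six simulated banks with synthetic two-feature data (stand-ins for graph features).
rng = np.random.default_rng(0)
banks = []
for _ in range(6):
    X = rng.normal(size=(40, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)
    banks.append((X, y))

params = (np.zeros(2), 0.0)
for _ in range(5):
    params = federated_round(params, banks)
```

In a real deployment each bank would build `X` from features such as `graph_features` computed on its own transaction graph, and the raw `X` and `y` would never leave the bank. Only the `(w, b)` pairs returned by `local_train` are sent to the aggregator.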
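Experimental Results
Research Questions
- RQ1: Can federated graph learning improve the accuracy of financial crime detection across multiple institutions without sharing raw transaction data?
- RQ2: How does adding graph-based topological features affect model performance in detecting suspicious financial activity?
- RQ3: On realistic, imbalanced data sets, to what extent does collaborative training through federated learning outperform isolated local training?
- RQ4: Can a privacy-preserving federated learning framework be applied effectively to complex financial crime detection tasks that are subject to regulatory and data privacy constraints?
- RQ5: How large is the performance gain of the globally aggregated model over local models when tested on a combined, imbalanced data set that reflects real-world conditions?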
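Key Findings
- The federated graph learning model reaches an F1 score of 0.769 on the global test set, against an average of 0.550 for the institutions' local models. The source summarizes this as a 20% improvement; the two quoted scores differ by about 0.22 in absolute F1.
- Adding graph features improves performance markedly: F1 rises from 0.550 without graph features (average across institutions) to 0.761–0.769 with graph features.
- Local models trained on balanced data achieve high accuracy (above 95%) on their own test sets, but performance drops sharply when they are evaluated on the full imbalanced data set (F1 ≈ 0.55), which highlights the challenge of class imbalance.
- The aggregated federated model maintains strong performance on the imbalanced global test set (F1 = 0.769), demonstrating the benefit of collaborative learning across institutions.
- Balancing the training data through undersampling improved model generalization, especially for the minority class (financial criminals), whose label appears on only 0.4% of accounts.
- The framework preserves data privacy by ensuring that institutions exchange only model parameters and never share raw data.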
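The gap between balanced and imbalanced evaluation reported above follows from how precision depends on class prevalence. The short calculation below is illustrative only: the 95% true-positive and true-negative rates are hypothetical values, not figures from the paper, and `f1_at_prevalence` is a helper name introduced here. Only the 0.4% prevalence comes from the summary above.

```python
def f1_at_prevalence(tpr, tnr, prevalence):
    """F1 score implied by fixed per-class rates at a given positive-class share."""
    tp = tpr * prevalence                # true positives, as a fraction of all accounts
    fn = (1.0 - tpr) * prevalence        # missed positives
    fp = (1.0 - tnr) * (1.0 - prevalence)  # false alarms on legitimate accounts
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)

# Same hypothetical classifier, two test distributions.
balanced_f1 = f1_at_prevalence(0.95, 0.95, 0.5)      # balanced test set
imbalanced_f1 = f1_at_prevalence(0.95, 0.95, 0.004)  # 0.4% of accounts flagged
print(f"balanced F1: {balanced_f1:.3f}, imbalanced F1: {imbalanced_f1:.3f}")
```

With these assumed rates the F1 score is 0.95 on the balanced set but falls to roughly 0.13 at 0.4% prevalence, because the 5% false-alarm rate now applies to 99.6% of accounts and swamps the true positives. This is why a model that looks strong on a balanced local test set can score far lower on an imbalanced global one.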
This review was generated by AI and reviewed by a human editor.