QUICK REVIEW

[Paper Review] A Comparison Study of Credit Card Fraud Detection: Supervised versus Unsupervised

Xuetong Niu, Li Wang | arXiv (Cornell University) | 2019-04-24

Imbalanced Data Classification Techniques | References: 22 | Citations: 72

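One-Line Summary

This paper compares six supervised and four unsupervised models on a Kaggle credit card transaction dataset, evaluating each by AUROC under 5-fold cross-validation, and reports that the supervised models perform slightly better overall than the unsupervised ones.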

ABSTRACT

Credit card has become popular mode of payment for both online and offline purchase, which leads to increasing daily fraud transactions. An Efficient fraud detection methodology is therefore essential to maintain the reliability of the payment system. In this study, we perform a comparison study of credit card fraud detection by using various supervised and unsupervised approaches. Specifically, 6 supervised classification models, i.e., Logistic Regression (LR), K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Decision Tree (DT), Random Forest (RF), Extreme Gradient Boosting (XGB), as well as 4 unsupervised anomaly detection models, i.e., One-Class SVM (OCSVM), Auto-Encoder (AE), Restricted Boltzmann Machine (RBM), and Generative Adversarial Networks (GAN), are explored in this study. We train all these models on a public credit card transaction dataset from Kaggle website, which contains 492 frauds out of 284,807 transactions. The labels of the transactions are used for supervised learning models only. The performance of each model is evaluated through 5-fold cross validation in terms of Area Under the Receiver Operating Curves (AUROC). Within supervised approaches, XGB and RF obtain the best performance with AUROC = 0.989 and AUROC = 0.988, respectively. While for unsupervised approaches, RBM achieves the best performance with AUROC = 0.961, followed by GAN with AUROC = 0.954. The experimental results show that supervised models perform slightly better than unsupervised models in this study. Anyway, unsupervised approaches are still promising for credit card fraud transaction detection due to the insufficient annotation and the data imbalance issue in real-world applications.
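
To make the imbalance in the abstract concrete, the short calculation below works out the fraud rate from the two counts it states (492 frauds out of 284,807 transactions). It also shows the accuracy of a trivial baseline that labels every transaction as legitimate, which is one common reason to prefer a ranking metric such as AUROC on data like this. The variable names are illustrative only.

```python
# Class counts as stated in the abstract.
n_total = 284_807
n_fraud = 492
n_legit = n_total - n_fraud

# Share of transactions that are fraudulent.
fraud_rate = n_fraud / n_total

# Accuracy of a trivial baseline that labels every transaction as legitimate.
baseline_accuracy = n_legit / n_total

print(f"fraud rate: {fraud_rate:.4%}")
print(f"all-legitimate baseline accuracy: {baseline_accuracy:.4%}")
```

The fraud rate comes out to roughly 0.17%, so a model that never flags fraud would still be right on more than 99.8% of transactions.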

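Motivation and Objectives

Evaluate and compare the performance of supervised and unsupervised machine learning models for credit card fraud detection.
Assess how label availability, class imbalance, and annotation delay affect model performance.
Identify which model family (supervised or unsupervised) yields the higher AUROC on a real, large-scale fraud dataset.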

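Proposed Method

Evaluate six supervised models (LR, KNN, SVM, DT, RF, XGB) on labeled fraud data, using downsampling to balance the classes.
Evaluate four unsupervised models (OCSVM, AE, RBM, GAN), trained without labels, as anomaly detectors.
Use 5-fold cross-validation with AUROC as the performance metric.
Normalize Time and Amount with RobustScaler, and downsample the non-fraud cases to match the number of fraud cases (492 each).
Tune hyperparameters with grid search inside cross-validation.
Provide implementation details of the AE and GAN architectures used.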
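
The two preprocessing steps named above can be sketched in plain Python. This is a minimal illustration, not the authors' code: `robust_scale` hand-rolls the median and interquartile-range scaling that scikit-learn's RobustScaler applies by default, and `downsample_majority` keeps every fraud row plus an equally sized random sample of non-fraud rows. Both function names, the seed, and the toy values are assumptions made for this example.

```python
import random
import statistics


def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR)."""
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = (q3 - q1) or 1.0  # fall back to 1.0 if the IQR is zero
    return [(v - median) / iqr for v in values]


def downsample_majority(labels, seed=0):
    """Return sorted row indices: all fraud rows (label 1) plus an
    equally sized random sample of non-fraud rows (label 0)."""
    fraud_idx = [i for i, y in enumerate(labels) if y == 1]
    legit_idx = [i for i, y in enumerate(labels) if y == 0]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    kept_legit = rng.sample(legit_idx, k=len(fraud_idx))
    return sorted(fraud_idx + kept_legit)


# Toy data (hypothetical values, not taken from the Kaggle dataset).
amounts = [1.0, 2.5, 3.0, 4.0, 5.5, 7.0, 9.0, 12.0, 20.0, 950.0]
labels = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]

scaled_amounts = robust_scale(amounts)
balanced_idx = downsample_majority(labels)
balanced_labels = [labels[i] for i in balanced_idx]
```

Because the scaling uses the median and IQR rather than the mean and standard deviation, a single extreme amount (950.0 in the toy data) does not compress the scaled values of the other rows. On the real dataset the same downsampling would leave 492 rows in each class.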

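Experimental Results

Research Questions

RQ1: How do supervised models compare with unsupervised models, in terms of AUROC, for credit card fraud detection on a real, highly imbalanced dataset?
RQ2: Which model performs best within each category (supervised vs. unsupervised) in this study?
RQ3: Given the need for labels and the data imbalance, what are the practical trade-offs between supervised and unsupervised approaches?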

Key Results

Model	AUROC
XGB (Supervised)	0.989–0.990
RF (Supervised)	0.988
DT (Supervised)	0.95
LR (Supervised)	not specified in abstract
KNN (Supervised)	not specified in abstract
SVM (Supervised)	not specified in abstract
RBM (Unsupervised)	0.961
GAN (Unsupervised)	0.954
AE (Unsupervised)	not specified in abstract
OC-SVM (Unsupervised)	0.90

XGBoost (supervised) achieves the highest AUROC on the dataset, 0.989–0.990.
Random Forest (supervised) achieves an AUROC of 0.988.
Decision Tree (supervised) has the lowest AUROC among the supervised models, 0.95.
RBM (unsupervised) is the best unsupervised model in this study, with an AUROC of 0.961.
GAN (unsupervised) achieves an AUROC of 0.954.
OC-SVM (unsupervised) has the lowest AUROC among the unsupervised models, 0.90.
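
AUROC can compare both model families because it depends only on how the scores rank the transactions, so a classifier's fraud probability and an anomaly detector's anomaly score can be evaluated the same way, with labels needed only at evaluation time. The sketch below computes AUROC directly from its pairwise definition: the fraction of (fraud, non-fraud) pairs in which the fraud case gets the higher score, with ties counted as one half. The function name and all scores are hypothetical, and the quadratic loop is for illustration only, not for a dataset of this size.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a
    random negative; tied pairs count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for eight transactions (1 = fraud).
labels = [0, 0, 0, 0, 1, 0, 1, 0]
clf_probability = [0.05, 0.10, 0.20, 0.15, 0.90, 0.30, 0.25, 0.02]  # supervised-style output
anomaly_score = [0.8, 1.1, 0.9, 3.0, 4.2, 1.0, 2.5, 0.7]  # e.g. a reconstruction error

supervised_auroc = auroc(labels, clf_probability)
unsupervised_auroc = auroc(labels, anomaly_score)
```

The two score lists are on different scales, yet each ranks the transactions well enough to be scored on the same 0-to-1 AUROC scale, which is what allows the table above to put supervised and unsupervised models side by side.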

This review was generated by AI and reviewed by a human editor.