QUICK REVIEW

[Paper Review] Early Rug Pull Warning for BSC Meme Tokens via Multi-Granularity Wash-Trading Pattern Profiling

Dingding Cao, Bianbian Jiao|arXiv (Cornell University)|2026. 03. 14.

Financial Markets and Investment Strategies | Citations: 0

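One-Line Summary

This paper presents an end-to-end framework that builds token-level risk features from multi-granularity wash-trading patterns and applies supervised models to provide early rug-pull warnings for BSC meme tokens. Under weak supervision, Random Forest outperforms Logistic Regression.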

ABSTRACT

The high-frequency issuance and short-cycle speculation of meme tokens in decentralized finance (DeFi) have significantly amplified rug-pull risk. Existing approaches still struggle to provide stable early warning under scarce anomalies, incomplete labels, and limited interpretability. To address this issue, an end-to-end warning framework is proposed for BSC meme tokens, consisting of four stages: dataset construction and labeling, wash-trading pattern feature modeling, risk prediction, and error analysis. Methodologically, 12 token-level behavioral features are constructed based on three wash-trading patterns (Self, Matched, and Circular), unifying transaction-, address-, and flow-level signals into risk vectors. Supervised models are then employed to output warning scores and alert decisions. Under the current setting (7 tokens, 33,242 records), Random Forest outperforms Logistic Regression on core metrics, achieving AUC=0.9098, PR-AUC=0.9185, and F1=0.7429. Ablation results show that trade-level features are the primary performance driver (Delta PR-AUC=-0.1843 when removed), while address-level features provide stable complementary gain (Delta PR-AUC=-0.0573). The model also demonstrates actionable early-warning potential for a subset of samples, with a mean Lead Time (v1) of 3.8133 hours. The error profile (FP=1, FN=8) indicates that the current system is better positioned as a high-precision screener rather than a high-recall automatic alarm engine. The main contributions are threefold: an executable and reproducible rug-pull warning pipeline, empirical validation of multi-granularity wash-trading features under weak supervision, and deployment-oriented evidence through lead-time and error-bound analysis.

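Motivation and Objectives

Address the challenge of detecting rug-pull risk in high-frequency BSC meme tokens under weak supervision.
Develop a reproducible pipeline that links wash-trading patterns to actionable warning scores.
Identify which feature groups (trade-level, address-level, contract-level) drive predictive performance.
Provide deployment-oriented metrics, such as lead time and error bounds, to support risk screening.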

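Proposed Method

Construct 12 token-level behavioral features from three wash-trading patterns (Self, Matched, Circular).
Aggregate transaction-, address-, and flow-level signals into token-level risk feature vectors.
Train supervised models (Logistic Regression and Random Forest) to output warning scores and lead-time estimates.
Evaluate with accuracy, precision, recall, F1, AUC, PR-AUC, and lead time, treating PR-AUC as the key ranking metric.
Run an ablation study to assess the contribution of trade-level, address-level, and contract-level features.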
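
To make the three wash-trading patterns concrete, the sketch below counts Self, Matched, and Circular activity in a list of transfers and folds the counts into a small per-token feature dictionary. The review does not give the paper's exact pattern rules or its 12 features, so everything here is an assumption made for illustration: the record layout, the 600-second matching window, the three-address cycle definition, and the names `self_trades`, `matched_trades`, `circular_trades`, and `token_features`.

```python
from collections import defaultdict

# Hypothetical transfer record: (timestamp_seconds, sender, receiver, amount).
# The pattern definitions below are simplified assumptions, not the paper's rules.

def self_trades(transfers):
    """Count transfers whose sender and receiver are the same address."""
    return sum(1 for _, src, dst, _ in transfers if src == dst)

def matched_trades(transfers, window=600):
    """Count A->B transfers answered by a B->A transfer of equal amount
    within `window` seconds."""
    count = 0
    ordered = sorted(transfers)  # chronological order
    for i, (t1, a, b, amt1) in enumerate(ordered):
        if a == b:
            continue  # self trades are counted separately
        for t2, c, d, amt2 in ordered[i + 1:]:
            if t2 - t1 > window:
                break
            if c == b and d == a and amt2 == amt1:
                count += 1
                break
    return count

def circular_trades(transfers):
    """Count distinct address triples forming a cycle A->B->C->A
    (time order is ignored in this simplified version)."""
    out = defaultdict(set)
    for _, src, dst, _ in transfers:
        if src != dst:
            out[src].add(dst)
    cycles = set()
    for a in out:
        for b in out[a]:
            for c in out.get(b, ()):
                if c != a and a in out.get(c, ()):
                    cycles.add(frozenset((a, b, c)))
    return len(cycles)

def token_features(transfers):
    """Aggregate pattern counts into a token-level feature dictionary."""
    n = len(transfers) or 1  # avoid division by zero for empty tokens
    return {
        "self_ratio": self_trades(transfers) / n,
        "matched_ratio": matched_trades(transfers) / n,
        "circular_cycles": circular_trades(transfers),
    }

# Made-up sample data for one token.
sample = [
    (0, "A", "A", 10),   # self trade
    (10, "A", "B", 5),   # matched with the next transfer
    (20, "B", "A", 5),
    (30, "B", "C", 7),
    (40, "C", "A", 7),   # closes the cycle A->B->C->A
]
features = token_features(sample)
```

In a pipeline like the one described, a dictionary of this kind would be computed per token and time window, then passed to the supervised models as one row of the feature matrix.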

Figure 1: BSC Meme Token Rug Pull Early-warning Framework. The overall pipeline contains four stages (E1–E4): data construction and labeling, wash-trading pattern profiling, early-warning modeling, and ablation/error analysis.

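Experimental Results

Research Questions

RQ1: Can multi-granularity wash-trading features provide a stable early-warning signal for rug pulls under weak supervision?
RQ2: Which feature group contributes most to predictive performance, and how does it affect recall and precision?
RQ3: What lead time and error profile result when an early-warning model is deployed for BSC meme tokens?
RQ4: How does a nonlinear model differ from a linear baseline in capturing on-chain risk patterns?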

Key Results

Model	Accuracy	Precision	Recall	F1	AUC	PR-AUC	Lead Time (h)
Logistic Regression	0.6500	0.7059	0.5714	0.6316	0.7243	0.7397	3.8133
Random Forest	0.7750	0.9286	0.6190	0.7429	0.9098	0.9185	3.8133
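
As a consistency check on the table, the reported metrics can be reproduced from confusion-matrix counts. The abstract gives FP=1 and FN=8 for the current system; the remaining counts below (TP, TN, and all four Logistic Regression counts) are not stated in the source and are inferred here as one set of integers consistent with the rounded table values, implying a 40-sample evaluation set with 21 positives. The helper name `metrics` is illustrative.

```python
def metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts,
    rounded to four decimals like the results table."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    values = {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
    return {name: round(value, 4) for name, value in values.items()}

# FP=1 and FN=8 are reported; TP=13 and TN=18 are inferred (assumption).
rf = metrics(tp=13, fp=1, fn=8, tn=18)

# All four Logistic Regression counts are inferred (assumption).
lr = metrics(tp=12, fp=5, fn=9, tn=14)

print("Random Forest:      ", rf)
print("Logistic Regression:", lr)
```

Under these inferred counts, Random Forest misses 8 of 21 positives while raising a single false alarm, which matches the reading of the system as a high-precision screener.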

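Random Forest outperformed Logistic Regression on the core metrics (AUC 0.9098 vs. 0.7243, PR-AUC 0.9185 vs. 0.7397, F1 0.7429 vs. 0.6316).
Trade-level features are the primary performance driver: removing them lowered PR-AUC by 0.1843.
Address-level features provide a stable complementary gain: removing them lowered PR-AUC by 0.0573.
Contract-level features currently have limited discriminative power: removing them raised PR-AUC by 0.0077.
Lead Time (v1), available for a subset of samples, has a mean of 3.8133 hours and a median of about 1.0331 hours.
The error profile (FP=1, FN=8) suggests the system is better suited as a high-precision screener than as a high-recall automatic alarm engine.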
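
The gap between the mean and median lead time suggests a right-skewed distribution, where a few very early alerts pull the mean up. The sketch below shows one way lead time could be computed. The definition of Lead Time (v1) is not given in this review, so the rule used here (event time minus first alert time, counted only when the alert precedes the event) is an assumption, and the token names and timestamps are made up; they do not reproduce the paper's figures.

```python
from statistics import mean, median

def lead_times_hours(alerts, events):
    """Return lead time in hours for each token whose first alert precedes
    its rug-pull event. Both arguments map token -> Unix time in seconds.
    Assumed definition: event time minus first alert time."""
    result = {}
    for token, event_ts in events.items():
        alert_ts = alerts.get(token)
        if alert_ts is not None and alert_ts < event_ts:
            result[token] = (event_ts - alert_ts) / 3600
    return result

# Hypothetical timestamps, for illustration only.
events = {"T1": 36000, "T2": 36000, "T3": 36000, "T4": 36000}
alerts = {"T1": 32400, "T2": 34200, "T3": 7200, "T4": 39600}  # T4 fires too late

lead = lead_times_hours(alerts, events)
values = list(lead.values())
mean_lead = mean(values)
median_lead = median(values)
print(lead, mean_lead, median_lead)
```

Reporting both statistics, as the paper does, is useful because the median describes the typical warning window while the mean is sensitive to a handful of long-lead cases.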

Figure 2: Data Collection and Labeling Pipeline. The workflow includes export, token-wise merging, deduplication, normalization, window capping, rule-based labeling, and quality-control checks.

This review was generated by AI and reviewed by a human editor.