QUICK REVIEW

[Paper Review] Gradient Boosting Decision Trees on Medical Diagnosis over Tabular Data

Aytaç Yıldız, Arzu Kalaycı|arXiv (Cornell University)|2024. 09. 25.

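Artificial Intelligence in Healthcare|Citations: 6

One-Line Summary

This paper shows empirically that Gradient Boosting Decision Trees (LightGBM, XGBoost, CatBoost) outperform traditional ML and tabular DL models on average across seven tabular medical datasets, with favorable training times.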

ABSTRACT

Medical diagnosis is a crucial task in the medical field, in terms of providing accurate classification and respective treatments. Having near-precise decisions based on correct diagnosis can affect a patient's life itself, and may extremely result in a catastrophe if not classified correctly. Several traditional machine learning (ML), such as support vector machines (SVMs) and logistic regression, and state-of-the-art tabular deep learning (DL) methods, including TabNet and TabTransformer, have been proposed and used over tabular medical datasets. Additionally, due to the superior performances, lower computational costs, and easier optimization over different tasks, ensemble methods have been used in the field more recently. They offer a powerful alternative in terms of providing successful medical decision-making processes in several diagnosis tasks. In this study, we investigated the benefits of ensemble methods, especially the Gradient Boosting Decision Tree (GBDT) algorithms in medical classification tasks over tabular data, focusing on XGBoost, CatBoost, and LightGBM. The experiments demonstrate that GBDT methods outperform traditional ML and deep neural network architectures and have the highest average rank over several benchmark tabular medical diagnosis datasets. Furthermore, they require much less computational power compared to DL models, creating the optimal methodology in terms of high performance and lower complexity.

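Motivation and Objectives

Evaluate the performance of GBDT models (XGBoost, LightGBM, CatBoost) on a range of tabular medical diagnosis datasets.
Compare GBDTs against traditional ML models and state-of-the-art tabular DL models.
Analyze the trade-off between training time and performance with practical clinical deployment in mind.
Provide guidance on model selection for tabular medical data according to dataset size and characteristics.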

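Proposed Method

Preprocess the data with ordinal encoding for categorical features and standardization for numerical features.
Evaluate five traditional ML models, five DL models, and four ensemble models (three of them GBDTs), using ROC AUC as the metric.
Run 8-fold stratified cross-validation to assess generalization.
Hyperparameter optimization: for each model, evaluate roughly 36 combinations by mean ROC AUC across the cross-validation folds.
Compare the models in terms of performance and average training time.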
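
The evaluation loop described above (stratified folds, ROC AUC per fold, and a search over hyperparameter combinations ranked by mean ROC AUC) can be sketched without any external dependencies. This is only an illustration under stated assumptions, not the authors' code: the function names, the `score_fn` and `make_score_fn` callbacks, and the grid contents are all hypothetical, and a real study would more likely rely on library implementations of these steps.

```python
import itertools
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_cv_auc(labels, score_fn, k=8, seed=0):
    """Average the validation ROC AUC over k stratified folds.

    score_fn(train_idx, valid_idx) is a hypothetical callback that fits a
    model on train_idx and returns one score per index in valid_idx.
    """
    folds = stratified_folds(labels, k, seed)
    aucs = []
    for i, valid_idx in enumerate(folds):
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        valid_scores = score_fn(train_idx, valid_idx)
        aucs.append(roc_auc([labels[idx] for idx in valid_idx], valid_scores))
    return sum(aucs) / len(aucs)


def grid_search(labels, make_score_fn, grid, k=8, seed=0):
    """Try every combination in grid and keep the one with the best mean AUC.

    make_score_fn(params) is a hypothetical factory returning a score_fn
    configured with the given hyperparameters.
    """
    names = list(grid)
    best_params, best_auc = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        auc = mean_cv_auc(labels, make_score_fn(params), k, seed)
        if auc > best_auc:
            best_params, best_auc = params, auc
    return best_params, best_auc
```

A grid with, say, three hyperparameters taking 3, 3, and 4 values would yield 36 combinations, in line with the count mentioned above; the actual grids used in the paper are not given in this review. Every class needs at least `k` samples so that each fold contains both positives and negatives.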

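Experimental Results

Research Questions

RQ1: Do GBDT models achieve higher ROC AUC than traditional ML and tabular DL models across diverse medical datasets?
RQ2: Which GBDT implementation (XGBoost, LightGBM, CatBoost) offers the best balance between performance and training time?
RQ3: How does model performance on tabular medical data vary with dataset size and feature dimensionality?
RQ4: Given accuracy and efficiency, what are the practical implications for model selection in clinical decision support?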

Key Results

ROC AUC (mean ± standard deviation) by model and dataset; a lower Avg. Rank is better.

Model	CD	Heart Failure	Parkinson's	EEG Eye State	Eye Movements	Arcene	Prostate	Avg. Rank
SVM	78.715 ± 0.005	86.389 ± 0.048	88.791 ± 0.068	70.752 ± 0.013	78.405 ± 0.007	87.094 ± 0.043	91.419 ± 0.096	9.857
Logistic Regression	78.435 ± 0.005	87.571 ± 0.051	90.875 ± 0.041	61.125 ± 0.014	71.180 ± 0.009	95.211 ± 0.031	95.089 ± 0.065	8.143
KNN	69.611 ± 0.006	77.529 ± 0.067	96.857 ± 0.023	91.185 ± 0.005	72.448 ± 0.009	90.869 ± 0.065	87.822 ± 0.112	9.857
Random Forest	77.464 ± 0.005	91.233 ± 0.038	96.068 ± 0.033	98.404 ± 0.002	87.234 ± 0.007	91.153 ± 0.034	93.155 ± 0.078	6.000
Decision Tree	63.325 ± 0.006	71.646 ± 0.051	81.287 ± 0.060	83.781 ± 0.008	70.951 ± 0.009	72.037 ± 0.116	80.357 ± 0.106	12.714
LDA	70.363 ± 0.005	87.896 ± 0.053	88.609 ± 0.060	67.130 ± 0.014	71.273 ± 0.010	69.927 ± 0.124	93.849 ± 0.060	10.571
MLP [60]	80.090 ± 0.005	87.288 ± 0.056	97.186 ± 0.022	95.513 ± 0.006	73.397 ± 0.015	93.669 ± 0.042	89.881 ± 0.108	6.429
STG [37]	79.667 ± 0.004	86.241 ± 0.058	95.352 ± 0.038	84.854 ± 0.011	80.780 ± 0.006	90.584 ± 0.062	94.048 ± 0.094	7.857
TabNet [9]	77.757 ± 0.004	93.319 ± 0.037	99.446 ± 0.012	62.441 ± 0.040	87.673 ± 0.008	87.662 ± 0.098	66.865 ± 0.205	7.429
TabTransformer [36]	71.327 ± 0.123	87.642 ± 0.069	96.625 ± 0.027	79.646 ± 0.039	70.534 ± 0.010	94.724 ± 0.051	92.956 ± 0.107	8.571
VIME [38]	78.882 ± 0.004	85.758 ± 0.047	98.532 ± 0.016	92.473 ± 0.005	81.918 ± 0.008	91.721 ± 0.070	52.679 ± 0.164	7.429
XGBoost [49]	79.745 ± 0.004	90.478 ± 0.025	97.265 ± 0.023	98.331 ± 0.002	89.675 ± 0.008	89.123 ± 0.047	94.940 ± 0.055	4.429
LightGBM [50]	80.296 ± 0.004	91.490 ± 0.027	98.623 ± 0.015	97.008 ± 0.004	89.059 ± 0.007	91.883 ± 0.043	95.486 ± 0.052	2.571
CatBoost [51]	80.378 ± 0.004	91.056 ± 0.034	97.740 ± 0.014	97.739 ± 0.003	88.954 ± 0.006	91.396 ± 0.040	96.379 ± 0.053	3.143
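
The Avg. Rank column can be reproduced from the mean ROC AUC values in the table: rank the models within each dataset (1 = highest ROC AUC) and average each model's ranks over the seven datasets. The sketch below does this with the table's means; the function name `average_ranks` is hypothetical, and the simple ranking assumes no ties within a column, which holds for these values.

```python
# Mean ROC AUC values copied from the results table (standard deviations omitted).
# Column order: CD, Heart Failure, Parkinson's, EEG Eye State,
#               Eye Movements, Arcene, Prostate.
MEAN_AUC = {
    "SVM": [78.715, 86.389, 88.791, 70.752, 78.405, 87.094, 91.419],
    "Logistic Regression": [78.435, 87.571, 90.875, 61.125, 71.180, 95.211, 95.089],
    "KNN": [69.611, 77.529, 96.857, 91.185, 72.448, 90.869, 87.822],
    "Random Forest": [77.464, 91.233, 96.068, 98.404, 87.234, 91.153, 93.155],
    "Decision Tree": [63.325, 71.646, 81.287, 83.781, 70.951, 72.037, 80.357],
    "LDA": [70.363, 87.896, 88.609, 67.130, 71.273, 69.927, 93.849],
    "MLP": [80.090, 87.288, 97.186, 95.513, 73.397, 93.669, 89.881],
    "STG": [79.667, 86.241, 95.352, 84.854, 80.780, 90.584, 94.048],
    "TabNet": [77.757, 93.319, 99.446, 62.441, 87.673, 87.662, 66.865],
    "TabTransformer": [71.327, 87.642, 96.625, 79.646, 70.534, 94.724, 92.956],
    "VIME": [78.882, 85.758, 98.532, 92.473, 81.918, 91.721, 52.679],
    "XGBoost": [79.745, 90.478, 97.265, 98.331, 89.675, 89.123, 94.940],
    "LightGBM": [80.296, 91.490, 98.623, 97.008, 89.059, 91.883, 95.486],
    "CatBoost": [80.378, 91.056, 97.740, 97.739, 88.954, 91.396, 96.379],
}


def average_ranks(scores):
    """Rank models per dataset (1 = highest score) and average over datasets.

    Ties are not handled; each column is assumed to hold distinct values.
    """
    models = list(scores)
    n_datasets = len(next(iter(scores.values())))
    totals = {model: 0.0 for model in models}
    for d in range(n_datasets):
        ordered = sorted(models, key=lambda m: scores[m][d], reverse=True)
        for rank, model in enumerate(ordered, start=1):
            totals[model] += rank
    return {model: totals[model] / n_datasets for model in models}


if __name__ == "__main__":
    ranks = average_ranks(MEAN_AUC)
    for model in sorted(ranks, key=ranks.get):
        print(f"{model:20s} {ranks[model]:.3f}")
    # First three lines printed:
    # LightGBM             2.571
    # CatBoost             3.143
    # XGBoost              4.429
```

Because the ranking only uses the fold-averaged means, it says nothing about whether differences between neighboring models are statistically significant.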

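GBDT models achieve the three best average ranks across the seven datasets (LightGBM 2.571, CatBoost 3.143, XGBoost 4.429), ahead of every traditional ML and state-of-the-art tabular DL model, although other models win on individual datasets (for example, TabNet on Parkinson's and logistic regression on Arcene).
LightGBM is often among the best of the evaluated models in both ROC AUC and training time.
On average, GBDTs deliver strong performance at a lower computational cost than DL architectures.
The best GBDT variant differs by dataset, but LightGBM frequently ranks near the top and has the strongest overall figures.
DL models tend to take longer to train because of their complexity, whereas GBDTs strike a balance between accuracy and efficiency.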

This review was generated by AI and checked by a human editor.