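[Paper Review] Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation
GPFM is a generalizable pathology foundation model pretrained with unified knowledge distillation from multiple expert models; it achieves top performance (average rank 1.36) across 39 clinical tasks.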
Foundation models pretrained on large-scale datasets are revolutionizing the field of computational pathology (CPath). The generalization ability of foundation models is crucial for success in various downstream clinical tasks. However, current foundation models have only been evaluated on a limited type and number of tasks, leaving their generalization ability and overall performance unclear. To address this gap, we established a comprehensive benchmark to evaluate the performance of off-the-shelf foundation models across six distinct clinical task types, encompassing a total of 39 specific tasks, including slide-level classification, survival prediction, ROI-tissue classification, ROI retrieval, visual question answering, and report generation. Our findings reveal that existing foundation models excel at certain task types but struggle to effectively handle the full breadth of clinical tasks. To improve the generalization of pathology foundation models, we propose a unified knowledge distillation framework consisting of both expert and self-knowledge distillation, where the former allows the model to learn from the knowledge of multiple expert models, while the latter leverages self-distillation to enable image representation learning via local-global alignment. Based on this framework, we developed a Generalizable Pathology Foundation Model (GPFM), pretrained on a dataset of 190 million images extracted from approximately 86,000 publicly available whole slide images (WSIs) spanning 34 major tissue types. Evaluated on the established benchmark, GPFM achieves an average rank of 1.36, with 29 tasks ranked 1st, while the second-best model, UNI, attains an average rank of 2.96, with only 4 tasks ranked 1st.
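Research Motivation and Goals
- Establish the need for a foundation model that generalizes across diverse pathology tasks.
- Build a comprehensive benchmark that evaluates off-the-shelf pathology foundation models on 39 tasks spanning 6 clinical task types.
- Propose a unified knowledge distillation framework that combines expert knowledge distillation with self-distillation to improve generalization.
- Pretrain the Generalizable Pathology Foundation Model (GPFM) on a large, diverse WSI dataset and test its generalization.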
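Proposed Method
- Introduce a unified knowledge distillation framework that combines expert knowledge distillation with self-distillation.
- Pretrain GPFM with masked image modeling (MIM) and exponential moving average (EMA) based parameter updates.
- Build a large-scale, multi-source dataset of 190 million images from 86,104 WSIs spanning 34 tissue types.
- Evaluate on a comprehensive benchmark covering WSI classification, survival analysis, ROI tissue classification, image retrieval, VQA, and report generation.
- Compare against existing foundation models (e.g., UNI, Phikon, CONCH, CTransPath) using ranking-based statistical analysis (Wilcoxon test, Nemenyi test).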
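The combination of expert distillation, self-distillation, and EMA updates can be pictured with a small sketch. This is only an illustration under stated assumptions, not the paper's implementation: the cosine-distance expert term, the cross-entropy self-distillation term, the `expert_weight` parameter, and all function names are hypothetical, and plain Python lists stand in for feature vectors and model weights.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, student_logits):
    """Cross-entropy between a target distribution and the student's softmax."""
    student_probs = softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(target_probs, student_probs))


def cosine_distance(a, b):
    """1 - cosine similarity; 0 when the two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def ema_update(teacher, student, momentum=0.99):
    """Move each teacher weight a small step toward the student weight."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def unified_loss(student_feat, expert_feats, student_logits, teacher_logits,
                 expert_weight=1.0):
    """Assumed form: self-distillation term plus averaged expert-alignment term."""
    # Expert term: pull the student feature toward each frozen expert's feature.
    expert_term = sum(cosine_distance(student_feat, e) for e in expert_feats)
    expert_term /= len(expert_feats)
    # Self term: match the student to the EMA teacher's output distribution.
    self_term = cross_entropy(softmax(teacher_logits), student_logits)
    return self_term + expert_weight * expert_term
```

In a training loop under these assumptions, `unified_loss` would be minimized with respect to the student only, and `ema_update` would be called after each optimizer step so the teacher trails the student smoothly.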
Experimental Results
Research Questions
- RQ1: Can a pathology foundation model generalize across a wide range of tasks using unified knowledge distillation?
- RQ2: How does GPFM perform relative to existing models across 39 diverse CPath tasks?
- RQ3: What is the impact of Expert Knowledge Distillation on downstream task performance?
- RQ4: Does distilling from expert models improve cross-task robustness and generalization?
- RQ5: How does GPFM perform on external validation datasets and across different task categories (WSI classification, survival, ROI classification, retrieval, VQA, report generation)?
Key Results
- GPFM achieves an average rank of 1.36 across 39 tasks, with 29 tasks ranked 1st.
- The second-best model (UNI) has an average rank of 2.96 and 4 tasks ranked 1st.
- Wilcoxon tests show GPFM significantly outperforms other models (p < 0.001).
- In WSI classification, GPFM attains the best average AUC (0.956), balanced accuracy (0.833), and weighted F1 (0.834).
- In ROI classification, GPFM achieves the best average AUC (0.955) and leads in multiple datasets; external validation shows GPFM with average rank 1.5 over three datasets.
- GPFM shows strong performance in gene mutation prediction (e.g., LUAD-TP53, AUC 0.855; Glioma IDH1, AUC 0.998).
- An ablation study demonstrates that removing Expert Knowledge Distillation reduces AUC by 0.6%, weighted F1 by 1.8%, and balanced accuracy by 1.8% on average.
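The headline numbers above are average ranks, so it may help to see how such a figure can be computed. The sketch below is an assumption-laden illustration: the model names and scores are toy values rather than results from the paper, higher scores are assumed to be better, and tied models share the mean of the ranks they occupy, which may differ from the tie handling the authors used.

```python
def rank_models(scores):
    """Rank models on one task (1 = best); tied scores share their mean rank."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    start = 0
    while start < len(ordered):
        end = start
        # Extend the group while the next model has the same score.
        while end + 1 < len(ordered) and ordered[end + 1][1] == ordered[start][1]:
            end += 1
        shared_rank = (start + 1 + end + 1) / 2.0
        for position in range(start, end + 1):
            ranks[ordered[position][0]] = shared_rank
        start = end + 1
    return ranks


def summarize_ranks(task_scores):
    """Return (average rank per model, number of tasks each model ranked 1st)."""
    totals = {}
    first_places = {}
    for scores in task_scores:
        for model, rank in rank_models(scores).items():
            totals[model] = totals.get(model, 0.0) + rank
            first_places.setdefault(model, 0)
            if rank == 1.0:
                first_places[model] += 1
    n_tasks = len(task_scores)
    average = {model: total / n_tasks for model, total in totals.items()}
    return average, first_places


# Toy scores for three hypothetical models on three tasks (not paper data).
toy_tasks = [
    {"model_a": 0.90, "model_b": 0.80, "model_c": 0.70},
    {"model_a": 0.60, "model_b": 0.70, "model_c": 0.50},
    {"model_a": 0.80, "model_b": 0.80, "model_c": 0.90},
]
average_rank, first_place_counts = summarize_ranks(toy_tasks)
```

A paired significance test such as the Wilcoxon signed-rank test would then be run on the per-task scores of two models; `scipy.stats.wilcoxon` is one readily available implementation.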
This review was generated by AI and reviewed by a human editor.