QUICK REVIEW

[Paper Review] Longitudinal Risk Prediction in Mammography with Privileged History Distillation

Banafsheh Karimian, Alexis Guichemerre | arXiv (Cornell University) | 2026-03-16

AI in cancer detection | Citations: 0

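One-line summary

The paper introduces Privileged History Distillation (PHD), which distills full-history signals through horizon-specific teachers during training so that, at inference time, horizon-aware long-term risk prediction is possible from the current mammogram alone.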

ABSTRACT

Breast cancer remains a leading cause of cancer-related mortality worldwide. Longitudinal mammography risk prediction models improve multi-year breast cancer risk prediction based on prior screening exams. However, in real-world clinical practice, longitudinal histories are often incomplete, irregular, or unavailable due to missed screenings, first-time examinations, heterogeneous acquisition schedules, or archival constraints. The absence of prior exams degrades the performance of longitudinal risk models and limits their practical applicability. While substantial longitudinal history is available during training, prior exams are commonly absent at test time. In this paper, we address missing history at inference time and propose a longitudinal risk prediction method that uses mammography history as privileged information during training and distills its prognostic value into a student model that only requires the current exam at inference time. The key idea is a privileged multi-teacher distillation scheme with horizon-specific teachers: each teacher is trained on the full longitudinal history to specialize in one prediction horizon, while the student receives only a reconstructed history derived from the current exam. This allows the student to inherit horizon-dependent longitudinal risk cues without requiring prior screening exams at deployment. Our new Privileged History Distillation (PHD) method is validated on a large longitudinal mammography dataset with multi-year cancer outcomes, CSAW-CC, comparing full-history and no-history baselines to their distilled counterparts. Using time-dependent AUC across horizons, our privileged history distillation method markedly improves the performance of long-horizon prediction over no-history models and is comparable to that of full-history models, while using only the current exam at inference time.

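Motivation and objectives

Close the gap that arises at deployment when longitudinal history is unavailable because of missed screenings or first-time examinations.
Develop a training framework that uses the full longitudinal history as privileged information.
Build a student model that predicts multi-year risk from the current exam alone, paired with a reconstructed history.
Enable the transfer of horizon-specific long-term risk signals through multi-teacher distillation.
Demonstrate improved performance under no-history inference on the CSAW-CC dataset.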

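Proposed method

Encode each mammogram into a visit embedding using a frozen Mirai-based image encoder.
Predict the missing historical embeddings from the current exam with a history prediction module.
Aggregate the sequence (current + reconstructed history) with a longitudinal encoder and an additive hazard head to predict multi-year risk.
Train horizon-specific teacher experts on the full history, and distill their logits into the student, which sees only the reconstructed history.
Use a horizon-wise RCE loss together with KD-based logit distillation weighted by λ_l.
Optimize end to end with Adam and cosine learning-rate scheduling.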
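
The last three steps can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`additive_hazard_logits`, `phd_style_loss`), the `temperature` parameter, and all numeric inputs are hypothetical. The additive hazard head is assumed to follow the Mirai-style form (a base logit plus a running sum of non-negative yearly increments), and plain binary cross-entropy stands in for the horizon-wise "RCE" loss, whose exact definition is not given in this review.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def additive_hazard_logits(base, increments):
    """Cumulative per-horizon logits: base + running sum of ReLU(increment).

    Assumed Mirai-style head; the ReLU makes the logits non-decreasing
    from the 1-year to the 5-year horizon.
    """
    logits, total = [], base
    for h in increments:
        total += max(h, 0.0)  # ReLU keeps each yearly increment non-negative
        logits.append(total)
    return logits


def bce(target, logit, eps=1e-7):
    """Binary cross-entropy between a (possibly soft) target and sigmoid(logit)."""
    p = min(max(sigmoid(logit), eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def phd_style_loss(student_logits, teacher_logits, labels, lambdas, temperature=1.0):
    """Mean over horizons of: supervised term + lambda_l * distillation term.

    teacher_logits[l] is the logit of the teacher specialised in horizon l.
    BCE is a stand-in for the source's horizon-wise "RCE" loss (assumption).
    """
    total = 0.0
    for z_s, z_t, y, lam in zip(student_logits, teacher_logits, labels, lambdas):
        hard = bce(y, z_s)                        # supervised term on the outcome
        soft_target = sigmoid(z_t / temperature)  # teacher's softened probability
        soft = bce(soft_target, z_s / temperature)
        total += hard + lam * soft
    return total / len(labels)


# Hypothetical values for one patient and five horizons (1 to 5 years).
student = additive_hazard_logits(-3.0, [0.2, 0.5, -0.1, 0.4, 0.3])
teachers = [-2.5, -2.0, -1.8, -1.2, -0.9]  # one logit per horizon-specific teacher
labels = [0, 0, 0, 1, 1]                   # cancer diagnosed within l years?
loss = phd_style_loss(student, teachers, labels, lambdas=[0.5] * 5)
print([round(z, 2) for z in student], round(loss, 4))
```

Setting every λ_l to zero removes the distillation term and leaves only the supervised loss, which is one way to reproduce a "no KD" ablation in this toy setting.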

Figure 1 : Partial AUC at 10% FPR (pAUC@10%) for LoMaR and VMRA at 4- and 5-year horizons as a function of available screening history.
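
For readers unfamiliar with pAUC@10%, the sketch below computes a partial AUC restricted to false-positive rates of at most 10%. The review does not state which normalization the paper uses; this sketch assumes McClish standardization (the convention behind scikit-learn's `roc_auc_score` with `max_fpr`), which maps a chance-level curve to 0.5 and a perfect one to 1.0. The function name `partial_auc` and the toy inputs are hypothetical.

```python
def partial_auc(labels, scores, max_fpr=0.10):
    """Standardized partial AUC over FPR in [0, max_fpr] (McClish, assumed)."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Build ROC points by sweeping the threshold from high to low scores,
    # adding one point per distinct score so that ties are handled together.
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        s = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == s:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)

    # Trapezoidal area up to max_fpr, interpolating the segment that crosses it.
    area = 0.0
    for k in range(1, len(fpr)):
        x0, y0, x1, y1 = fpr[k - 1], tpr[k - 1], fpr[k], tpr[k]
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0

    # McClish standardization: chance level -> 0.5, perfect ranking -> 1.0.
    min_area = 0.5 * max_fpr ** 2
    max_area = max_fpr
    return 0.5 * (1.0 + (area - min_area) / (max_area - min_area))


# Toy example: labels are "cancer within the horizon", scores are predicted risks.
toy_labels = [0, 0, 0, 1, 0, 1, 1, 0]
toy_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.70, 0.80, 0.15]
print(partial_auc(toy_labels, toy_scores))
```

In a multi-horizon setting, the same function would be called once per horizon with that horizon's labels and predicted risks.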

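Experimental results

Research questions

RQ1: Can longitudinal risk signals learned from the full history be transferred to a model that uses only the current exam at inference time?
RQ2: Do horizon-specific teachers improve long-term risk prediction under no-history inference?
RQ3: How does Privileged History Distillation affect performance at the 1–5 year horizons relative to the full-history and no-history baselines?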

Key results

Model	#H	1y AUC	2y AUC	3y AUC	4y AUC	5y AUC	1y pAUC	2y pAUC	3y pAUC	4y pAUC	5y pAUC
LoMaR	4	0.914 ±0.023	0.865 ±0.020	0.851 ±0.017	0.841 ±0.019	0.851 ±0.016	0.817 ±0.023	0.749 ±0.020	0.738 ±0.018	0.731 ±0.018	0.740 ±0.018
VMRA	4	0.920 ±0.019	0.868 ±0.020	0.851 ±0.017	0.842 ±0.017	0.851 ±0.017	0.822 ±0.020	0.752 ±0.020	0.736 ±0.018	0.728 ±0.019	0.745 ±0.021
Mirai	0	0.924 ±0.020	0.872 ±0.016	0.853 ±0.015	0.837 ±0.014	0.829 ±0.015	0.824 ±0.023	0.753 ±0.019	0.735 ±0.018	0.715 ±0.018	0.711 ±0.021
LoMaR+PHD	0	0.913 ±0.022	0.865 ±0.019	0.852 ±0.016	0.845 ±0.015	0.853 ±0.015	0.810 ±0.031	0.744 ±0.024	0.734 ±0.015	0.735 ±0.020	0.752 ±0.019
VMRA+PHD	0	0.920 ±0.018	0.869 ±0.018	0.852 ±0.016	0.847 ±0.015	0.855 ±0.017	0.818 ±0.020	0.749 ±0.017	0.733 ±0.017	0.734 ±0.017	0.757 ±0.018

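The PHD-based models (LoMaR+PHD and VMRA+PHD) mitigate the performance drop when history is unavailable and come close to or match the full-history models.
Across horizons, the distilled models show larger gains at the longer horizons (4–5 years), particularly in the low-FPR regime.
Multi-teacher distillation (five teachers) yields the largest horizon-specific gains, especially at the 5-year horizon.
Compared with the no-history baseline (Mirai, #H = 0), VMRA+PHD and LoMaR+PHD achieve higher full AUC and pAUC for 4–5 year prediction (for example, 5-year AUC of 0.855 and 0.853 versus 0.829).
Initial analyses indicate that horizon-aligned distillation matters: removing KD or reducing the number of teachers shrinks the gains.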
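
The 4–5 year gains over the no-history baseline can be read directly off the results table. The short script below does that arithmetic; the numbers are the reported means copied from the table, while the dictionary layout and the helper name `gains_over` are illustrative only. It compares point estimates and says nothing about statistical significance, since the ± ranges in the table are of similar size to several of the differences.

```python
# Reported mean values from the results table (4- and 5-year columns only).
reported = {
    "Mirai":     {"4y AUC": 0.837, "5y AUC": 0.829, "4y pAUC": 0.715, "5y pAUC": 0.711},
    "LoMaR+PHD": {"4y AUC": 0.845, "5y AUC": 0.853, "4y pAUC": 0.735, "5y pAUC": 0.752},
    "VMRA+PHD":  {"4y AUC": 0.847, "5y AUC": 0.855, "4y pAUC": 0.734, "5y pAUC": 0.757},
}


def gains_over(baseline, model):
    """Difference of reported means (model minus baseline), rounded to 3 decimals."""
    return {
        metric: round(reported[model][metric] - reported[baseline][metric], 3)
        for metric in reported[model]
    }


gains = {name: gains_over("Mirai", name) for name in ("LoMaR+PHD", "VMRA+PHD")}
for name, per_metric in gains.items():
    print(name, per_metric)
```

The largest difference is the 5-year pAUC of VMRA+PHD (+0.046), and every 5-year difference exceeds its 4-year counterpart, which matches the observation that the benefit grows with the horizon.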

Figure 2 : Proposed PHD method for longitudinal risk prediction in mammography. Visit embeddings are extracted from each exam (mammogram), and missing historical embeddings are predicted from the current exam. The generated sequence is aggregated by a longitudinal model and passed to an additive haz

This review was generated by AI and reviewed by a human editor.