QUICK REVIEW

[Paper Review] Leveraging Transformers to Improve Breast Cancer Classification and Risk Assessment with Multi-modal and Longitudinal Data

Yiqiu Shen, Jungkyu Park|arXiv (Cornell University)|Nov 6, 2023

AI in cancer detection | Cited by 10

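One-Sentence Summary

Introduces the Multi-modal Transformer (MMT), which fuses mammography, ultrasound, and prior imaging to detect existing breast cancer and predict 5-year cancer risk, outperforming uni-modal and prior multi-modal baselines.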

ABSTRACT

Breast cancer screening, primarily conducted through mammography, is often supplemented with ultrasound for women with dense breast tissue. However, existing deep learning models analyze each modality independently, missing opportunities to integrate information across imaging modalities and time. In this study, we present Multi-modal Transformer (MMT), a neural network that utilizes mammography and ultrasound synergistically, to identify patients who currently have cancer and estimate the risk of future cancer for patients who are currently cancer-free. MMT aggregates multi-modal data through self-attention and tracks temporal tissue changes by comparing current exams to prior imaging. Trained on 1.3 million exams, MMT achieves an AUROC of 0.943 in detecting existing cancers, surpassing strong uni-modal baselines. For 5-year risk prediction, MMT attains an AUROC of 0.826, outperforming prior mammography-based risk models. Our research highlights the value of multi-modal and longitudinal imaging in cancer diagnosis and risk stratification.

Research Motivation and Objectives

Improve breast cancer screening by exploiting cross-modal and longitudinal data instead of analyzing each modality in isolation.
Develop a transformer-based framework (MMT) that fuses mammography and ultrasound with prior imaging to detect cancer and predict future risk.
Demonstrate that incorporating prior exams enhances long-term risk stratification.
Evaluate the model on a large multi-modal NYU dataset and compare against radiologist risk models and existing AI systems.

Proposed Method

Detect modality-specific regions of interest (ROIs) with a dedicated detector for each modality: full-field digital mammography (FFDM), digital breast tomosynthesis (DBT), and ultrasound.
Project ROI features into a shared space and concatenate with categorical embeddings (date, laterality, modality, view, age).
Use a transformer encoder with a CLS token to model temporal changes and cross-modal interactions across ROIs.
Generate six non-negative risk scores, one per successive interval (0-120d, 120d-1y, 1-2y, 2-3y, 3-4y, 4-5y), via an MLP and an additive hazard layer with sigmoid activation.
Train a detector for each modality first, then train the transformer and embeddings on multi-modal sequences; ensemble the top 5 models selected from 100 training runs.
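The additive hazard step in the list above can be illustrated with a small sketch. The review does not give MMT's exact formulation, so everything here is an assumption made for illustration: the raw scores stand in for the MLP output, softplus is used to make each increment non-negative, `base_logit` is a hypothetical baseline term, and the sigmoid is applied to the running sum. The point is only to show why adding non-negative increments yields cumulative risks that never decrease over time.

```python
import math

# Interval labels for the six risk scores.
INTERVALS = ["0-120d", "120d-1y", "1-2y", "2-3y", "3-4y", "4-5y"]


def softplus(x):
    """Numerically stable log(1 + exp(x)); always non-negative."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def additive_hazard(raw_scores, base_logit=-4.0):
    """Turn raw per-interval scores into cumulative risk probabilities.

    raw_scores: hypothetical MLP outputs, one per interval.
    base_logit: hypothetical baseline logit (an assumption, not from the paper).
    """
    # Assumption: softplus enforces non-negative increments.
    increments = [softplus(s) for s in raw_scores]
    cumulative = []
    running = base_logit
    for inc in increments:
        running += inc                       # risk can only accumulate
        cumulative.append(sigmoid(running))  # squash to a probability
    return increments, cumulative


raw = [-2.0, -1.0, 0.0, 0.5, -0.5, 1.0]  # made-up values for illustration
increments, cumulative = additive_hazard(raw)
for name, p in zip(INTERVALS, cumulative):
    print(f"cumulative risk by end of {name}: {p:.3f}")
```

Because each increment is non-negative and the sigmoid is monotonic, the predicted risk at 5 years can never fall below the predicted risk at 1 year, which is the property a hazard-style output layer is meant to guarantee.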

Experimental Results

Research Questions

RQ1: Can a multi-modal transformer combining mammography, ultrasound, and prior imaging improve detection of existing breast cancer compared to uni-modal baselines?
RQ2: Does integrating longitudinal prior imaging enhance long-term (5-year) breast cancer risk prediction beyond mammography-based models?
RQ3: How does incorporating cross-modality information affect cancer diagnosis performance and risk stratification?
RQ4: What is the contribution of prior imaging vs. current modality data to short-term vs. long-term predictions?

Key Findings

Model	AUROC	AUPRC
GMIC (Shen et al., 2021b)	0.866	0.167
YOLOX (Ge et al., 2021)	0.876	0.172
MogaNet (Li et al., 2022)	0.874	0.181
Multi-modal Ensemble	0.925	0.251
MMT	0.943	0.518

MMT achieved AUROC 0.943 and AUPRC 0.518 for cancer diagnosis, outperforming all uni-modal baselines and the multi-modal ensemble.
For 5-year risk prediction, MMT attained AUROC 0.826 and AUPRC 0.524, outperforming BI-RADS and Mirai.
Ablation shows ultrasound improves diagnosis and prior imaging mainly benefits long-term risk, with diminishing gains beyond two years.
MMT demonstrates the value of combining multi-modal imaging and longitudinal history for improved diagnosis and risk stratification.
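For readers less familiar with the two metrics reported above, the following is a minimal pure-Python sketch of how AUROC and AUPRC (estimated here as average precision) can be computed. The labels and scores are made-up toy values, not data from the paper, and the function names are illustrative; in practice a library implementation would normally be used.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win. O(n_pos * n_neg), fine for a toy example.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy data: 2 positives among 10 exams (no tied scores).
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

print(f"AUROC: {auroc(labels, scores):.4f}")              # AUROC: 0.9375
print(f"AUPRC: {average_precision(labels, scores):.4f}")  # AUPRC: 0.8333
```

A random classifier scores about 0.5 on AUROC regardless of class balance, whereas its expected AUPRC is roughly the fraction of positive cases. When cancers are rare, as in screening, AUPRC therefore tends to separate models more sharply than AUROC does, which is one way to read the larger AUPRC gaps in the table.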


This review was generated by AI and checked by human editors.