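[Paper Review] RadGPT: Constructing 3D Image-Text Tumor Datasets
RadGPT produces AbdomenAtlas 3.0, a large-scale 3D abdominal CT image-text dataset with per-voxel tumor annotations and reports. The paper also presents an anatomy-aware vision-language agent that generates structured, narrative, and fusion reports from CT scans.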
With over 85 million CT scans performed annually in the United States, creating tumor-related reports is a challenging and time-consuming task for radiologists. To address this need, we present RadGPT, an Anatomy-Aware Vision-Language AI Agent for generating detailed reports from CT scans. RadGPT first segments tumors, including benign cysts and malignant tumors, and their surrounding anatomical structures, then transforms this information into both structured reports and narrative reports. These reports provide tumor size, shape, location, attenuation, volume, and interactions with surrounding blood vessels and organs. Extensive evaluation on unseen hospitals shows that RadGPT can produce accurate reports, with high sensitivity/specificity for small tumor (<2 cm) detection: 80/73% for liver tumors, 92/78% for kidney tumors, and 77/77% for pancreatic tumors. For large tumors, sensitivity ranges from 89% to 97%. The results significantly surpass the state-of-the-art in abdominal CT report generation. RadGPT generated reports for 17 public datasets. Through radiologist review and refinement, we have ensured the reports' accuracy, and created the first publicly available image-text 3D medical dataset, comprising over 1.8 million text tokens and 2.7 million images from 9,262 CT scans, including 2,947 tumor scans/reports of 8,562 tumor instances. Our reports can: (1) localize tumors in eight liver sub-segments and three pancreatic sub-segments annotated per-voxel; (2) determine pancreatic tumor stage (T1-T4) in 260 reports; and (3) present individual analyses of multiple tumors--rare in human-made reports. Importantly, 948 of the reports are for early-stage tumors.
Motivation and Objectives
- Address the lack of publicly available abdominal CT datasets with per-voxel tumor annotations and real radiology reports.
- Develop RadGPT, an anatomy-aware vision-language agent, to generate detailed structured and narrative reports from CT scans.
- Create AbdomenAtlas 3.0, the first public dataset with per-voxel tumor annotations, organ sub-segmentation, and pancreatic cancer staging in 3D CTs.
- Enable automated report generation that aligns with radiologist templates and institutional styles, with diagnostic evaluation metrics.
- Provide benchmarks and a framework for tumor localization, measurement, staging, and fusion of structured and human-made reports.
Proposed Method
- Stage I: Segmentation of tumors and 26 anatomical structures using DiffTumor and nnU-Net with radiologist-driven refinement.
- Stage II: Structured report generation via deterministic rule-based algorithms that fill radiologist templates using segmentations and derived measurements (size, volume, attenuation).
- Stage III: Style adaptation of structured reports to target institutions' narrative styles through in-context learning with a target-hospital prompt set.
- Fusion reporting by prompting a zero-shot LLM to combine structured reports with clinical notes into comprehensive fusion reports.
- Diagnostic evaluation of AI-made reports using an LLM to extract presence/absence of tumors and compute sensitivity/specificity, enabling clinically meaningful assessment.
- Pancreatic cancer staging enabled by measuring tumor interactions with vessels (SMA, CHA, CA, portal vein) and deriving T-stages via deterministic vessel–tumor analyses.
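As a rough illustration of the Stage II idea (deterministic rules that turn a segmentation into template text), the sketch below derives volume, bounding-box extent, and mean attenuation from a binary tumor mask and fills a one-sentence template. Everything named here (`measure_tumor`, `fill_template`, `TumorMeasurement`, the template wording) is hypothetical, and using the bounding-box extent as the size measurement is a simplifying assumption; the review does not describe the paper's actual rules.

```python
from dataclasses import dataclass
from typing import List, Tuple

Volume3D = List[List[List[float]]]  # nested lists indexed as [z][y][x]


@dataclass
class TumorMeasurement:
    volume_cm3: float
    extent_mm: Tuple[float, float, float]  # bounding-box size along (z, y, x)
    mean_hu: float


def measure_tumor(mask: Volume3D, ct_hu: Volume3D,
                  spacing_mm: Tuple[float, float, float]) -> TumorMeasurement:
    """Derive simple measurements from a binary mask (hypothetical helper)."""
    # Collect the coordinates of every voxel labeled as tumor.
    voxels = [(z, y, x)
              for z, plane in enumerate(mask)
              for y, row in enumerate(plane)
              for x, value in enumerate(row) if value]
    if not voxels:
        raise ValueError("mask contains no tumor voxels")

    # Volume = voxel count x physical voxel volume (mm^3 -> cm^3).
    voxel_mm3 = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    volume_cm3 = len(voxels) * voxel_mm3 / 1000.0

    # Assumption: size is approximated by the axis-aligned bounding box.
    extent_mm = tuple(
        (max(v[axis] for v in voxels) - min(v[axis] for v in voxels) + 1)
        * spacing_mm[axis]
        for axis in range(3)
    )

    # Attenuation = mean Hounsfield units over the tumor voxels.
    mean_hu = sum(ct_hu[z][y][x] for z, y, x in voxels) / len(voxels)
    return TumorMeasurement(volume_cm3, extent_mm, mean_hu)


def fill_template(organ: str, m: TumorMeasurement) -> str:
    """Fill an illustrative (not the paper's) report template."""
    largest_cm = max(m.extent_mm) / 10.0
    return (f"{organ}: lesion measuring up to {largest_cm:.1f} cm, "
            f"volume {m.volume_cm3:.2f} cm^3, "
            f"mean attenuation {m.mean_hu:.0f} HU.")
```

Because every step is a fixed rule over the mask, the same segmentation always yields the same sentence, which is the property that distinguishes this stage from the LLM-based style adaptation in Stage III.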
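Experimental Results
Research Questions
- RQ1: Can RadGPT generate accurate, institution-tailored structured and narrative reports from per-voxel abdominal CT tumor annotations?
- RQ2: Does segmentation-based report generation outperform end-to-end abdominal CT report models in tumor detection and staging?
- RQ3: How can AI-made radiology reports be evaluated with clinically meaningful metrics beyond text similarity?
- RQ4: What value does AbdomenAtlas 3.0 add by providing per-voxel pancreatic sub-segments, peripancreatic vessels, and PDAC staging as a public dataset?
Key Results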
| Model | Sensitivity (%) (LiverHCC) | Specificity (%) (LiverHCC) | Sensitivity (%) (PancreasPDAC) | Specificity (%) (PancreasPDAC) | Sensitivity (%) (KidneyRCC) | Specificity (%) (KidneyRCC) | Sensitivity (%) (CRC Liver Mets) | Specificity (%) (CRC Liver Mets) |
|---|---|---|---|---|---|---|---|---|
| CT2Rep | 0.0 (0/301) | 100.0 (244/244) | 10.0 (22/219) | 98.0 (239/244) | 3.8 (4/105) | 96.7 (236/244) | 9.4 (2/21) | 98.1 (155/158) |
| M3D | 14.6 (44/301) | 88.9 (217/244) | 9.1 (20/219) | 92.6 (226/244) | 4.8 (5/105) | 96.3 (235/244) | 12.1 (7/58) | 85.4 (135/158) |
| RadGPT (ours) | 89.4 (269/301) | 73.4 (179/244) | 97.3 (213/219) | 78.3 (191/244) | 91.4 (96/105) | 76.6 (187/244) | 100.0 (58/58) | 70.3 (111/158) |
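The percentages in the table follow from the counts in parentheses: sensitivity is detected tumor cases over all tumor cases, and specificity is correctly cleared cases over all tumor-free cases. A minimal sketch of that computation, assuming each report has already been reduced to a "tumor reported" flag (for example by the LLM extraction step described in the method); the function name and input format are illustrative, not taken from the paper:

```python
from typing import Iterable, Tuple


def sensitivity_specificity(
        pairs: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (sensitivity %, specificity %) from (reported, truth) pairs."""
    tp = fn = tn = fp = 0
    for reported, truth in pairs:
        if truth and reported:
            tp += 1  # tumor present and reported
        elif truth:
            fn += 1  # tumor present but missed
        elif reported:
            fp += 1  # no tumor, but one was reported
        else:
            tn += 1  # no tumor, none reported
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return 100.0 * tp / (tp + fn), 100.0 * tn / (tn + fp)


# RadGPT on LiverHCC, rebuilt from the table counts:
# 269 of 301 tumor cases detected, 179 of 244 tumor-free cases cleared.
cases = ([(True, True)] * 269 + [(False, True)] * 32
         + [(False, False)] * 179 + [(True, False)] * 65)
sens, spec = sensitivity_specificity(cases)
print(f"{sens:.1f} / {spec:.1f}")  # prints "89.4 / 73.4"
```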
- RadGPT outperforms end-to-end abdominal CT report models at detecting both large and small tumors in the liver, pancreas, and kidney, and liver metastases.
- Fully automated RadGPT reports achieve far higher tumor-detection sensitivity than the M3D and CT2Rep baselines (89.4% to 100% versus 0.0% to 14.6% in the table above), at the cost of lower specificity (70.3% to 78.3% versus 85.4% to 100%).
- AbdomenAtlas 3.0 provides 9,262 CT scans with per-voxel tumor annotations across three organs and includes pancreas sub-segments and PDAC staging.
- RadGPT achieves automatic PDAC T-stage determination and provides per-voxel organ and vessel annotations to support staging.
- Radiologist evaluation shows 75.6% tumor-detection precision and 93.8% tumor-size measurement accuracy for RadGPT across evaluated cases.
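To make the automatic PDAC T-stage result more concrete, here is a minimal sketch of a deterministic staging rule. It assumes the AJCC 8th-edition T categories for pancreatic cancer (size thresholds at 2 cm and 4 cm, with T4 defined by involvement of the celiac axis, superior mesenteric artery, or common hepatic artery); the review does not confirm that the paper uses exactly these rules. The function name, the vessel abbreviations as input strings, and the idea that vessel involvement arrives as a precomputed list are all illustrative assumptions.

```python
from typing import Iterable

# Assumption: under AJCC 8th edition, only these arteries define T4.
T4_ARTERIES = frozenset({"CA", "SMA", "CHA"})


def pdac_t_stage(max_diameter_cm: float,
                 involved_vessels: Iterable[str] = ()) -> str:
    """Map tumor size and vessel involvement to a T category (sketch)."""
    if max_diameter_cm <= 0:
        raise ValueError("tumor diameter must be positive")
    # Arterial involvement overrides size.
    if T4_ARTERIES & set(involved_vessels):
        return "T4"
    if max_diameter_cm <= 2.0:
        return "T1"
    if max_diameter_cm <= 4.0:
        return "T2"
    return "T3"


print(pdac_t_stage(1.5))                    # T1
print(pdac_t_stage(4.5, ["portal vein"]))   # T3: venous contact alone is not T4 here
print(pdac_t_stage(1.5, ["SMA"]))           # T4
```

In a full pipeline, both inputs would come from the per-voxel tumor and vessel masks; how contact between the two is measured is outside the scope of this sketch.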
This review was generated by AI and reviewed by a human editor.