QUICK REVIEW

[Paper Review] Performance of a Deep Learning-Based Segmentation Model for Pancreatic Tumors on Public Endoscopic Ultrasound Datasets

Pankaj Kumar Gupta, Priya Mudgil|arXiv (Cornell University)|Jan 9, 2026

Pancreatic and Hepatic Oncology Research | Citations: 0

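One-Line Summary

A Vision Transformer-based segmentation model (HVITBackbone4Seg), trained on public EUS datasets, reached a Dice score of about 0.65 with high specificity on external evaluation, suggesting generalizability, although some failure cases remain.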

ABSTRACT

Background: Pancreatic cancer is one of the most aggressive cancers, with poor survival rates. Endoscopic ultrasound (EUS) is a key diagnostic modality, but its effectiveness is constrained by operator subjectivity. This study evaluates a Vision Transformer-based deep learning segmentation model for pancreatic tumors. Methods: A segmentation model using the USFM framework with a Vision Transformer backbone was trained and validated with 17,367 EUS images (from two public datasets) in 5-fold cross-validation. The model was tested on an independent dataset of 350 EUS images from another public dataset, manually segmented by radiologists. Preprocessing included grayscale conversion, cropping, and resizing to 512x512 pixels. Metrics included Dice similarity coefficient (DSC), intersection over union (IoU), sensitivity, specificity, and accuracy. Results: In 5-fold cross-validation, the model achieved a mean DSC of 0.651 +/- 0.738, IoU of 0.579 +/- 0.658, sensitivity of 69.8%, specificity of 98.8%, and accuracy of 97.5%. For the external validation set, the model achieved a DSC of 0.657 (95% CI: 0.634-0.769), IoU of 0.614 (95% CI: 0.590-0.689), sensitivity of 71.8%, and specificity of 97.7%. Results were consistent, but 9.7% of cases exhibited erroneous multiple predictions. Conclusions: The Vision Transformer-based model demonstrated strong performance for pancreatic tumor segmentation in EUS images. However, dataset heterogeneity and limited external validation highlight the need for further refinement, standardization, and prospective studies.

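Motivation and Objectives

Advance automated, standardized segmentation of pancreatic tumors in endoscopic ultrasound (EUS) and reduce inter-operator variability.
Develop and evaluate a Vision Transformer-based segmentation model on large public EUS datasets.
Assess generalizability through external validation on an independent public dataset.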

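Proposed Method

Apply the USFM framework with a Vision Transformer backbone (HVITBackbone4Seg) to two-class (foreground/background) segmentation.
Preprocess EUS images by grayscale conversion, cropping, and resizing to 512x512 pixels.
Train for 50 epochs under 5-fold cross-validation with the AdamW optimizer and a cosine learning-rate schedule, using the best Dice score as the early-stopping criterion.
Evaluate with the Dice similarity coefficient (DSC), IoU, sensitivity, specificity, and accuracy, each with 95% confidence intervals, and report a qualitative failure analysis.
Test on a 350-image subset of the external LEP dataset, obtaining binary masks by argmax alone, with no additional post-processing or optimization.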
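
As a rough illustration of the evaluation step (argmax to a binary mask, then pixel-wise overlap metrics), a minimal NumPy sketch follows. It is not the authors' code. The function names `logits_to_mask` and `segmentation_metrics`, the channel order (0 = background, 1 = tumor), and the choice to report NaN for an empty denominator are all assumptions made for this example. The review also does not say whether metrics are averaged per image or pooled over all pixels; this sketch scores a single image.

```python
import numpy as np


def logits_to_mask(logits: np.ndarray) -> np.ndarray:
    """Turn per-class scores of shape (2, H, W) into a binary mask of shape (H, W).

    Assumption: channel 0 is background and channel 1 is foreground (tumor).
    """
    return np.argmax(logits, axis=0).astype(np.uint8)


def segmentation_metrics(pred: np.ndarray, target: np.ndarray) -> dict:
    """Compute pixel-wise metrics for one pair of binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)

    # Confusion-matrix counts over all pixels of the image
    tp = int(np.sum(pred & target))
    fp = int(np.sum(pred & ~target))
    fn = int(np.sum(~pred & target))
    tn = int(np.sum(~pred & ~target))

    def ratio(num: int, den: int) -> float:
        # Assumption: an undefined ratio (empty denominator) is reported as NaN
        return num / den if den > 0 else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }


# Toy 4x4 example: the prediction covers the 2x2 target plus one extra column
toy_logits = np.zeros((2, 4, 4))
toy_logits[1, 1:3, 1:4] = 1.0
toy_target = np.zeros((4, 4), dtype=np.uint8)
toy_target[1:3, 1:3] = 1
toy_scores = segmentation_metrics(logits_to_mask(toy_logits), toy_target)
print(toy_scores)  # dice is 0.8 here: 2*4 / (2*4 + 2 + 0)
```

On the toy input the prediction over-segments by two pixels, so sensitivity stays at 1.0 while Dice and IoU drop. A pattern like the one reported (very high specificity, moderate sensitivity) can arise when tumors occupy a small share of each image, because true negatives then dominate the pixel count.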

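Experimental Results

Research Questions

RQ1: Can a Vision Transformer-based segmentation model achieve robust delineation of pancreatic tumor boundaries on public EUS datasets?
RQ2: How well does the model generalize to an independent external EUS dataset?
RQ3: What are the common failure modes in segmentation performance on EUS images?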

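Key Findings

In 5-fold cross-validation, mean DSC was 0.651 (95% CI: 0.615–0.738).
IoU in cross-validation was 0.579 (95% CI: 0.557–0.658).
Cross-validation specificity: 98.8%; sensitivity: 69.8%; overall accuracy: 97.5%.
External test set (350 images): DSC 0.657 (95% CI: 0.634–0.769); IoU 0.614 (95% CI: 0.590–0.689).
External test set: sensitivity 71.8% (95% CI: 69.1–79.3); specificity 97.7% (95% CI: 95.1–99.2).
Erroneous multiple predictions appeared in 9.7% of cases, pointing to several failure modes.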
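
The review does not define "erroneous multiple predictions". One plausible reading, assumed here, is that a predicted mask contains more than one disconnected foreground region. Under that assumption, a minimal standard-library sketch for flagging such cases could look like the following. The helper names `count_components` and `multi_prediction_rate` and the use of 4-connectivity are illustrative choices, not details from the paper.

```python
from collections import deque


def count_components(mask) -> int:
    """Count 4-connected foreground regions in a 2D binary mask (rows of 0/1)."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Unvisited foreground pixel: start a new region and flood-fill it
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def multi_prediction_rate(masks) -> float:
    """Fraction of masks that contain more than one foreground region."""
    if not masks:
        return 0.0
    flagged = sum(1 for m in masks if count_components(m) > 1)
    return flagged / len(masks)


# Toy usage: one mask with a single region, one with two separate regions
single_region = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
two_regions = [[1, 0, 0], [0, 0, 0], [0, 0, 1]]
print(multi_prediction_rate([single_region, two_regions]))  # 0.5
```

For full 512x512 masks, a library routine such as `scipy.ndimage.label` does the same labeling much faster. A common follow-up, which the reviewed study explicitly did not apply, is to keep only the largest region as a post-processing step.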

This review was generated by AI and checked by a human editor.