QUICK REVIEW

[Paper Review] Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective

Shengjia Chen, Gabriele Campanella|arXiv (Cornell University)|Jul 10, 2024

AI in cancer detection | Cited by 6

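One-line summary

This paper benchmarks ten slide-level embedding aggregation methods across nine clinically relevant computational pathology tasks, using both domain-specific and ImageNet-pretrained embeddings, and evaluates their performance and generalizability. It highlights that domain-specific embeddings generally outperform ImageNet embeddings, that no single method dominates every task, and that the gains from incorporating spatial information are limited.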

ABSTRACT

Recent advances in artificial intelligence (AI), in particular self-supervised learning of foundation models (FMs), are revolutionizing medical imaging and computational pathology (CPath). A constant challenge in the analysis of digital Whole Slide Images (WSIs) is the problem of aggregating tens of thousands of tile-level image embeddings to a slide-level representation. Due to the prevalent use of datasets created for genomic research, such as TCGA, for method development, the performance of these techniques on diagnostic slides from clinical practice has been inadequately explored. This study conducts a thorough benchmarking analysis of ten slide-level aggregation techniques across nine clinically relevant tasks, including diagnostic assessment, biomarker classification, and outcome prediction. The results yield following key insights: (1) Embeddings derived from domain-specific (histological images) FMs outperform those from generic ImageNet-based models across aggregation methods. (2) Spatial-aware aggregators enhance the performance significantly when using ImageNet pre-trained models but not when using FMs. (3) No single model excels in all tasks and spatially-aware models do not show general superiority as it would be expected. These findings underscore the need for more adaptable and universally applicable aggregation techniques, guiding future research towards tools that better meet the evolving needs of clinical-AI in pathology. The code used in this work is available at https://github.com/fuchs-lab-public/CPath_SABenchmark.

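Motivation and objectives

Evaluate ten slide-level aggregation methods across nine clinically relevant tasks in computational pathology (CPath).
Assess how the embedding source (domain-specific vs. ImageNet) affects aggregation performance.
Provide practical guidance by comparing the methods and highlighting the situations in which spatial information helps.
Offer findings that can guide the future development of universally applicable aggregation techniques in pathology.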

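Proposed method

Treat each WSI as a multiple-instance learning (MIL) bag whose instances are tiles, with tile embeddings h_i = f(x_i).
Optionally take spatial information s_i into account for each tile.
Evaluate ten aggregation methods across nine tasks using embeddings from four FMs.
Use 20-fold Monte Carlo cross-validation for each task to assess robustness.
Train with AdamW on a fixed 40-epoch schedule on a single A100 GPU.
Compare performance by AUC, and run one-sided t-tests with AB-MIL as the baseline.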
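To make the MIL formulation above concrete, here is a minimal NumPy sketch of ungated attention pooling in the style of AB-MIL, the baseline aggregator named in this review. It is an illustration under stated assumptions, not the benchmark's code: the function name `abmil_pool`, the parameters `V` and `w`, the hidden size 64, and the embedding size 384 are all hypothetical, and the weights are random rather than learned.

```python
import numpy as np

def abmil_pool(H, V, w):
    """Ungated attention-based MIL pooling.

    H: (n_tiles, d) tile embeddings h_i = f(x_i)
    V: (d, k) hidden projection, w: (k,) attention vector
       (hypothetical parameters; in practice they are learned)
    Returns the slide-level vector z = sum_i a_i * h_i and the weights a.
    """
    scores = np.tanh(H @ V) @ w      # one attention score per tile
    scores = scores - scores.max()   # shift for a numerically stable softmax
    a = np.exp(scores)
    a = a / a.sum()                  # attention weights sum to 1 over the bag
    z = a @ H                        # weighted average of tile embeddings
    return z, a

# Synthetic bag: 1000 tiles with 384-dim embeddings (sizes are arbitrary).
rng = np.random.default_rng(0)
H = rng.normal(size=(1000, 384))
V = rng.normal(size=(384, 64)) * 0.05
w = rng.normal(size=64)

z, a = abmil_pool(H, V, w)
print(z.shape, float(a.sum()))
```

With `w` set to zeros every tile receives the same weight, so the pooled vector reduces to plain mean pooling. Spatially aware aggregators would additionally consume the tile coordinates s_i, which this sketch ignores.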

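Experimental results

Research questions

RQ1: Does the embedding source (domain-specific vs. ImageNet) affect aggregation performance across tasks?
RQ2: Which slide-level aggregation methods perform best on diagnostic, biomarker, and outcome prediction tasks?
RQ3: Is there a single superior aggregation method, or do task-dependent differences dominate?
RQ4: How does incorporating spatial information affect performance for different embeddings?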

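Key findings

Domain-specific embeddings (CTransPath, dinosmall, UNI) outperform ImageNet-based embeddings on most tasks.
Spatially aware aggregation improves performance with ImageNet-pretrained models but does not consistently improve it with foundation models.
No single aggregation method dominates all tasks; performance varies with the task and the embedding.
AB-MIL remains a strong baseline, and several methods outperform it only on specific tasks or embeddings.
Domain-specific embeddings show smaller variance across the box plots, suggesting more stable performance.
Performance varies when public datasets are compared across sources, which highlights the challenge of generalization.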
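The fold-to-fold variance mentioned above comes from repeating random train/validation splits and scoring each one by AUC. The following is a rough sketch of that evaluation loop, assuming binary labels. The helper names `monte_carlo_splits` and `auc`, the 20% validation fraction, and the synthetic scores are assumptions for illustration only; they are not taken from the paper and the printed numbers are not results.

```python
import numpy as np

def monte_carlo_splits(n_slides, n_splits=20, val_frac=0.2, seed=0):
    """Yield (train_idx, val_idx) for repeated random splits.

    val_frac is a hypothetical choice; the review only states 20 splits.
    """
    rng = np.random.default_rng(seed)
    n_val = max(1, int(round(val_frac * n_slides)))
    for _ in range(n_splits):
        perm = rng.permutation(n_slides)
        yield perm[n_val:], perm[:n_val]

def auc(labels, scores):
    """Binary AUC via the rank-sum identity, using average ranks for ties."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs both classes in the fold")
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores))
    i = 0
    while i < len(scores):
        j = i
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2 + 1  # average 1-based rank of the tie group
        i = j + 1
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Synthetic stand-in for slide-level predictions (NOT data from the paper).
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=200)
scores = labels + rng.normal(scale=1.0, size=200)

# A real pipeline would train an aggregator on train_idx; here we only score val_idx.
fold_aucs = [auc(labels[val], scores[val]) for _, val in monte_carlo_splits(200)]
print(len(fold_aucs), float(np.mean(fold_aucs)), float(np.std(fold_aucs)))
```

The list of per-split AUCs for each aggregator is what a box plot or a one-sided t-test against the AB-MIL baseline would then be computed from.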


This review was generated by AI and checked by a human editor.