QUICK REVIEW

[Paper Review] The Rise of AI Language Pathologists: Exploring Two-level Prompt Learning for Few-shot Weakly-supervised Whole Slide Image Classification

Linhao Qu, Xiaoyuan Luo|arXiv (Cornell University)|May 29, 2023

Multimodal Machine Learning Applications|Cited by 13

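One-line summary

Introduces FSWC and TOP: a two-level prompt learning MIL framework built on CLIP and GPT-4 that achieves few-shot bag-level and instance-level WSI classification under weak supervision.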

ABSTRACT

This paper introduces the novel concept of few-shot weakly supervised learning for pathology Whole Slide Image (WSI) classification, denoted as FSWC. A solution is proposed based on prompt learning and the utilization of a large language model, GPT-4. Since a WSI is too large and needs to be divided into patches for processing, WSI classification is commonly approached as a Multiple Instance Learning (MIL) problem. In this context, each WSI is considered a bag, and the obtained patches are treated as instances. The objective of FSWC is to classify both bags and instances with only a limited number of labeled bags. Unlike conventional few-shot learning problems, FSWC poses additional challenges due to its weak bag labels within the MIL framework. Drawing inspiration from the recent achievements of vision-language models (V-L models) in downstream few-shot classification tasks, we propose a two-level prompt learning MIL framework tailored for pathology, incorporating language prior knowledge. Specifically, we leverage CLIP to extract instance features for each patch, and introduce a prompt-guided pooling strategy to aggregate these instance features into a bag feature. Subsequently, we employ a small number of labeled bags to facilitate few-shot prompt learning based on the bag features. Our approach incorporates the utilization of GPT-4 in a question-and-answer mode to obtain language prior knowledge at both the instance and bag levels, which are then integrated into the instance and bag level language prompts. Additionally, a learnable component of the language prompts is trained using the available few-shot labeled data. We conduct extensive experiments on three real WSI datasets encompassing breast cancer, lung cancer, and cervical cancer, demonstrating the notable performance of the proposed method in bag and instance classification. All codes will be available.

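Motivation and objectives

Motivate and formalize Few-shot Weakly Supervised WSI Classification (FSWC) within MIL, where only a limited number of bag labels are available.
Propose TOP, a Two-level Prompt Learning MIL framework that uses GPT-4-derived language prior knowledge to guide learning at both the instance and bag levels.
Use CLIP for instance feature extraction and a prompt-guided pooling mechanism for the bag representation.
Enable few-shot prompt learning at both the instance and bag levels while preserving the parameters of the vision-language (V-L) model.
Demonstrate state-of-the-art performance on WSIs from multiple cancer types with limited data.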

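Proposed method

Use the CLIP image encoder to extract features for the patches in each WSI bag.
Introduce instance prompt-guided pooling, which uses GPT-4-generated instance prototypes to aggregate instance features into a bag feature.
Use GPT-4 to create instance-level and bag-level prompts that describe visual pathology prior knowledge, and add a learnable prompt component (CoOp-style) for adaptation.
Construct a bag-level prompt group to guide bag-level few-shot prompt learning and match it against the bag features.
Optimize the learnable prompt vectors with a cross-entropy loss on bag labels, plus an auxiliary loss that encourages diversity among the instance prototypes.
At inference, classify bags by matching the bag feature against the bag prompts, and classify instances by their average similarity to the instance prototypes.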
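
The pooling and inference steps listed above can be illustrated with a minimal sketch. This review does not give the paper's exact equations, so several details here are assumptions rather than the authors' method: the softmax weighting with a temperature of 0.1, the use of a single group of prototypes, and the mean pairwise cosine similarity used as a diversity penalty. The 2-D vectors are hand-made stand-ins for CLIP image and text embeddings, and all function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores, temperature=0.1):
    """Numerically stable softmax; the temperature value is an assumption."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def instance_scores(instances, prototypes):
    """Average similarity of each instance feature to the instance prototypes."""
    return [sum(cosine(x, p) for p in prototypes) / len(prototypes)
            for x in instances]

def prompt_guided_pooling(instances, prototypes):
    """Weighted sum of instance features; weights come from prototype similarity.

    Assumed weighting rule: softmax over the averaged similarities.
    """
    weights = softmax(instance_scores(instances, prototypes))
    dim = len(instances[0])
    bag = [sum(w * x[d] for w, x in zip(weights, instances)) for d in range(dim)]
    return bag, weights

def classify_bag(bag_feature, bag_prompts):
    """Match the bag feature against one prompt embedding per class."""
    probs = softmax([cosine(bag_feature, p) for p in bag_prompts])
    return probs.index(max(probs)), probs

def prototype_diversity_penalty(prototypes):
    """Mean pairwise cosine similarity (assumed form of the auxiliary loss)."""
    pairs = [(i, j) for i in range(len(prototypes))
             for j in range(i + 1, len(prototypes))]
    return sum(cosine(prototypes[i], prototypes[j]) for i, j in pairs) / len(pairs)

# Toy bag: two patches near the first axis, one patch near the second axis.
instances = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
# Stand-ins for text embeddings of GPT-4-written instance descriptions.
prototypes = [[0.0, 1.0], [0.1, 0.9]]
# Stand-ins for bag-level prompt embeddings: class 0, then class 1.
bag_prompts = [[1.0, 0.0], [0.0, 1.0]]

bag_feature, weights = prompt_guided_pooling(instances, prototypes)
predicted_class, probs = classify_bag(bag_feature, bag_prompts)
print(predicted_class, probs)
```

In this toy bag the third patch is the one most similar to the prototypes, so it dominates the pooled bag feature. That is the intended effect of letting language-derived prototypes, rather than a learned attention module, decide which patches matter.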

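Experimental results

Research questions

RQ1: Can FSWC be solved effectively under MIL with only a handful of labeled bags per class?
RQ2: Does a two-level prompt learning strategy that leverages language priors improve both bag-level and instance-level WSI classification under few-shot supervision?
RQ3: How do the GPT-4-derived instance and bag priors affect the pooling and prompt learning processes?
RQ4: How does the learnable prompt component (CoOp-style) affect transfer and FSWC performance?
RQ5: Are there robust gains on breast, lung, and cervical cancer WSIs under limited data?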

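Key findings

TOP achieves state-of-the-art bag and instance classification performance in few-shot settings on the Camelyon 16, TCGA-Lung, and Cervical Cancer datasets.
Instance prompt-guided pooling consistently outperforms attention pooling across the experiments.
The bag-level prompt group improves bag classification performance over CoOp-style prompt learning.
Ablation analysis indicates that prompt-guided pooling and the bag-level prompts drive the performance gains, while the auxiliary loss supports stability.
TOP clearly outperforms the baselines in the 1-shot, 2-shot, 4-shot, 8-shot, and 16-shot settings.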

This review was generated by AI and checked by a human editor.