QUICK REVIEW

[Paper Review] The Rise of AI Language Pathologists: Exploring Two-level Prompt Learning for Few-shot Weakly-supervised Whole Slide Image Classification

Linhao Qu, Xiaoyuan Luo|arXiv (Cornell University)|May 29, 2023

Multimodal Machine Learning Applications | Cited by 13

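One-sentence summary

Introduces the FSWC task and TOP, a two-level prompt learning MIL framework built on CLIP and GPT-4 that performs few-shot bag-level and instance-level classification of WSIs under weak supervision.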

ABSTRACT

This paper introduces the novel concept of few-shot weakly supervised learning for pathology Whole Slide Image (WSI) classification, denoted as FSWC. A solution is proposed based on prompt learning and the utilization of a large language model, GPT-4. Since a WSI is too large and needs to be divided into patches for processing, WSI classification is commonly approached as a Multiple Instance Learning (MIL) problem. In this context, each WSI is considered a bag, and the obtained patches are treated as instances. The objective of FSWC is to classify both bags and instances with only a limited number of labeled bags. Unlike conventional few-shot learning problems, FSWC poses additional challenges due to its weak bag labels within the MIL framework. Drawing inspiration from the recent achievements of vision-language models (V-L models) in downstream few-shot classification tasks, we propose a two-level prompt learning MIL framework tailored for pathology, incorporating language prior knowledge. Specifically, we leverage CLIP to extract instance features for each patch, and introduce a prompt-guided pooling strategy to aggregate these instance features into a bag feature. Subsequently, we employ a small number of labeled bags to facilitate few-shot prompt learning based on the bag features. Our approach incorporates the utilization of GPT-4 in a question-and-answer mode to obtain language prior knowledge at both the instance and bag levels, which are then integrated into the instance and bag level language prompts. Additionally, a learnable component of the language prompts is trained using the available few-shot labeled data. We conduct extensive experiments on three real WSI datasets encompassing breast cancer, lung cancer, and cervical cancer, demonstrating the notable performance of the proposed method in bag and instance classification. All codes will be available.

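Motivation and objectives

Motivate and formalize Few-shot Weakly Supervised WSI Classification (FSWC): classification under MIL when only a limited number of bag labels are available.
Propose a Two-level Prompt Learning MIL framework (TOP) that uses GPT-4-derived language priors to guide both instance-level and bag-level learning.
Use CLIP for instance feature extraction and a prompt-guided pooling mechanism to obtain bag representations.
Enable few-shot prompt learning at both the instance and bag levels while keeping the vision-language model's parameters frozen.
Demonstrate state-of-the-art performance on multiple cancer WSI datasets with limited labeled data.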

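Proposed method

Extract patch features within each WSI bag using the CLIP image encoder.
Introduce instance prompt-guided pooling, which uses GPT-4-generated instance prototypes to aggregate instance features into a bag feature.
Use GPT-4 to create instance-level and bag-level prompts that describe visual pathology priors, and add a learnable prompt component (similar to CoOp) for adaptation.
Build a bag-level prompt group that guides bag-level few-shot prompt learning and is matched against the bag features.
Optimize the learnable prompt vectors with a cross-entropy loss on the bag labels, plus an auxiliary loss that enforces diversity among instance prototypes.
At inference, classify bags by matching the bag feature against the bag prompts, and classify instances by averaging their similarities to the instance prototypes.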
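
The pooling and inference steps above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the temperature value, the softmax over instances, and the grouping of prototypes by class are all assumptions made for illustration, and random vectors stand in for CLIP image and text embeddings.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to (approximately) unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prompt_guided_pooling(instance_feats, prototype_feats, temperature=0.07):
    """Aggregate instance features (N, D) into one bag feature (D,).

    Assumed scheme: each prototype (P, D) attends over the instances via a
    softmax on cosine similarity, and the P pooled vectors are averaged.
    """
    z = l2_normalize(instance_feats)
    p = l2_normalize(prototype_feats)
    sim = z @ p.T                                  # (N, P) cosine similarities
    weights = softmax(sim / temperature, axis=0)   # normalize over instances
    pooled = weights.T @ z                         # (P, D), one vector per prototype
    return l2_normalize(pooled.mean(axis=0)), sim

def classify_bag(bag_feat, bag_prompt_feats, temperature=0.07):
    """Match the bag feature against one bag prompt embedding per class."""
    logits = l2_normalize(bag_prompt_feats) @ bag_feat / temperature
    return softmax(logits)

def classify_instances(sim, prototype_class):
    """Average each instance's similarity over the prototypes of each class
    (hypothetical grouping) and pick the class with the highest mean."""
    prototype_class = np.asarray(prototype_class)
    classes = np.unique(prototype_class)
    scores = np.stack(
        [sim[:, prototype_class == c].mean(axis=1) for c in classes], axis=1
    )
    return classes[scores.argmax(axis=1)]

# Toy usage: random stand-ins for CLIP features of 6 patches, 4 instance
# prototypes (2 per class), and 2 bag-level prompts.
rng = np.random.default_rng(0)
patches = rng.normal(size=(6, 8))
prototypes = rng.normal(size=(4, 8))
bag_prompts = rng.normal(size=(2, 8))

bag_feature, similarities = prompt_guided_pooling(patches, prototypes)
bag_probs = classify_bag(bag_feature, bag_prompts)
instance_preds = classify_instances(similarities, [0, 0, 1, 1])
```

In this sketch nothing is trained: the prototype and prompt embeddings are fixed inputs. In the framework described above, part of each prompt is learnable and is optimized with the bag-label cross-entropy loss while the CLIP encoders stay frozen.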

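Experimental results

Research questions

RQ1: Can FSWC be solved effectively under MIL when only a few labeled bags per class are available?
RQ2: Does a two-level prompt learning strategy with language priors improve both bag-level and instance-level WSI classification under few-shot supervision?
RQ3: How do GPT-4-derived instance and bag priors affect the pooling and prompt learning processes?
RQ4: How does the learnable prompt component (CoOp-style) affect transfer and performance on FSWC?
RQ5: Are the gains robust across breast, lung, and cervical cancer WSIs with limited data?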

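Key findings

TOP achieves state-of-the-art bag-level and instance-level classification in few-shot settings on the Camelyon 16, TCGA-Lung, and Cervical Cancer datasets.
Instance prompt-guided pooling consistently outperforms attention pooling across all experiments.
The bag-level prompt group outperforms CoOp-style prompt learning on bag classification.
Ablation studies show that prompt-guided pooling and bag-level prompts are essential to the performance gains, and that the auxiliary loss helps stability.
TOP improves significantly over baselines in the 1-, 2-, 4-, 8-, and 16-shot settings.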


This review was generated by AI and reviewed by a human editor.