QUICK REVIEW

[Paper Review] Self-Supervision Closes the Gap Between Weak and Strong Supervision in Histology

Olivier Dehaene, Axel Camara|arXiv (Cornell University)|Dec 7, 2020

Advances in Oncology and Radiotherapy | References: 33 | Cited by: 49

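One-Sentence Summary

The paper trains an in-domain self-supervised feature extractor (MoCo v2) on histology slides to replace ImageNet features, which substantially improves weakly-supervised histopathology performance and narrows the gap to strong supervision on Camelyon16.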

ABSTRACT

One of the biggest challenges for applying machine learning to histopathology is weak supervision: whole-slide images have billions of pixels yet often only one global label. The state of the art therefore relies on strongly-supervised model training using additional local annotations from domain experts. However, in the absence of detailed annotations, most weakly-supervised approaches depend on a frozen feature extractor pre-trained on ImageNet. We identify this as a key weakness and propose to train an in-domain feature extractor on histology images using MoCo v2, a recent self-supervised learning algorithm. Experimental results on Camelyon16 and TCGA show that the proposed extractor greatly outperforms its ImageNet counterpart. In particular, our results improve the weakly-supervised state of the art on Camelyon16 from 91.4% to 98.7% AUC, thereby closing the gap with strongly-supervised models that reach 99.3% AUC. Through these experiments, we demonstrate that feature extractors trained via self-supervised learning can act as drop-in replacements to significantly improve existing machine learning techniques in histology. Lastly, we show that the learned embedding space exhibits biologically meaningful separation of tissue structures.

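Motivation and Objectives

Motivate weak supervision in histopathology, where often only slide-level labels are available; identify ImageNet pre-training as a key weakness; propose in-domain self-supervised pre-training with MoCo v2; demonstrate performance gains on Camelyon16 and TCGA-COAD; show that the learned embeddings are biologically meaningful and transferable.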

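Proposed Method

Tile each whole-slide image into fixed-size patches at a chosen zoom level; extract patch features with a frozen encoder; apply multiple-instance learning (MIL) to aggregate patch information into a slide-level prediction; pre-train the encoder in-domain on unlabeled histology slides with MoCo v2 and its contrastive loss; extend the MoCo v2 augmentations with rotations and flips suited to histology data; evaluate three MIL architectures (Weldon, Chowder, DeepMIL) on both datasets; compare against ImageNet pre-training and report the AUC gains.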
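
The steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`tile_origins`, `dihedral_augment`, `info_nce`, `topk_mil_score`), the restriction of rotations to multiples of 90 degrees, the temperature value, and the top-k pooling rule are all choices made for this example. In particular, `topk_mil_score` is a generic MIL pooling stand-in and does not reproduce Weldon, Chowder, or DeepMIL.

```python
import math
import random


def tile_origins(width, height, tile):
    """Top-left corners of non-overlapping tile x tile patches that fit fully inside the slide."""
    return [(x, y)
            for y in range(0, height - tile + 1, tile)
            for x in range(0, width - tile + 1, tile)]


def dihedral_augment(patch, rng):
    """Rotate by a random multiple of 90 degrees, then flip horizontally half the time.

    `patch` is a list of rows. Limiting rotations to multiples of 90 degrees is an
    assumption of this sketch; the summary only says "rotations and flips".
    """
    for _ in range(rng.randrange(4)):
        patch = [list(row) for row in zip(*patch[::-1])]  # one 90-degree clockwise turn
    if rng.random() < 0.5:
        patch = [row[::-1] for row in patch]
    return patch


def _unit(vector):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce(query, positive_key, negative_keys, temperature=0.2):
    """Contrastive (InfoNCE) loss for a single query embedding.

    The loss is the negative log-softmax of the positive pair's similarity among
    the positive and all negatives. The temperature is an assumed value.
    """
    q = _unit(query)
    keys = [_unit(positive_key)] + [_unit(k) for k in negative_keys]
    logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[0]


def topk_mil_score(tile_scores, k):
    """Generic MIL pooling: the slide score is the mean of the k highest tile scores."""
    top = sorted(tile_scores, reverse=True)[:k]
    return sum(top) / len(top)


# Usage with toy values (all numbers here are made up for illustration).
rng = random.Random(0)
origins = tile_origins(width=1024, height=768, tile=256)
augmented = dihedral_augment([[1, 2], [3, 4]], rng)
loss = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
slide_score = topk_mil_score([0.1, 0.9, 0.8, 0.2], k=2)
```

In a real pipeline the encoder would be a deep network trained over a large queue of negatives, and the frozen encoder's tile features, rather than hand-written scores, would feed the MIL model.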

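Experimental Results

Research Questions

RQ1: Does in-domain self-supervised pre-training with MoCo v2 improve weakly-supervised histopathology models relative to ImageNet pre-trained features?
RQ2: Do the improvements generalize across MIL architectures and datasets (Camelyon16 and TCGA-COAD)?
RQ3: How closely can weakly-supervised performance with in-domain self-supervised features approach strongly-supervised baselines?
RQ4: Do the learned embeddings form biologically meaningful clusters and support transfer learning across organs and tumor types?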

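Key Findings

In-domain MoCo v2 features substantially improve weakly-supervised histopathology results across the MIL models tested.
On Camelyon16, weakly-supervised performance reaches 98.7% AUC, close to the 99.3% AUC of strongly-supervised models.
Compared with ImageNet features, the standard deviation of the results drops markedly with MoCo v2 features, indicating more robust performance.
On TCGA-COAD CMS classification, MoCo v2 features yield a large AUC gain over ImageNet features and are on par with state-of-the-art methods that rely on annotations and ensembling.
Transferring MoCo v2 features from TCGA-COAD to Camelyon16 (and vice versa) gives strong cross-dataset performance, highlighting the transferability of the learned representations.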
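
The findings above are stated as AUC values and as a standard deviation over repeated runs. As a reminder of what those two quantities measure, here is a small sketch using only the Python standard library. The helper names `roc_auc` and `summarize_runs` are hypothetical, and every number in the example is made up for illustration; none of it is data from the paper.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a random positive slide outscores a random negative one.

    Ties count as half a win. `labels` holds 0 or 1, `scores` holds model outputs.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def summarize_runs(aucs):
    """Mean and sample standard deviation of AUCs from repeated training runs."""
    return mean(aucs), stdev(aucs)


# Toy slide-level labels and scores (illustrative only).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)

# Toy AUCs from two hypothetical runs of the same model.
run_mean, run_std = summarize_runs([0.5, 0.7])
```

A lower standard deviation across runs means the slide-level results depend less on the random seed or data split, which is the sense in which the summary calls the MoCo v2 features more robust.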

This review was generated by AI and checked by a human editor.