QUICK REVIEW

[Paper Review] Bootstrap Representation Learning for Segmentation on Medical Volumes and Sequences

Zejian Chen, Wei Zhuo|arXiv (Cornell University)|Jan 1, 2021

AI in cancer detection | References: 55 | Cited by: 3

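One-Sentence Summary

This paper proposes a bootstrap self-supervised representation learning method for segmenting medical volumes and sequences with limited annotated data. By exploiting the spatial continuity between slices and using a global-to-local prediction framework with an attention-guided predictor and prototype-based foreground-background calibration, the method achieves state-of-the-art performance with only a few annotated volumes, outperforming previous methods by 4.5% DSC on ACDC, 1.7% on Prostate, and 2.3% on CAMUS.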

ABSTRACT

In this work, we propose a novel, straightforward method for medical volume and sequence segmentation with limited annotations. To avoid laborious annotation, the recent success of self-supervised learning (SSL) motivates pre-training on unlabeled data. Despite this success, it is still challenging to adapt typical SSL methods to volume/sequence segmentation, because they rarely mine local semantic discrimination or exploit volume and sequence structures. Based on the continuity between slices/frames and the common spatial layout of organs across volumes/sequences, we introduce a novel bootstrap self-supervised representation learning method that leverages the predictability of neighboring slices. At the core of our method is a simple and straightforward dense self-supervision on the predictions of local representations and a strategy of predicting local features from global context, which enables stable and reliable supervision for both global and local representation mining among volumes. Specifically, we first propose an asymmetric network with an attention-guided predictor to enforce distance-specific prediction and supervision on slices within and across volumes/sequences. Second, we introduce a novel prototype-based foreground-background calibration module to enhance representation consistency. The two parts are trained jointly on labeled and unlabeled data. When evaluated on three benchmark datasets of medical volumes and sequences, our model outperforms existing methods by a large margin of 4.5% DSC on ACDC, 1.7% on Prostate, and 2.3% on CAMUS. Extensive evaluations reveal the effectiveness and superiority of our method.

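Research Motivation and Objectives

Address the challenge of limited annotated data in deep-learning-based segmentation of medical volumes and sequences.
Improve representation learning for 3D and temporal medical data by exploiting their inherent spatial and temporal continuity.
Develop a joint training framework that, unlike two-stage training approaches, avoids forgetting what was learned from unlabeled data.
Strengthen local and global representation learning through a dense self-supervision mechanism based on predictable transitions between slices.
Stabilize contrastive learning on unlabeled data by introducing a semantic-aware, prototype-based foreground-background calibration.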

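Proposed Method

Propose an asymmetric encoder-predictor network with a distance-specific attention mechanism that uses the global context of the current slice to predict the dense features of other slices.
Apply a dense similarity loss that enforces prediction consistency within and across volumes, enabling joint mining of global and local representations.
Design a distance-specific attention mechanism in the predictor that adaptively weights predictions according to slice distance (within and across volumes).
Propose a prototype-based foreground-background calibration module that aligns the decoder features of unlabeled data with prototypes from labeled data to improve feature consistency.
Use the foreground masks of labeled data and the high-activation regions in the probability maps of unlabeled data as semantic references for contrastive learning.
Train the whole model end to end on labeled and unlabeled data jointly, optimizing the prediction and calibration objectives at the same time.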
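
The prediction and calibration steps listed above can be illustrated with a small, dependency-free sketch. This is not the authors' implementation. It assumes that the dense similarity loss is the mean of (1 - cosine similarity) over spatial positions, that prototypes come from masked average pooling over labeled features, and that unlabeled features are assigned to the nearest prototype. The function names (`cosine`, `dense_similarity_loss`, `masked_prototype`, `assign_to_prototypes`) are hypothetical, and feature maps are flattened to lists of per-position vectors to keep the example short.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def dense_similarity_loss(predicted, target):
    """Mean (1 - cosine) over spatial positions (assumed loss form).

    `predicted` and `target` are feature maps flattened to lists of
    per-position feature vectors of the same length.
    """
    return sum(1.0 - cosine(p, t) for p, t in zip(predicted, target)) / len(predicted)


def masked_prototype(features, mask):
    """Masked average pooling: mean feature vector where mask == 1."""
    selected = [f for f, m in zip(features, mask) if m == 1]
    if not selected:
        raise ValueError("mask selects no positions")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]


def assign_to_prototypes(features, fg_proto, bg_proto):
    """Label each position 1 (foreground) or 0 (background) by nearest prototype."""
    return [1 if cosine(f, fg_proto) >= cosine(f, bg_proto) else 0 for f in features]


# Toy labeled slice: four positions with 2-D features and a foreground mask.
labeled_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labeled_mask = [1, 1, 0, 0]
fg = masked_prototype(labeled_feats, labeled_mask)
bg = masked_prototype(labeled_feats, [1 - m for m in labeled_mask])

# Toy unlabeled slice: assign each position to the closer prototype.
unlabeled_feats = [[0.8, 0.2], [0.2, 0.7]]
print(assign_to_prototypes(unlabeled_feats, fg, bg))
```

In a real training setup these operations would run on tensors inside a deep learning framework, and the target branch of the similarity loss would typically not receive gradients; both details are omitted here.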

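Experimental Results

Research Questions

RQ1: Can the predictable spatial continuity between adjacent slices in medical volumes be used effectively for self-supervised representation learning?
RQ2: How does the global-to-local prediction framework improve local and global feature learning in 3D and sequential medical data?
RQ3: Does prototype-based foreground-background calibration improve feature consistency and generalization in semi-supervised segmentation?
RQ4: Does the proposed method outperform contrastive self-supervised learning methods that rely on random cropping or patch-level augmentation on volume and sequence data?
RQ5: Does the method generalize well across backbone architectures and across datasets with different annotation budgets?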

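Key Findings

With only 2 annotated volumes, the method outperforms the previous state-of-the-art method (CGL) by 4.5% in Dice similarity coefficient (DSC) on the ACDC dataset.
On the Prostate dataset, with only 2 annotated volumes, the method improves average DSC by 1.7% over CGL, indicating better performance on complex anatomical structures.
On the CAMUS echocardiography dataset, with only 8 annotated sequences, the method outperforms CGL by 2.3% DSC while using 89.3% less annotated data than the fully supervised baseline.
The method generalizes well across encoder backbones, outperforming both random initialization and CGL with the VGG13 and ResNet18 architectures.
On the Prostate dataset, the DSC gap between the method with 8 annotated volumes and the fully supervised baseline (20 volumes) is only 0.7%, indicating high data efficiency.
Ablation studies confirm that both the attention-guided predictor and the foreground-background calibration module are needed for the best performance, with each component contributing a significant gain.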
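
All of the findings above are reported in DSC. For two binary masks A and B, the Dice similarity coefficient is 2|A ∩ B| / (|A| + |B|). The short sketch below computes it for flat 0/1 masks; the function name `dice_coefficient` and the handling of two empty masks are illustrative assumptions, not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient for two flat binary masks (sequences of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    # Count positions that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    total = sum(pred) + sum(truth)
    # Convention (an assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
print(round(dice_coefficient(pred, truth), 4))  # 2 * 2 / (3 + 3) -> 0.6667
```

A gain of 4.5% DSC therefore corresponds to an absolute increase of 0.045 in this overlap score when DSC is expressed on a 0 to 1 scale.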

This review was generated by AI and reviewed by human editors.