QUICK REVIEW

[Paper Review] Embracing Imperfect Datasets: A Review of Deep Learning Solutions for Medical Image Segmentation

Nima Tajbakhsh, Laura Jeyaseelan|arXiv (Cornell University)|Aug 27, 2019

Advanced Neural Network Applications | References: 171 | Cited by: 331

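One-Sentence Summary

A comprehensive survey that systematically evaluates methods for handling scarce and weak annotations in medical image segmentation, covering data augmentation, the use of external data, domain adaptation, and robust learning strategies.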

ABSTRACT

The medical imaging literature has witnessed remarkable progress in high-performing segmentation models based on convolutional neural networks. Despite the new performance highs, the recent advanced segmentation models still require large, representative, and high quality annotated datasets. However, rarely do we have a perfect training dataset, particularly in the field of medical imaging, where data and annotations are both expensive to acquire. Recently, a large body of research has studied the problem of medical image segmentation with imperfect datasets, tackling two major dataset limitations: scarce annotations where only limited annotated data is available for training, and weak annotations where the training data has only sparse annotations, noisy annotations, or image-level annotations. In this article, we provide a detailed review of the solutions above, summarizing both the technical novelties and empirical results. We further compare the benefits and requirements of the surveyed methodologies and provide our recommended solutions. We hope this survey article increases the community awareness of the techniques that are available to handle imperfect medical image segmentation datasets.

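Motivation and Objectives

Motivate the study by stressing the need for effective segmentation when medical datasets are imperfect.
Divide dataset limitations into scarce annotations and weak annotations, and summarize the corresponding solutions.
Survey data-centric and learning-centric techniques, and report empirical results to guide practice.
Offer a cost-benefit perspective for recommending practical solutions for imperfect datasets.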

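Proposed Method

Classify and review medical image segmentation methods along two axes: scarce annotations and weak annotations.
Summarize data augmentation, the use of external labeled data, and CRF-based refinement for the scarce-annotation setting.
Summarize transfer learning, domain adaptation, and dataset fusion as ways to leverage external data.
Discuss learning strategies for weak annotations, including class activation maps, multiple-instance learning, and robust loss formulations.
Compare the methods in terms of performance, implementation difficulty, and data requirements, and give recommendations.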
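
To make the class activation map (CAM) idea concrete, here is a minimal NumPy sketch of the standard CAM computation: the final convolutional feature maps are combined using the classifier weights of one class, and the result is thresholded into a coarse seed mask. It assumes a network whose classifier is a single linear layer applied after global average pooling. The function names, the array layout, and the 0.5 threshold are illustrative assumptions, not details taken from the survey.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Weighted sum of feature maps for one class, rescaled to [0, 1].

    features:   (K, H, W) activations of the last conv layer (assumed layout).
    fc_weights: (num_classes, K) weights of the linear layer that follows
                global average pooling (assumed architecture).
    """
    # Contract the K channel axis: result has shape (H, W).
    cam = np.tensordot(fc_weights[class_idx], features, axes=1)
    cam = cam - cam.min()
    peak = cam.max()
    # A constant map stays all zeros instead of dividing by zero.
    return cam / peak if peak > 0 else cam

def cam_to_seed_mask(cam, threshold=0.5):
    """Turn a normalized CAM into a boolean seed mask (threshold is a free choice)."""
    return cam >= threshold

# Toy example: two channels, each firing at a different pixel.
features = np.zeros((2, 2, 2))
features[0, 0, 0] = 1.0
features[1, 1, 1] = 1.0
fc_weights = np.eye(2)  # class 0 reads channel 0, class 1 reads channel 1
seed = cam_to_seed_mask(class_activation_map(features, fc_weights, class_idx=0))
```

In a weak-supervision pipeline, a seed mask like this would typically serve only as a starting point that is refined further, since raw CAMs tend to highlight the most discriminative part of an object rather than its full extent.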

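Experimental Results

Research Questions

RQ1: What are the main techniques for handling scarce annotations in medical image segmentation?
RQ2: How can weak annotations (sparse, noisy, or image-level labels) be used effectively for segmentation?
RQ3: With imperfect datasets, what trade-offs exist among data-centric augmentation, external data use, and regularization techniques?
RQ4: How do domain adaptation and multi-domain learning improve generalization in clinical imaging scenarios?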

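Key Findings

Data augmentation, including traditional, mixup, and synthetic strategies, helps mitigate overfitting when annotations are limited.
External labeled data, leveraged through transfer learning, domain adaptation, and dataset fusion, can improve segmentation when target data is scarce.
CRF-based post-processing and regularization methods improve segmentation quality without requiring additional annotations.
Domain adaptation with GANs, CycleGAN, and MUNIT addresses distribution shift across scanners, modalities, and patient populations.
Weakly supervised techniques such as class activation maps and selective losses make it possible to learn from image-level or sparse annotations.
The survey provides a cost-benefit framework that helps choose a method based on data availability and resources.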
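
As a rough illustration of the mixup augmentation named in the findings, the sketch below blends two image/mask pairs with a weight drawn from a Beta distribution, which is how mixup is commonly formulated. Applying the same weight to one-hot segmentation masks yields soft per-pixel labels. The function name `mixup_pair`, the `(H, W)` and `(H, W, C)` array layouts, and `alpha=0.4` are assumptions made for this example, not settings reported by the survey.

```python
import numpy as np

def mixup_pair(img_a, mask_a, img_b, mask_b, alpha=0.4, rng=None):
    """Blend two (image, one-hot mask) pairs with lam ~ Beta(alpha, alpha).

    Images are (H, W) arrays and masks are (H, W, C) one-hot arrays
    (assumed layout). Returns the mixed image, the soft mask, and lam.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    img = lam * img_a + (1.0 - lam) * img_b
    # The same convex combination turns hard labels into soft labels.
    mask = lam * mask_a + (1.0 - lam) * mask_b
    return img, mask, lam

# Toy example: an all-background sample mixed with an all-foreground sample.
rng = np.random.default_rng(0)
img_a, img_b = np.zeros((4, 4)), np.ones((4, 4))
mask_a = np.eye(2)[np.zeros((4, 4), dtype=int)]  # one-hot, class 0 everywhere
mask_b = np.eye(2)[np.ones((4, 4), dtype=int)]   # one-hot, class 1 everywhere
img, mask, lam = mixup_pair(img_a, mask_a, img_b, mask_b, rng=rng)
```

Because the mixed mask is no longer one-hot, the training loss has to accept soft targets, for example a cross-entropy computed against per-pixel class probabilities.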

This review was generated by AI and checked by human editors.