QUICK REVIEW

[Paper Review] TransResU-Net: Transformer based ResU-Net for Real-Time Colonoscopy Polyp Segmentation

Nikhil Kumar Tomar, Annie Shergill|arXiv (Cornell University)|Jun 17, 2022
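Colorectal Cancer Screening and Detection | Cited by 26
One-Sentence Summary

TransResU-Net combines a ResNet50-based encoder, transformer self-attention, and dilated convolutions for real-time polyp segmentation, and it outperforms several baselines on public datasets.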

ABSTRACT

Colorectal cancer (CRC) is one of the most common causes of cancer and cancer-related mortality worldwide. Performing colon cancer screening in a timely fashion is the key to early detection. Colonoscopy is the primary modality used to diagnose colon cancer. However, the miss rate of polyps, adenomas and advanced adenomas remains significantly high. Early detection of polyps at the precancerous stage can help reduce the mortality rate and the economic burden associated with colorectal cancer. Deep learning-based computer-aided diagnosis (CADx) system may help gastroenterologists to identify polyps that may otherwise be missed, thereby improving the polyp detection rate. Additionally, CADx system could prove to be a cost-effective system that improves long-term colorectal cancer prevention. In this study, we proposed a deep learning-based architecture for automatic polyp segmentation, called Transformer ResU-Net (TransResU-Net). Our proposed architecture is built upon residual blocks with ResNet-50 as the backbone and takes the advantage of transformer self-attention mechanism as well as dilated convolution(s). Our experimental results on two publicly available polyp segmentation benchmark datasets showed that TransResU-Net obtained a highly promising dice score and a real-time speed. With high efficacy in our performance metrics, we concluded that TransResU-Net could be a strong benchmark for building a real-time polyp detection system for the early diagnosis, treatment, and prevention of colorectal cancer. The source code of the proposed TransResU-Net is publicly available at https://github.com/nikhilroxtomar/TransResUNet.

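Motivation and Objectives

  • Advance automated, real-time polyp segmentation to support early detection of colorectal cancer.
  • Propose a new architecture that fuses a transformer encoder module with a ResNet50-based Residual U-Net and dilated convolutions.
  • Benchmark TransResU-Net against multiple polyp segmentation baselines on public datasets (Kvasir-SEG, BKAI-IGH).
  • Demonstrate real-time performance suitable for potential clinical CADx deployment.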

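Proposed Method

  • An encoder-decoder design built on a pretrained ResNet50 encoder.
  • A transformer encoder block that learns long-range dependencies.
  • A parallel dilated convolution block with four dilation rates (1, 3, 6, 9), followed by a 1x1 convolution that fuses the features.
  • The transformer and dilated features are concatenated and then passed through two residual decoder blocks with skip connections.
  • A final 1x1 convolution and sigmoid produce the binary segmentation mask.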
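
The bottleneck described above (a transformer encoder and a four-rate dilated convolution block running in parallel, with their outputs concatenated) can be sketched in PyTorch as follows. This is an illustrative sketch, not the authors' code: the class names, channel widths, number of heads and layers, the 1x1 projection before the transformer, and the BatchNorm + ReLU after each convolution are all assumptions. The authoritative implementation is the repository linked in the abstract.

```python
import torch
import torch.nn as nn


class DilatedConvBlock(nn.Module):
    """Parallel 3x3 convolutions with dilation rates (1, 3, 6, 9), fused by a 1x1 conv."""

    def __init__(self, in_ch, out_ch, rates=(1, 3, 6, 9)):
        super().__init__()
        # padding == dilation keeps the spatial size unchanged for a 3x3 kernel
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=r, dilation=r),
                nn.BatchNorm2d(out_ch),   # assumption: normalization choice
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        self.fuse = nn.Sequential(
            nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        # Concatenate branch outputs along channels, then fuse with the 1x1 conv
        return self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))


class TransformerDilatedBottleneck(nn.Module):
    """Hypothetical bottleneck: transformer branch and dilated branch, concatenated."""

    def __init__(self, in_ch, out_ch, heads=8, layers=2):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, out_ch, kernel_size=1)  # assumption: channel reduction
        layer = nn.TransformerEncoderLayer(
            d_model=out_ch, nhead=heads, dim_feedforward=2 * out_ch, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(layer, num_layers=layers)
        self.dilated = DilatedConvBlock(in_ch, out_ch)

    def forward(self, x):
        b, _, h, w = x.shape
        t = self.proj(x)                          # (B, C, H, W)
        tokens = t.flatten(2).transpose(1, 2)     # (B, H*W, C): one token per location
        tokens = self.transformer(tokens)         # self-attention across all locations
        t = tokens.transpose(1, 2).reshape(b, -1, h, w)
        d = self.dilated(x)                       # (B, C, H, W)
        return torch.cat([t, d], dim=1)           # (B, 2*C, H, W)
```

For an input of shape (B, in_ch, H, W), the bottleneck returns (B, 2 * out_ch, H, W); in the design summarized above, that tensor would feed the residual decoder blocks together with the encoder skip connections.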

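Experimental Results

Research Questions

  • RQ1: Can a ResU-Net with a transformer improve polyp segmentation accuracy while maintaining real-time speed?
  • RQ2: Do the transformer and dilated convolutions provide complementary gains over a standard ResU-Net for colonoscopy polyps?
  • RQ3: How does TransResU-Net perform against established baselines on public polyp segmentation datasets?
  • RQ4: Is the model suitable for real-time CADx deployment in clinical settings?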
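
Real-time claims such as those in RQ1 and RQ4 are usually backed by a frames-per-second (FPS) measurement. The helper below is a minimal sketch of one way to measure it; the function name `measure_fps`, the warm-up count, and the use of wall-clock timing are assumptions, and the summary does not state the paper's exact timing protocol.

```python
import time


def measure_fps(infer, frames, warmup=2):
    """Return frames per second for calling `infer` on each item of `frames`.

    `infer` is any callable that processes one frame (hypothetical placeholder
    for a model's forward pass). A few warm-up calls are run first so that
    one-time setup cost is not counted.
    """
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed
```

When the model runs on a GPU, the device generally needs to be synchronized before reading the clock, otherwise asynchronous kernel launches make the measured time too short.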

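Key Findings

  • On Kvasir-SEG, TransResU-Net achieves DSC 0.8884, mIoU 0.8214, recall 0.9106, precision 0.9022, accuracy 0.9651, and F2 0.8971 at 48.61 FPS.
  • On BKAI-IGH, TransResU-Net achieves DSC 0.9154, mIoU 0.8568, recall 0.9142, precision 0.9299, accuracy 0.9938, and F2 0.9129 at 42.09 FPS.
  • Ablation experiments show that removing the Transformer and Dilated blocks on Kvasir-SEG lowers DSC by 2.05 percentage points and mIoU by 2.35 percentage points; the full model also performs better on recall and precision.
  • TransResU-Net exceeds DeepLabV3+ (ResNet50) by 0.47% DSC and 0.41% mIoU on Kvasir-SEG, and by 2.17% DSC and 2.54% mIoU on BKAI-IGH.
  • Qualitative results indicate that TransResU-Net delineates boundaries more precisely, especially for tiny and flat polyps.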
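
For readers unfamiliar with the metrics reported above, the sketch below computes them for a single pair of binary masks using their standard pixel-count definitions. The function name and the `eps` smoothing term are assumptions for illustration. Benchmark numbers like those above are typically averaged over all test images, so the reported values need not satisfy the single-image identities between metrics (for example, between F2, precision, and recall).

```python
import numpy as np


def segmentation_metrics(pred, target, eps=1e-7):
    """Standard binary segmentation metrics for one predicted/ground-truth mask pair."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)

    tp = np.logical_and(pred, target).sum()     # polyp pixels correctly predicted
    fp = np.logical_and(pred, ~target).sum()    # background predicted as polyp
    fn = np.logical_and(~pred, target).sum()    # polyp pixels that were missed
    tn = np.logical_and(~pred, ~target).sum()   # background correctly predicted

    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    return {
        "dice": 2 * tp / (2 * tp + fp + fn + eps),          # DSC
        "iou": tp / (tp + fp + fn + eps),                   # Jaccard index
        "recall": recall,
        "precision": precision,
        "accuracy": (tp + tn) / (tp + tn + fp + fn + eps),
        # F-beta with beta = 2 weights recall more heavily than precision
        "f2": 5 * precision * recall / (4 * precision + recall + eps),
    }
```

Accuracy counts background pixels as well, which is why it tends to be much higher than DSC or IoU on images where the polyp covers a small area.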


This review was generated by AI and checked by human editors.