QUICK REVIEW

[Paper Review] Polyp Segmentation Using Wavelet-Based Cross-Band Integration for Enhanced Boundary Representation

Haesung Oh, Jaesung Lee|arXiv (Cornell University)|Mar 4, 2026
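Colorectal Cancer Screening and Detection | Cited by 0
One-Sentence Summary

A dual-encoder polyp segmentation model that fuses grayscale and RGB features through wavelet-aligned cross-band interaction to improve boundary precision on four benchmark datasets.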

ABSTRACT

Accurate polyp segmentation is essential for early colorectal cancer detection, yet achieving reliable boundary localization remains challenging due to low mucosal contrast, uneven illumination, and color similarity between polyps and surrounding tissue. Conventional methods relying solely on RGB information often struggle to delineate precise boundaries due to weak contrast and ambiguous structures between polyps and surrounding mucosa. To establish a quantitative foundation for this limitation, we analyzed polyp-background contrast in the wavelet domain, revealing that grayscale representations consistently preserve higher boundary contrast than RGB images across all frequency bands. This finding suggests that boundary cues are more distinctly represented in the grayscale domain than in the color domain. Motivated by this finding, we propose a segmentation model that integrates grayscale and RGB representations through complementary frequency-consistent interaction, enhancing boundary precision while preserving structural coherence. Extensive experiments on four benchmark datasets demonstrate that the proposed approach achieves superior boundary precision and robustness compared to conventional models.

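Research Motivation and Objectives

  • Address the need for robust boundary delineation in polyp segmentation under low contrast and varying illumination.
  • Compare the boundary cues carried by grayscale and RGB representations in the wavelet domain.
  • Propose a dual-encoder architecture that fuses grayscale and RGB features through frequency-consistent interaction.
  • Demonstrate that grayscale-based boundary cues can refine RGB structures and thereby improve segmentation accuracy.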
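
The wavelet-domain comparison described above can be illustrated with a minimal sketch. It assumes a single-level Haar transform, BT.601 luminance weights for the grayscale conversion, and the mean absolute detail coefficient as a stand-in for contrast. None of these choices is confirmed by the paper, and `to_grayscale`, `haar_dwt2`, and `detail_strength` are hypothetical helper names.

```python
def to_grayscale(rgb):
    """Map an H x W grid of (r, g, b) tuples to luminance.

    The BT.601 weights are an assumption; the paper's conversion may differ.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]


def haar_dwt2(img):
    """Single-level orthonormal 2D Haar transform (H and W must be even).

    Returns a dict of four H/2 x W/2 sub-bands. The sub-band naming follows
    one common convention; other texts swap LH and HL.
    """
    bands = {"LL": [], "LH": [], "HL": [], "HH": []}
    for i in range(0, len(img), 2):
        rows = {name: [] for name in bands}
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]          # top-left, top-right
            c, d = img[i + 1][j], img[i + 1][j + 1]  # bottom-left, bottom-right
            rows["LL"].append((a + b + c + d) / 2)   # scaled local average
            rows["LH"].append((a + b - c - d) / 2)   # top minus bottom
            rows["HL"].append((a - b + c - d) / 2)   # left minus right
            rows["HH"].append((a - b - c + d) / 2)   # diagonal difference
        for name in bands:
            bands[name].append(rows[name])
    return bands


def detail_strength(band):
    """Mean absolute coefficient of one sub-band (hypothetical contrast proxy)."""
    coeffs = [abs(v) for row in band for v in row]
    return sum(coeffs) / len(coeffs)


# Toy image: a vertical edge between column 0 and column 1.
image = [[0, 1, 1, 1] for _ in range(4)]
bands = haar_dwt2(image)
strengths = {name: detail_strength(bands[name]) for name in ("LH", "HL", "HH")}
```

In this toy image the edge registers only in the left-minus-right band. Running `haar_dwt2` on `to_grayscale(rgb)` and on each color channel separately, then comparing `detail_strength` band by band, would mirror the kind of grayscale-versus-RGB comparison the summary describes, although the paper's actual contrast measure is not given here.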

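Proposed Method

  • Extract RGB and grayscale features with two Res2Net-based encoders.
  • Introduce the Band-Specific Window Cross-Attention (BS-WCA) module, which performs frequency-aligned cross-modal interaction within corresponding wavelet sub-bands.
  • Add a Cascade Dilated Fusion (CDF) block that fuses multi-scale features using dilated convolutions.
  • Train and evaluate the model on four datasets (Kvasir-SEG, ClinicDB, ColonDB, ETIS) using the Dice and IoU metrics.
  • Provide PyTorch-based implementation settings and reproducibility details.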
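
For readers unfamiliar with the two evaluation metrics named in the list above, the following is a minimal sketch of Dice and IoU on binary masks. It uses the standard definitions; the `eps` smoothing term and the per-image averaging in the hypothetical `mean_scores` helper are assumptions, since the summary does not say how the paper aggregates scores.

```python
def dice_and_iou(pred, target, eps=1e-8):
    """Dice and IoU for two binary masks given as H x W grids of 0/1.

    eps keeps both scores defined (and equal to 1) when both masks are empty.
    """
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    inter = sum(a * b for a, b in zip(p, t))  # pixels set in both masks
    total = sum(p) + sum(t)                   # |pred| + |target|
    dice = (2 * inter + eps) / (total + eps)
    iou = (inter + eps) / (total - inter + eps)
    return dice, iou


def mean_scores(pairs):
    """Average per-image Dice and IoU over (pred, target) pairs."""
    scores = [dice_and_iou(p, t) for p, t in pairs]
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n


pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
dice, iou = dice_and_iou(pred, target)  # overlap of 1 pixel, mask sizes 2 and 1
```

Dice weights the overlap twice relative to the combined mask sizes, so for the same prediction it is never lower than IoU.
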
Figure 1: Structural contrast comparison between RGB and grayscale images in the wavelet domain, showing consistently higher contrast for grayscale across all detail sub-bands.

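Experimental Results

Research Questions

  • RQ1: Does combining grayscale boundary cues with RGB features through wavelet-aligned cross-band interaction improve boundary precision in polyp segmentation?
  • RQ2: How do the proposed BS-WCA and CDF designs affect boundary accuracy and overall segmentation consistency across datasets?
  • RQ3: Compared with RGB-only baselines, are the boundary-aware improvements robust to dataset variability (size, illumination, contrast)?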

Key Findings

Methods | Kvasir mDice | Kvasir mIoU | ClinicDB mDice | ClinicDB mIoU | ColonDB mDice | ColonDB mIoU | ETIS mDice | ETIS mIoU
Ours | 0.885 ± 0.021 | 0.822 ± 0.019 | 0.926 ± 0.014 | 0.862 ± 0.023 | 0.913 ± 0.021 | 0.840 ± 0.042 | 0.922 ± 0.029 | 0.821 ± 0.029
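  • The proposed method achieves higher mean Dice and IoU than several baselines on all four datasets.
  • Grayscale features provide stronger boundary contrast in the wavelet domain, which aids boundary refinement.
  • Frequency-consistent interaction lets high-frequency grayscale details refine RGB-derived structure.
  • The method shows robust performance gains across datasets that differ in size and imaging conditions.
  • The experiments use the dual-encoder architecture combined with BS-WCA and CDF and show improved boundary representation.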
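
The idea of grayscale details refining RGB-derived structure within a matching sub-band can be sketched with plain scaled dot-product cross-attention. This is only an illustration under loud assumptions: RGB tokens act as queries and grayscale tokens as keys and values, there are no learned projections, and a residual sum merges the result. The summary does not specify the actual BS-WCA design, and `cross_attention` and `refine_band` are hypothetical names.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    queries: list of d-dim vectors; keys and values: lists of equal length.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of value vectors, one output channel at a time.
        out.append([sum(w * v[c] for w, v in zip(weights, values))
                    for c in range(len(values[0]))])
    return out


def refine_band(rgb_tokens, gray_tokens):
    """Assumed residual refinement inside one window of one wavelet sub-band."""
    attended = cross_attention(rgb_tokens, gray_tokens, gray_tokens)
    return [[r + a for r, a in zip(rt, at)]
            for rt, at in zip(rgb_tokens, attended)]


# Two RGB tokens and one grayscale token from the same (hypothetical) window.
rgb_tokens = [[1.0, 0.0], [0.0, 1.0]]
gray_tokens = [[0.5, 0.5]]
refined = refine_band(rgb_tokens, gray_tokens)
```

Restricting `refine_band` to tokens drawn from the same sub-band and the same spatial window is what would keep the interaction frequency-aligned and local in a design of this kind.
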
Figure 2: Proposed wavelet-based cross-band integration framework that fuses frequency-consistent information from RGB and grayscale features for enhanced boundary representation.


This review was generated by AI and reviewed by a human editor.