QUICK REVIEW

[Paper Review] Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi- Supervised Semantic Segmentation

Yunchao Wei, Huaxin Xiao|arXiv (Cornell University)|May 11, 2018
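Advanced Neural Network Applications | References: 40 | Cited by: 45
One-Sentence Summary

The paper reuses multi-scale dilated convolution blocks to generate dense object localization maps from image-level labels, achieving state-of-the-art weakly and semi-supervised semantic segmentation on PASCAL VOC 2012.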

ABSTRACT

Despite the remarkable progress, weakly supervised segmentation approaches are still inferior to their fully supervised counterparts. We observe the performance gap mainly comes from their limitation on learning to produce high-quality dense object localization maps from image-level supervision. To mitigate such a gap, we revisit the dilated convolution [1] and reveal how it can be utilized in a novel way to effectively overcome this critical limitation of weakly supervised segmentation approaches. Specifically, we find that varying dilation rates can effectively enlarge the receptive fields of convolutional kernels and more importantly transfer the surrounding discriminative information to non-discriminative object regions, promoting the emergence of these regions in the object localization maps. Then, we design a generic classification network equipped with convolutional blocks of different dilated rates. It can produce dense and reliable object localization maps and effectively benefit both weakly- and semi- supervised semantic segmentation. Despite the apparent simplicity, our proposed approach obtains superior performance over state-of-the-arts. In particular, it achieves 60.8% and 67.6% mIoU scores on Pascal VOC 2012 test set in weakly- (only image-level labels are available) and semi- (1,464 segmentation masks are available) supervised settings, which are the new state-of-the-arts.
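
To make the receptive-field claim in the abstract concrete, the short sketch below computes how far a dilated kernel reaches, using the standard relation k_eff = k + (k - 1)(d - 1). The 3x3 kernel size and the helper name `effective_kernel_size` are assumptions chosen for illustration; the abstract does not state them.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Side length of the input window covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Assumed 3x3 kernels at several dilation rates.
for d in (1, 3, 6, 9):
    print(d, effective_kernel_size(3, d))
# prints:
# 1 3
# 3 7
# 6 13
# 9 19
```

A larger dilation rate therefore lets one kernel combine responses from pixels farther apart without adding parameters, which is the mechanism the abstract credits for spreading discriminative evidence to neighboring object regions.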

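Motivation and Objectives

  • Identify and address the gap in dense object localization for weakly supervised segmentation under image-level supervision.
  • Propose a simple, generic approach that uses multi-dilated convolutional blocks to transfer discriminative knowledge to non-discriminative object regions.
  • Make dense localization maps usable for improving segmentation training in both the weakly and semi-supervised settings.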

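Proposed Method

  • Augment a standard classification network with several blocks of different dilation rates, so that the receptive field grows at multiple scales.
  • Apply class activation mapping (CAM) to each block to create object localization maps.
  • Propose an anti-noise fusion strategy: average the localization maps from the blocks with dilation rates (3, 6, 9) and add the result to the d=1 map.
  • Use the dense localization maps as pseudo masks to train the segmentation model, with saliency providing background cues.
  • Give learning objectives for the weakly supervised (image-level labels) and semi-supervised (mixed strong and weak labels) scenarios.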
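
The steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the ReLU-and-max normalization of each CAM, the thresholds `fg_thresh` and `bg_thresh`, and the ignore label 255 are all hypothetical choices. Only the fusion rule (the d=1 map plus the average of the d=3, 6, 9 maps) comes from the summary.

```python
import numpy as np


def class_activation_map(features, class_weights):
    """CAM for one class: classifier-weighted sum of feature channels.

    features: (C, H, W) feature maps from one convolutional block.
    class_weights: (C,) classifier weights for the target class.
    """
    cam = np.tensordot(class_weights, features, axes=([0], [0]))  # (H, W)
    cam = np.maximum(cam, 0.0)  # assumption: keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam  # assumption: scale to [0, 1]


def fuse_localization_maps(map_d1, dilated_maps):
    """Anti-noise fusion: d=1 map plus the mean of the dilated-block maps."""
    return map_d1 + np.mean(np.stack(dilated_maps, axis=0), axis=0)


def build_pseudo_mask(fused_maps, saliency, fg_thresh=0.5, bg_thresh=0.1,
                      ignore_label=255):
    """Turn fused maps into a pseudo segmentation mask.

    fused_maps: (K, H, W) fused maps for the K classes present in the image,
                mapped here to labels 1..K.
    saliency:   (H, W) saliency map in [0, 1]; low values suggest background.
    Thresholds are hypothetical and would need tuning.
    """
    mask = np.full(saliency.shape, ignore_label, dtype=np.int64)
    mask[saliency < bg_thresh] = 0                  # background cue
    best_class = fused_maps.argmax(axis=0) + 1      # labels start at 1
    confident = fused_maps.max(axis=0) > fg_thresh  # confident object pixels
    mask[confident] = best_class[confident]
    return mask
```

In this sketch, pixels that are neither confidently foreground nor clearly non-salient keep the ignore label, so a segmentation loss could skip them. Averaging the dilated maps before adding them damps responses that appear at only one dilation rate, which is the intuition behind calling the fusion anti-noise.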

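Experimental Results

Research Questions

  • RQ1: Can dilated convolutional blocks with multiple dilation rates produce dense, reliable object localization from image-level supervision?
  • RQ2: Does anti-noise fusion of the multi-dilated localization maps improve segmentation performance in the weakly and semi-supervised settings?
  • RQ3: How does the proposed localization approach affect state-of-the-art results on VOC 2012 in the weakly and semi-supervised settings?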

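Key Findings

  • In the weakly supervised setting (image-level labels only), the method reaches a new state-of-the-art mIoU of 60.8% on the Pascal VOC 2012 test set.
  • In the semi-supervised setting, the method reaches a new state-of-the-art mIoU of 67.6% on the Pascal VOC 2012 test set.
  • Dense localization maps produced by the multi-dilated blocks, once fused with the anti-noise strategy, substantially improve segmentation training compared with a single dilation rate or simple averaging.
  • The method achieves 60.4% validation mIoU and 60.8% test mIoU in the weakly supervised setting, and 65.7% validation mIoU and 67.6% test mIoU in the semi-supervised experiments.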
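
For readers unfamiliar with the metric, mIoU (mean intersection over union) averages, over classes, the ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The sketch below scores a single pair of label arrays; the function name `mean_iou` and the ignore label 255 are assumptions, and benchmark evaluations typically accumulate the confusion matrix over the whole dataset before averaging.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over the classes that occur in pred or target."""
    valid = target != ignore_label
    # Encode each (target, pred) pair as one index into a flat confusion matrix.
    idx = target[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
    conf = np.bincount(idx, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)  # rows: target, cols: pred
    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0  # skip classes absent from both arrays
    return float((tp[present] / union[present]).mean())


pred = np.array([0, 1, 1, 0])
target = np.array([0, 1, 0, 255])  # last pixel is ignored
print(mean_iou(pred, target, num_classes=2))  # prints 0.5
```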

This review was generated by AI and checked by human editors.