QUICK REVIEW

[Paper Review] FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation

Huikai Wu, Junge Zhang|arXiv (Cornell University)|Mar 28, 2019

Advanced Neural Network Applications | References: 38 | Cited by: 226

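One-Sentence Summary

The paper proposes Joint Pyramid Upsampling (JPU) to replace the heavyweight dilated convolutions in the backbone. By reformulating high-resolution feature extraction as a joint upsampling problem, it achieves faster inference and state-of-the-art results on Pascal Context and ADE20K.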

ABSTRACT

Modern approaches for semantic segmentation usually employ dilated convolutions in the backbone to extract high-resolution feature maps, which brings heavy computation complexity and memory footprint. To replace the time and memory consuming dilated convolutions, we propose a novel joint upsampling module named Joint Pyramid Upsampling (JPU) by formulating the task of extracting high-resolution feature maps into a joint upsampling problem. With the proposed JPU, our method reduces the computation complexity by more than three times without performance loss. Experiments show that JPU is superior to other upsampling modules, which can be plugged into many existing approaches to reduce computation complexity and improve performance. By replacing dilated convolutions with the proposed JPU module, our method achieves the state-of-the-art performance in Pascal Context dataset (mIoU of 53.13%) and ADE20K dataset (final score of 0.5584) while running 3 times faster.

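Motivation and Objectives

Motivation: reduce the computation and memory overhead that dilated convolutions impose on semantic segmentation backbones.
Reformulate the extraction of high-resolution feature maps as a joint upsampling problem.
Introduce and validate the Joint Pyramid Upsampling (JPU) module, which speeds up inference while preserving accuracy.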

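Proposed Method

Replace the dilated convolutions in the last two stages of the backbone with regular strided convolutions, yielding multi-level features (Conv3–Conv5).
Formulate joint upsampling as a CNN-based learning problem that approximates the final high-resolution feature map of DilatedFCN.
Develop JPU, which uses parallel separable convolutions with dilation rates 1, 2, 4, and 8 to map the multi-level inputs to a single joint high-resolution feature map.
Fuse the upsampled features from Conv3–Conv5, apply a final mapping, and pass the result to a global/context module (PSP, ASPP, or Encoding) to generate predictions.
Show that JPU can replace other upsampling modules (bilinear, FPN) across several backbones (ResNet-50/101).
Demonstrate a roughly threefold efficiency gain in time and memory while maintaining or improving accuracy.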
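
The resolution bookkeeping behind these steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the helper names (`feature_sizes`, `dilated_kernel_span`), the 480x480 input, and the 3x3 kernel size are chosen for the example, and the output strides 8/16/32 for Conv3/Conv4/Conv5 are the usual values for a ResNet-style backbone without dilation. It shows the map sizes JPU receives, the factor by which each must be upsampled to reach the Conv3 resolution, and the span a kernel covers at each dilation rate.

```python
import math

# Assumed output strides of a plain (non-dilated) ResNet-style backbone.
STRIDES = {"conv3": 8, "conv4": 16, "conv5": 32}


def feature_sizes(height, width, strides=STRIDES):
    """Return the (H, W) of each stage's feature map for a given input size."""
    return {
        name: (math.ceil(height / s), math.ceil(width / s))
        for name, s in strides.items()
    }


def dilated_kernel_span(kernel_size, dilation):
    """Pixels covered along one axis by a kernel with the given dilation."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Hypothetical 480x480 input crop.
sizes = feature_sizes(480, 480)
# {'conv3': (60, 60), 'conv4': (30, 30), 'conv5': (15, 15)}

# JPU brings every level to the Conv3 (stride-8) resolution before fusing.
target_h, _ = sizes["conv3"]
factors = {name: target_h // h for name, (h, _) in sizes.items()}
# {'conv3': 1, 'conv4': 2, 'conv5': 4}

# Span of an assumed 3x3 kernel at the four dilation rates used in JPU.
spans = [dilated_kernel_span(3, d) for d in (1, 2, 4, 8)]
# [3, 5, 9, 17]

# A dilated backbone keeps Conv5 at stride 8, so its last stage processes
# this many times more spatial positions than the strided version.
conv5_h, conv5_w = sizes["conv5"]
position_ratio = (target_h * target_h) // (conv5_h * conv5_w)
# 16
```

The last figure gives an intuition for where the savings come from: the strided backbone runs its deepest stage on a map with 16 times fewer positions, and JPU recovers the stride-8 resolution afterwards with a comparatively light module.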

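Experimental Results

Research Questions

RQ1: Can a lightweight upsampling module replace the dilated convolutions in the backbone without sacrificing segmentation accuracy?
RQ2: How does joint upsampling over multi-level backbone features compare with conventional bilinear upsampling and FPN in accuracy and speed?
RQ3: Does JPU generalize across different backbones and existing context modules (PSP/ASPP/Encoding), and does it reach state-of-the-art results on standard benchmarks?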

Key Findings

Method	Backbone	pixAcc%	mIoU%
FCN	(baseline)	71.32	29.39
SegNet	-	71.00	21.64
DilatedNet	-	73.55	32.31
CascadeNet	-	74.52	34.90
RefineNet	ResNet-152	-	40.7
PSPNet	ResNet-101	81.39	43.29
PSPNet	ResNet-269	81.69	44.94
EncNet	ResNet-101	81.69	44.65
DUpsampling	Xception-71	-	52.5
EncNet+JPU	ResNet-101	-	53.1

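JPU substantially reduces computation and memory use on the key benchmarks (more than 3x faster) while maintaining or improving mIoU.
On Pascal Context with ResNet-101, EncNet+JPU reaches 53.1% mIoU on the val set, surpassing several earlier methods.
On ADE20K, the method reaches 42.75% mIoU on the val set with ResNet-50 and a final score of 0.5584 on the test set with ResNet-101, indicating competitive or state-of-the-art performance.
Replacing dilated convolutions with JPU in EncNet, DeepLabV3 (ASPP), PSPNet, and DeepLab variants consistently matches or improves performance.
Ablation studies show that JPU outperforms both bilinear upsampling and FPN in pixAcc and mIoU, demonstrating its effectiveness at fusing multi-level features.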
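
For readers unfamiliar with the two metrics reported above, the following sketch computes pixAcc and mIoU from a confusion matrix on a toy example. It is illustrative only: the function names and the six-pixel label lists are hypothetical, real evaluation protocols typically also exclude unlabeled pixels, and the final-score formula (the mean of pixAcc and mIoU) is assumed here to follow the ADE20K benchmark convention, which this review does not spell out.

```python
def confusion_matrix(pred, target, num_classes):
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class equals the true class."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average of per-class TP / (TP + FP + FN), skipping absent classes."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example: six pixels, three classes (hypothetical labels).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(pred, target, num_classes=3)
pix_acc = pixel_accuracy(cm)   # 4 of 6 pixels are correct
miou = mean_iou(cm)            # per-class IoUs are 1/3, 2/3, and 1/2

# Assumed ADE20K-style final score: mean of pixel accuracy and mIoU.
final_score = (pix_acc + miou) / 2
```

The toy numbers also show why the two columns can diverge: pixAcc rewards getting frequent classes right, while mIoU weights every class equally.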


This review was generated by AI and checked by human editors.