QUICK REVIEW

[Paper Review] CGNet: A Light-weight Context Guided Network for Semantic Segmentation

Tianyi Wu, Sheng Tang|arXiv (Cornell University)|Nov 20, 2018

Advanced Neural Network Applications|References: 38|Cited by: 83

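One-sentence summary

The paper introduces a light-weight Context Guided (CG) block and the CGNet network built on it, which reaches competitive semantic segmentation accuracy (64.8% mIoU on Cityscapes) with fewer than 0.5M parameters, making it suitable for mobile deployment.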

ABSTRACT

The demand of applying semantic segmentation model on mobile devices has been increasing rapidly. Current state-of-the-art networks have enormous amount of parameters hence unsuitable for mobile devices, while other small memory footprint models follow the spirit of classification network and ignore the inherent characteristic of semantic segmentation. To tackle this problem, we propose a novel Context Guided Network (CGNet), which is a light-weight and efficient network for semantic segmentation. We first propose the Context Guided (CG) block, which learns the joint feature of both local feature and surrounding context, and further improves the joint feature with the global context. Based on the CG block, we develop CGNet which captures contextual information in all stages of the network and is specially tailored for increasing segmentation accuracy. CGNet is also elaborately designed to reduce the number of parameters and save memory footprint. Under an equivalent number of parameters, the proposed CGNet significantly outperforms existing segmentation networks. Extensive experiments on Cityscapes and CamVid datasets verify the effectiveness of the proposed approach. Specifically, without any post-processing and multi-scale testing, the proposed CGNet achieves 64.8% mean IoU on Cityscapes with less than 0.5 M parameters. The source code for the complete system can be found at https://github.com/wutianyiRosun/CGNet.

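Motivation and objectives

Advance semantic segmentation on mobile devices under limited memory and compute budgets.
Design a network that exploits local, surrounding, and global context while preserving spatial detail.
Propose CGNet, a light-weight backbone built from Context Guided (CG) blocks that learn a joint representation of local, surrounding, and global features.
Reduce parameter count and memory footprint while maintaining high segmentation accuracy.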

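Proposed method

Introduce the CG block, composed of a local feature extractor, a surrounding context extractor, a joint feature extractor, and a global context extractor.
Use dilated (atrous) convolution for the surrounding context, and a global context path to re-weight the joint feature.
Apply residual connections (local and global residual learning) to improve information flow.
Build CGNet with three down-sampling stages (resolutions of 1/2, 1/4, and 1/8) and use channel-wise convolutions to save parameters.
Introduce input injection, which feeds the down-sampled input into later stages to strengthen feature propagation.
Train and evaluate on Cityscapes and CamVid without post-processing or multi-scale testing, and compare against both small-footprint and high-accuracy models.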
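
To make the parameter-saving argument concrete, the sketch below counts the weights of a standard 3x3 convolution against a channel-wise one, shows how dilation widens the area a kernel covers without adding weights, and lists the feature-map sizes at the three down-sampling stages. It is an illustration under stated assumptions, not the authors' code: the channel width of 64, the dilation rates, and the 2048x1024 input size are hypothetical values chosen for the example, and all function names are made up here.

```python
def conv_params(kernel_size, in_channels, out_channels, groups=1):
    """Number of weights in a 2-D convolution, ignoring bias terms.

    groups=1 is a standard convolution; groups=in_channels=out_channels
    is a channel-wise (depthwise) convolution.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return kernel_size * kernel_size * (in_channels // groups) * out_channels


def effective_kernel(kernel_size, dilation):
    """Extent along one axis covered by a dilated (atrous) kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stage_resolutions(width, height, strides=(2, 4, 8)):
    """Feature-map size at each stage for the 1/2, 1/4, 1/8 schedule."""
    return [(width // s, height // s) for s in strides]


# Hypothetical channel width, chosen only for illustration.
channels = 64
standard = conv_params(3, channels, channels)
channel_wise = conv_params(3, channels, channels, groups=channels)
print("standard 3x3 conv weights:    ", standard)
print("channel-wise 3x3 conv weights:", channel_wise)
print("reduction factor:             ", standard // channel_wise)

# Dilation enlarges the covered area; the weight count stays at 3x3.
for d in (1, 2, 4):  # assumed dilation rates
    print(f"dilation {d}: covers {effective_kernel(3, d)} pixels per axis")

# Assumed input size of 2048x1024 pixels.
print("stage sizes:", stage_resolutions(2048, 1024))
```

With equal input and output widths, the channel-wise variant needs fewer weights by a factor equal to the channel count. This is why applying it inside every block keeps the total budget small even though context is extracted at all stages.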

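Experimental results

Research questions

RQ1 How can semantic segmentation be made efficient enough for mobile devices without sacrificing accuracy?
RQ2 Can a block that jointly models local features, surrounding context, and global context improve segmentation performance over conventional encoder-decoder designs?
RQ3 What is the effect of using light-weight context guided blocks in all stages of the network, on the Cityscapes and CamVid datasets?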

Key findings

Method	FLOPS (G) ↓	Parameters (M) ↓	Memory (M) ↓	mIoU (%) ↑	Time (ms) ↓
CGNet_M3N21	6.0	0.5	334.0	64.8	56.8

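CGNet achieves 64.8% mean IoU on the Cityscapes test set with fewer than 0.5M parameters.
With an equivalent number of parameters, CGNet outperforms other small-footprint models (e.g., ENet, ESPNet) on Cityscapes.
Both the surrounding context and the global context components contribute: ablation experiments show clear gains when the full surrounding and global context modules are used.
CGNet uses only three down-sampling stages (down to 1/8 resolution) and channel-wise convolutions to minimize parameters and memory use.
On Cityscapes, CGNet_M3N21 reaches 64.8% mIoU with 0.5M parameters and a competitive runtime; on CamVid, it reaches 65.6% mIoU with 0.5M parameters.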
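
The mIoU figures above use the mean intersection-over-union metric. As a reminder of how it is computed, here is a minimal sketch of the standard definition from a confusion matrix. It is not the authors' evaluation script or the official Cityscapes tool; the function name and the toy 2-class matrix are made up for illustration.

```python
def mean_iou(confusion):
    """Mean intersection-over-union from a square confusion matrix.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    ious = []
    for c in range(len(confusion)):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                 # class c pixels missed
        fp = sum(row[c] for row in confusion) - tp  # pixels wrongly labeled c
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both ground truth and prediction
        ious.append(tp / denom)
    if not ious:
        return 0.0
    return sum(ious) / len(ious)


# Toy 2-class example: each class has 3 correct pixels, 1 missed pixel,
# and 1 false alarm, so each per-class IoU is 3 / (3 + 1 + 1).
toy = [
    [3, 1],
    [1, 3],
]
print(f"mIoU = {mean_iou(toy):.3f}")
```

Because every class contributes equally to the mean regardless of how many pixels it covers, small or rare classes weigh as much as large ones.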

This review was generated by AI and reviewed by human editors.