QUICK REVIEW

[Paper Review] Illuminating Pedestrians via Simultaneous Detection & Segmentation

Garrick Brazil, Xi Yin|arXiv (Cornell University)|Jun 26, 2017

Advanced Neural Network Applications | References: 27 | Cited by: 69

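One-Sentence Summary

This paper proposes SDS-RCNN, a multi-task learning framework that performs pedestrian detection and semantic segmentation simultaneously, using a segmentation infusion layer to enrich the feature maps; by injecting segmentation supervision into shared backbone layers, it achieves a 23% relative error reduction on the Caltech dataset while running 2x faster at inference than competing methods.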

ABSTRACT

Pedestrian detection is a critical problem in computer vision with significant impact on safety in urban autonomous driving. In this work, we explore how semantic segmentation can be used to boost pedestrian detection accuracy while having little to no impact on network efficiency. We propose a segmentation infusion network to enable joint supervision on semantic segmentation and pedestrian detection. When placed properly, the additional supervision helps guide features in shared layers to become more sophisticated and helpful for the downstream pedestrian detector. Using this approach, we find weakly annotated boxes to be sufficient for considerable performance gains. We provide an in-depth analysis to demonstrate how shared layers are shaped by the segmentation supervision. In doing so, we show that the resulting feature maps become more semantically meaningful and robust to shape and occlusion. Overall, our simultaneous detection and segmentation framework achieves a considerable gain over the state-of-the-art on the Caltech pedestrian dataset, competitive performance on KITTI, and executes 2x faster than competitive methods.

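Motivation and Objectives

Use semantic segmentation supervision to improve pedestrian detection accuracy on benchmarks such as Caltech and KITTI.
Address the scarcity of pixel-level annotations in pedestrian datasets by relying on weakly supervised segmentation signals.
Design a multi-task learning framework that strengthens feature representations without sacrificing inference efficiency.
Show that adding segmentation infusion to joint training yields features that are more semantically meaningful and more robust, and that this improves pedestrian detection.
Reach state-of-the-art performance on Caltech while keeping inference fast, outperforming existing methods on the combined measure of accuracy and speed.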

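Proposed Method

Build a two-stage detection framework on Faster R-CNN, modifying the second-stage classifier to apply stricter supervision.
Introduce a segmentation infusion layer that injects semantic segmentation supervision into the shared convolutional layers during training.
Fuse the scores of the RPN (region proposal network) and the second-stage classifier (BCN) to raise detection confidence and reduce false positives.
Adopt a multi-task learning scheme in which a single backbone is trained jointly for pedestrian detection and semantic segmentation.
Visualize feature maps to analyze how segmentation infusion strengthens activations on pedestrian regions and suppresses the background.
Apply segmentation infusion only during training, so that inference remains fast.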
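The weak supervision and joint training described above can be illustrated with a small sketch. It assumes the segmentation target is simply the set of feature-map cells covered by ground-truth boxes, and that the joint objective is a weighted sum of a detection loss and a per-cell binary cross-entropy. The function names, the `stride` parameter, and `seg_weight` are hypothetical choices for illustration and are not taken from the paper.

```python
import math

def boxes_to_mask(boxes, height, width, stride=1):
    """Rasterize boxes (integer x1, y1, x2, y2 in pixels) into a binary mask
    at feature-map resolution (image size divided by `stride`)."""
    mh, mw = height // stride, width // stride
    mask = [[0] * mw for _ in range(mh)]
    for x1, y1, x2, y2 in boxes:
        # Map pixel coordinates to feature-map cells, clamped to the grid.
        c1, r1 = max(0, x1 // stride), max(0, y1 // stride)
        c2 = min(mw, math.ceil(x2 / stride))
        r2 = min(mh, math.ceil(y2 / stride))
        for r in range(r1, r2):
            for c in range(c1, c2):
                mask[r][c] = 1
    return mask

def seg_loss(probs, mask, eps=1e-7):
    """Mean per-cell binary cross-entropy between predicted pedestrian
    probabilities and the box-derived mask."""
    total, count = 0.0, 0
    for prob_row, mask_row in zip(probs, mask):
        for p, m in zip(prob_row, mask_row):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total -= m * math.log(p) + (1 - m) * math.log(1.0 - p)
            count += 1
    return total / count

def joint_loss(det_loss, probs, mask, seg_weight=1.0):
    """Assumed multi-task objective: detection loss plus weighted segmentation loss."""
    return det_loss + seg_weight * seg_loss(probs, mask)

# One hypothetical pedestrian box in a 64x64 image, feature stride 16.
mask = boxes_to_mask([(16, 16, 48, 64)], height=64, width=64, stride=16)
perfect = [[float(v) for v in row] for row in mask]
total = joint_loss(0.5, perfect, mask, seg_weight=2.0)
```

Because the segmentation term appears only in the training loss, dropping it at test time leaves the detection path unchanged, which is consistent with the claim that the extra supervision costs little or nothing at inference.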

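Experimental Results

Research Questions

RQ1: Can weakly supervised semantic segmentation improve pedestrian detection without slowing down inference?
RQ2: How does joint training with segmentation supervision affect the quality and semantic content of the shared feature maps?
RQ3: To what extent does score fusion between the RPN and the second-stage classifier reduce false positives and improve localization accuracy?
RQ4: What trade-off exists between feature sharing and network diversification in a two-stage detection framework?
RQ5: Does segmentation infusion make detection more robust under occlusion and pose variation?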

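Key Findings

The proposed SDS-RCNN achieves a 23% relative error reduction on the Caltech pedestrian detection benchmark, setting a new state of the art.
Feature-map visualizations show that segmentation infusion makes the network "light up" on pedestrian regions while suppressing the background, which improves its discriminative power.
The method runs about 2x faster than competing state-of-the-art methods, staying efficient despite the multi-task training.
Score fusion between the RPN and the BCN reduces false positives by about 22% and is especially effective at correcting high-scoring background proposals.
Stricter supervision in the second-stage classifier significantly reduces duplicate detections and improves localization accuracy.
Peak performance is reached with minimal feature sharing (for example, no sharing, or sharing only the early layers), indicating that diversifying the networks strengthens the fusion.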
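To make the score-fusion finding concrete, here is a minimal sketch of one way two stages could be fused. It assumes each stage emits a pair of (background, pedestrian) logits and that fusion sums the per-class logits before a two-class softmax. This summary does not state the exact fusion rule, so the rule, the function names, and the example logits are all assumptions for illustration.

```python
import math

def softmax_fg(bg_logit, fg_logit):
    """Foreground (pedestrian) probability of a two-class softmax."""
    m = max(bg_logit, fg_logit)  # subtract the max for numerical stability
    e_bg = math.exp(bg_logit - m)
    e_fg = math.exp(fg_logit - m)
    return e_fg / (e_bg + e_fg)

def fuse_scores(rpn_logits, bcn_logits):
    """Assumed fusion rule: add the per-class logits of both stages, then softmax."""
    bg = rpn_logits[0] + bcn_logits[0]
    fg = rpn_logits[1] + bcn_logits[1]
    return softmax_fg(bg, fg)

# Hypothetical background proposal that the RPN scores highly as a pedestrian,
# while the second-stage classifier (BCN) leans strongly toward background.
rpn = (0.0, 2.0)   # (background, pedestrian) logits from the RPN
bcn = (3.0, 0.0)   # (background, pedestrian) logits from the BCN

rpn_only = softmax_fg(*rpn)
fused = fuse_scores(rpn, bcn)
```

In this toy case the RPN alone would keep the proposal above a 0.5 threshold, while the fused score falls below it. Fusion helps most when the two stages make different mistakes, which fits the finding that less feature sharing between them improves the fused result.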


This review was generated by AI and reviewed by a human editor.