QUICK REVIEW

[Paper Review] CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Yuhong Li, Xiaofan Zhang|arXiv (Cornell University)|Feb 27, 2018

Video Surveillance and Tracking Methods | References: 32 | Cited by: 168

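One-Sentence Summary

CSRNet is a deep, end-to-end CNN with a VGG-16 front-end and a dilated back-end that produces high-quality crowd density maps and accurate counts in congested scenes, outperforming prior state-of-the-art methods.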

ABSTRACT

We propose a network for Congested Scene Recognition called CSRNet to provide a data-driven and deep learning method that can understand highly congested scenes and perform accurate count estimation as well as present high-quality density maps. The proposed CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN for the back-end, which uses dilated kernels to deliver larger reception fields and to replace pooling operations. CSRNet is an easy-trained model because of its pure convolutional structure. We demonstrate CSRNet on four datasets (ShanghaiTech dataset, the UCF_CC_50 dataset, the WorldEXPO'10 dataset, and the UCSD dataset) and we deliver the state-of-the-art performance. In the ShanghaiTech Part_B dataset, CSRNet achieves 47.3% lower Mean Absolute Error (MAE) than the previous state-of-the-art method. We extend the targeted applications for counting other objects, such as the vehicle in TRANCOS dataset. Results show that CSRNet significantly improves the output quality with 15.4% lower MAE than the previous state-of-the-art approach.

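Motivation and Objectives

Achieve accurate crowd counting and density map generation in highly congested scenes.
Develop a data-driven, end-to-end CNN that preserves resolution while enlarging the receptive field.
Improve on multi-column CNN architectures by using a deeper single-column model with dilated convolutions.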
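
The goal of enlarging the receptive field without losing resolution can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper's code: the helper names are hypothetical, and it assumes square kernels and stride 1. A k x k kernel with dilation rate r spans k + (k - 1)(r - 1) pixels per side, so a 3 x 3 kernel covers the area of a 5 x 5 or 7 x 7 kernel at rates 2 and 3 while keeping nine weights, and suitable zero padding leaves the feature-map size unchanged.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Side length spanned by a square kernel once gaps are inserted between its taps."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def same_padding(kernel_size: int, dilation: int) -> int:
    """Zero padding per side that keeps output size equal to input size at stride 1."""
    return (effective_kernel_size(kernel_size, dilation) - 1) // 2


def output_size(size: int, kernel_size: int, dilation: int, padding: int, stride: int = 1) -> int:
    """Standard convolution output-size arithmetic along one spatial dimension."""
    span = effective_kernel_size(kernel_size, dilation)
    return (size + 2 * padding - span) // stride + 1


if __name__ == "__main__":
    # Hypothetical 64-pixel feature map; columns: rate, span, padding, output size.
    for rate in (1, 2, 3):
        pad = same_padding(3, rate)
        print(rate, effective_kernel_size(3, rate), pad, output_size(64, 3, rate, pad))
```

A pooling layer with stride 2 would instead halve the 64-pixel map at each step, which is the resolution loss the dilated back-end is meant to avoid.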

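Proposed Method

Uses the first 10 layers of VGG-16 as the front-end for 2D feature extraction.
Replaces pooling with dilated convolutions in the back-end to enlarge the receptive field without reducing resolution.
Trains end-to-end with a Euclidean loss between the predicted and ground-truth density maps.
Generates ground-truth density maps with geometry-adaptive Gaussian kernels.
Applies data augmentation and uses an end-to-end framework for density map and count estimation.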
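
The ground-truth generation and loss steps above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names are hypothetical. The values `k=3` and `beta=0.3` follow the commonly used geometry-adaptive setting. The fallback sigma, the minimum sigma, and the renormalization of kernels clipped at the image border are choices made here for the example. Each annotated head becomes a unit-mass Gaussian whose width scales with the mean distance to its nearest neighbors, so the map sums to the head count.

```python
import numpy as np


def gaussian_kernel_2d(sigma: float) -> np.ndarray:
    """Normalized 2D Gaussian window truncated at roughly 3 sigma."""
    radius = max(1, int(3 * sigma))
    ax = np.arange(-radius, radius + 1)
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def density_map(points, shape, k=3, beta=0.3, fallback_sigma=4.0, min_sigma=1.0):
    """Place one unit-mass Gaussian per head annotation given as (row, col)."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    h, w = shape
    dmap = np.zeros(shape, dtype=float)
    for i, (r, c) in enumerate(pts):
        if len(pts) > 1:
            # Mean distance to the k nearest other heads (index 0 is the head itself).
            dists = np.sort(np.linalg.norm(pts - pts[i], axis=1))[1:k + 1]
            sigma = beta * dists.mean()
        else:
            sigma = fallback_sigma  # assumption: fixed width for a lone head
        kern = gaussian_kernel_2d(max(sigma, min_sigma))
        rad = kern.shape[0] // 2
        r0 = min(max(int(round(r)), 0), h - 1)
        c0 = min(max(int(round(c)), 0), w - 1)
        top, bottom = max(0, r0 - rad), min(h, r0 + rad + 1)
        left, right = max(0, c0 - rad), min(w, c0 + rad + 1)
        patch = kern[top - (r0 - rad):bottom - (r0 - rad),
                     left - (c0 - rad):right - (c0 - rad)]
        # Assumption: renormalize clipped kernels so every head still adds exactly 1.
        dmap[top:bottom, left:right] += patch / patch.sum()
    return dmap


def euclidean_loss(pred, target):
    """1/(2N) times the summed squared difference over a batch of N density maps."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(((pred - target) ** 2).sum() / (2 * pred.shape[0]))


if __name__ == "__main__":
    heads = [(10, 12), (14, 20), (30, 31), (40, 8)]  # made-up annotations
    gt = density_map(heads, (64, 64))
    print(gt.sum())  # equals the number of heads up to floating-point error
    print(euclidean_loss(np.zeros((1, 64, 64)), gt[None]))
```

Because the count is recovered by summing the map, a network trained with this pixel-wise loss yields both the density map and the count from a single output.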

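Experimental Results

Research Questions

RQ1: Does a deeper single-column CNN with dilated convolutions outperform multi-column architectures for dense crowd counting?
RQ2: Does preserving spatial resolution through dilation improve density map quality and counting accuracy across benchmarks?
RQ3: How do CSRNet's density maps compare with the ground-truth density maps in terms of PSNR/SSIM across datasets?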

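Key Findings

On ShanghaiTech, CSRNet achieves state-of-the-art MAE/MSE on Part_A (68.2/115.0) and Part_B (10.6/16.0) relative to existing methods.
On UCF_CC_50, CSRNet reaches an MAE of 266.1 and an MSE of 397.5, outperforming several baselines.
CSRNet achieves the best average performance across the five WorldExpo’10 scenes (average MAE 8.6, SSIM 0. ?).
On UCSD, CSRNet reports an MAE of 1.16 and an MSE of 1.47, competitive with MCNN.
For vehicle counting on TRANCOS, CSRNet achieves GAME(0)=3.56, GAME(1)=5.49, GAME(2)=8.57, and GAME(3)=15.04, indicating robust generalization.
CSRNet produces higher-quality density maps, with a PSNR of 23.79 and an SSIM of 0.76 on ShanghaiTech Part_A, better than MCNN and CP-CNN.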
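
For readers unfamiliar with the counting metrics quoted above, here is a small NumPy sketch of how they are usually computed. The function names and toy inputs are hypothetical, and this is not the paper's evaluation code. One point that is easy to miss: in the crowd-counting literature "MSE" conventionally denotes the square root of the mean squared count error, and the sketch follows that convention. GAME(L) splits each image into 4^L non-overlapping cells and sums the per-cell absolute count errors, so GAME(0) reduces to MAE.

```python
import numpy as np


def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and true per-image counts."""
    p = np.asarray(pred_counts, dtype=float)
    t = np.asarray(true_counts, dtype=float)
    return float(np.mean(np.abs(p - t)))


def root_mse(pred_counts, true_counts):
    """Crowd-counting 'MSE': square root of the mean squared count error."""
    p = np.asarray(pred_counts, dtype=float)
    t = np.asarray(true_counts, dtype=float)
    return float(np.sqrt(np.mean((p - t) ** 2)))


def game(pred_maps, true_maps, level):
    """Grid Average Mean absolute Error over density maps at the given grid level."""
    cells = 2 ** level  # cells per side, giving 4**level cells per image
    total = 0.0
    for pm, tm in zip(pred_maps, true_maps):
        pm = np.asarray(pm, dtype=float)
        tm = np.asarray(tm, dtype=float)
        rows = zip(np.array_split(pm, cells, axis=0), np.array_split(tm, cells, axis=0))
        for row_p, row_t in rows:
            cols = zip(np.array_split(row_p, cells, axis=1),
                       np.array_split(row_t, cells, axis=1))
            for cell_p, cell_t in cols:
                total += abs(cell_p.sum() - cell_t.sum())
    return total / len(pred_maps)


if __name__ == "__main__":
    # Toy maps: the same total count, but placed in opposite corners.
    pred = np.zeros((4, 4))
    pred[0, 0] = 1.0
    true = np.zeros((4, 4))
    true[3, 3] = 1.0
    print(game([pred], [true], 0), game([pred], [true], 1))
```

The toy case shows why GAME is stricter than MAE: a prediction with the right total but the wrong location scores zero at level 0 and is penalized once the grid is refined.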


This review was generated by AI and reviewed by a human editor.