QUICK REVIEW

[Paper Review] Bayesian Loss for Crowd Count Estimation with Point Supervision

Zhiheng Ma, Xing Wei|arXiv (Cornell University)|Aug 10, 2019

Video Surveillance and Tracking Methods | References: 59 | Cited by: 66

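ONE-SENTENCE SUMMARY

This paper proposes a Bayesian loss for crowd counting that uses point annotations to supervise the count expectation rather than a pixel-level density, achieving state-of-the-art results on major benchmarks without external detectors.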

ABSTRACT

In crowd counting datasets, each person is annotated by a point, which is usually the center of the head. And the task is to estimate the total count in a crowd scene. Most of the state-of-the-art methods are based on density map estimation, which convert the sparse point annotations into a "ground truth" density map through a Gaussian kernel, and then use it as the learning target to train a density map estimator. However, such a "ground-truth" density map is imperfect due to occlusions, perspective effects, variations in object shapes, etc. On the contrary, we propose Bayesian loss, a novel loss function which constructs a density contribution probability model from the point annotations. Instead of constraining the value at every pixel in the density map, the proposed training loss adopts a more reliable supervision on the count expectation at each annotated point. Without bells and whistles, the loss function makes substantial improvements over the baseline loss on all tested datasets. Moreover, our proposed loss function equipped with a standard backbone network, without using any external detectors or multi-scale architectures, plays favourably against the state of the arts. Our method outperforms previous best approaches by a large margin on the latest and largest UCF-QNRF dataset. The source code is available at https://github.com/ZhihengCV/Baysian-Crowd-Counting.

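MOTIVATION AND OBJECTIVES

Advance crowd counting directly from point annotations and identify the limitations of "ground-truth" density maps.
Propose a Bayesian loss that supervises the count expectation at each annotated point.
Demonstrate robustness, generalization ability, and state-of-the-art performance on benchmark datasets.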

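PROPOSED METHOD

Construct a density contribution probability model p(xm|yn) from the point annotations.
Apply Bayes' rule, assuming equal priors, to compute the posterior p(yn|xm).
Define the Bayesian loss LBayes by supervising the count expectation at each annotated point, where the expectation for point n is the sum over pixels m of p(yn|xm) multiplied by Dest(xm).
Extend the model with a background label y0 and introduce a dummy background point to model background pixels (LBayes+).
Visualize the entropy of the label posterior to analyze localization and boundaries.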
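
The steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: it assumes an isotropic Gaussian likelihood with bandwidth sigma, an L1 distance between each expected count and its target, and a dummy background point placed at distance bg_margin from the nearest head along the direction of the pixel. All function and parameter names (pixel_grid, posterior, bayesian_loss, posterior_entropy, bg_margin) are hypothetical.

```python
import numpy as np


def pixel_grid(height, width):
    """Return a (height * width, 2) array of (x, y) pixel coordinates."""
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    return np.stack([xs, ys], axis=-1).reshape(-1, 2).astype(float)


def posterior(pixels, points, sigma, bg_margin=None):
    """Posterior p(y_n | x_m) under equal priors, shape (N, M).

    With bg_margin set, a background label is appended as the last row,
    giving shape (N + 1, M).
    """
    # Squared distance between every annotated point and every pixel: (N, M).
    diff = points[:, None, :] - pixels[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    if bg_margin is not None:
        # Assumption: the dummy background point sits bg_margin away from the
        # nearest head, so its distance to the pixel is (bg_margin - nearest).
        nearest = np.sqrt(d2.min(axis=0, keepdims=True))
        d2 = np.concatenate([d2, (bg_margin - nearest) ** 2], axis=0)
    # Equal priors: the posterior is the likelihood normalized over labels,
    # which is a softmax over the Gaussian log-likelihoods.
    logits = -d2 / (2.0 * sigma ** 2)
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=0, keepdims=True)


def bayesian_loss(density, pixels, points, sigma, bg_margin=None):
    """Sum of L1 gaps between expected counts and their targets.

    Each annotated point targets a count of 1; the optional background
    label targets a count of 0.
    """
    post = posterior(pixels, points, sigma, bg_margin)
    expected = post @ density.reshape(-1)  # expected count per label
    target = np.ones(len(points))
    if bg_margin is not None:
        target = np.append(target, 0.0)
    return float(np.abs(target - expected).sum())


def posterior_entropy(post):
    """Per-pixel entropy of the label posterior, shape (M,)."""
    p = np.clip(post, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=0)


if __name__ == "__main__":
    pixels = pixel_grid(8, 8)
    points = np.array([[2.0, 2.0], [5.0, 6.0]])
    density = np.full(64, 2.0 / 64)  # flat estimate whose total count is 2
    print(bayesian_loss(density, pixels, points, sigma=2.0))
    print(bayesian_loss(density, pixels, points, sigma=2.0, bg_margin=4.0))
```

Because each posterior column sums to 1 over the labels, the expected counts always add up to the total of the estimated density, so the loss constrains how the count is distributed among annotated points rather than the value at each pixel.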

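EXPERIMENTAL RESULTS

RESEARCH QUESTIONS

RQ1: Can a loss function based on count-centric Bayesian supervision outperform pixel-level density supervision on standard crowd counting benchmarks?
RQ2: Does introducing a background model (Bayesian+, with a dummy point) improve robustness to background pixels and annotation noise?
RQ3: How does the proposed loss perform across different backbone networks and different datasets?
RQ4: How do the Gaussian kernel parameter and the margin d affect performance and robustness?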

KEY FINDINGS

Dataset	BASELINE MAE	BASELINE MSE	BAYESIAN MAE	BAYESIAN MSE	BAYESIAN+ MAE	BAYESIAN+ MSE
UCF-QNRF	106.8	183.7	92.9	163.0	88.7	154.8
ShanghaiTechA	68.6	110.1	64.5	104.0	62.8	101.8
ShanghaiTechB	8.5	13.9	7.9	13.3	7.7	12.7
UCF_CC_50	251.6	331.3	237.7	320.8	229.3	308.2
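
For reading the table, it may help to recall how the two metrics are usually computed in the crowd counting literature, where "MSE" conventionally denotes the root of the mean squared count error over test images. The sketch below assumes that convention applies here; the function names crowd_mae and crowd_mse are hypothetical.

```python
import numpy as np


def crowd_mae(true_counts, pred_counts):
    """Mean absolute error between per-image true and predicted counts."""
    t = np.asarray(true_counts, dtype=float)
    p = np.asarray(pred_counts, dtype=float)
    return float(np.abs(t - p).mean())


def crowd_mse(true_counts, pred_counts):
    """Root of the mean squared count error (assumed "MSE" convention)."""
    t = np.asarray(true_counts, dtype=float)
    p = np.asarray(pred_counts, dtype=float)
    return float(np.sqrt(((t - p) ** 2).mean()))


if __name__ == "__main__":
    # Made-up counts for two images, used only to exercise the functions.
    print(crowd_mae([100, 200], [110, 180]))
    print(crowd_mse([100, 200], [110, 180]))
```

Under this convention the second metric weights large per-image errors more heavily than MAE, which is why the two columns can rank methods differently.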

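BAYESIAN+ achieves state-of-the-art accuracy on four benchmark datasets without external detectors or multi-scale architectures.
BAYESIAN+ consistently improves on BAYESIAN across all four datasets, lowering MAE by about 3% on average (between roughly 2.5% and 4.5% per dataset).
Both BAYESIAN and BAYESIAN+ clearly outperform BASELINE on every dataset: relative to BASELINE, MAE drops by roughly 5.5% to 13% with BAYESIAN and by roughly 8.5% to 17% with BAYESIAN+.
On UCF-QNRF, BAYESIAN+ outperforms the previous best method, CL-CNN, by a large margin.
The method yields more accurate density localization in dense regions and handles background better through the y0 model.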
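
The relative improvements mentioned in the findings above can be recomputed from the MAE columns of the table. The following sketch copies those values and prints the percentage reduction for each pair of methods; the dictionary layout and the helper name relative_reduction are illustrative choices, not part of the paper.

```python
# MAE values copied from the table: (BASELINE, BAYESIAN, BAYESIAN+).
MAE = {
    "UCF-QNRF": (106.8, 92.9, 88.7),
    "ShanghaiTechA": (68.6, 64.5, 62.8),
    "ShanghaiTechB": (8.5, 7.9, 7.7),
    "UCF_CC_50": (251.6, 237.7, 229.3),
}


def relative_reduction(old, new):
    """Percentage by which `new` lowers the error relative to `old`."""
    return 100.0 * (old - new) / old


if __name__ == "__main__":
    for name, (base, bayes, bayes_plus) in MAE.items():
        print(
            f"{name}: "
            f"BAYESIAN vs BASELINE {relative_reduction(base, bayes):.1f}%, "
            f"BAYESIAN+ vs BASELINE {relative_reduction(base, bayes_plus):.1f}%, "
            f"BAYESIAN+ vs BAYESIAN {relative_reduction(bayes, bayes_plus):.1f}%"
        )
```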

This review was generated by AI and reviewed by human editors.