QUICK REVIEW

[Paper Review] Distribution Matching for Crowd Counting

Boyu Wang, Huidong Liu|arXiv (Cornell University)|Sep 28, 2020

Video Surveillance and Tracking Methods | References: 60 | Cited by: 169

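One-Sentence Summary

DM-Count uses optimal transport to match the normalized predicted density map against the normalized ground-truth density map, avoiding Gaussian smoothing of the annotations, and achieves state-of-the-art results on multiple crowd counting datasets.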

ABSTRACT

In crowd counting, each training image contains multiple people, where each person is annotated by a dot. Existing crowd counting methods need to use a Gaussian to smooth each annotated dot or to estimate the likelihood of every pixel given the annotated point. In this paper, we show that imposing Gaussians to annotations hurts generalization performance. Instead, we propose to use Distribution Matching for crowd COUNTing (DM-Count). In DM-Count, we use Optimal Transport (OT) to measure the similarity between the normalized predicted density map and the normalized ground truth density map. To stabilize OT computation, we include a Total Variation loss in our model. We show that the generalization error bound of DM-Count is tighter than that of the Gaussian smoothed methods. In terms of Mean Absolute Error, DM-Count outperforms the previous state-of-the-art methods by a large margin on two large-scale counting datasets, UCF-QNRF and NWPU, and achieves the state-of-the-art results on the ShanghaiTech and UCF-CC50 datasets. DM-Count reduced the error of the state-of-the-art published result by approximately 16%. Code is available at https://github.com/cvlab-stonybrook/DM-Count.

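Research Motivation and Objectives

Show that Gaussian smoothing of the dot annotations hurts the generalization of crowd counting models, which motivates this work.
Propose a distribution matching framework (DM-Count) that uses optimal transport to compare predicted and ground-truth density maps without Gaussian smoothing.
Stabilize the OT computation with a Total Variation loss, and provide a generalization bound for the proposed loss.
Demonstrate empirical improvements over prior methods on four crowd counting benchmark datasets.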

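Proposed Method

Formulate counting as distribution matching between the ground-truth dot annotations and the predicted density map.
Use optimal transport to compute a transport loss between the normalized ground-truth and predicted density maps.
Introduce a counting loss, the absolute difference between the total masses of the two maps, to align the counts.
Add a Total Variation loss to stabilize the Sinkhorn-based OT optimization and improve predictions in low-density regions.
Combine the counting, OT, and TV losses into a single training objective with tunable weights.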
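
To make the three loss terms concrete, here is a minimal pure-Python sketch of how they could be combined on a tiny density map. It is an illustration under stated assumptions, not the authors' implementation: the function names `sinkhorn_cost` and `dm_count_loss`, the squared-Euclidean pixel cost, the regularization strength `reg`, the default weights `lam_ot` and `lam_tv`, and the scaling of the TV term by the ground-truth count are all choices made for this sketch. It only evaluates the loss value; a training setup would compute it inside an autodiff framework.

```python
import math


def sinkhorn_cost(mu, nu, cost, reg=1.0, n_iters=100):
    """Approximate the OT cost between histograms mu and nu (Sinkhorn).

    Assumes mu and nu each have positive total mass.
    """
    n, m = len(mu), len(nu)
    # Gibbs kernel induced by the entropic regularization.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P[i][j] = u[i] * K[i][j] * v[j]; the cost is <P, C>.
    return sum(u[i] * K[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))


def dm_count_loss(pred, gt, lam_ot=0.1, lam_tv=0.01):
    """Return (total, count_loss, ot_loss, tv_loss) for two 2D maps.

    pred: predicted density map; gt: dot-annotation map (no Gaussian blur).
    The weights and the count-scaled TV term are assumptions of this sketch.
    """
    h, w = len(gt), len(gt[0])
    p = [x for row in pred for x in row]
    g = [x for row in gt for x in row]
    pred_count, gt_count = sum(p), sum(g)

    # Counting loss: absolute difference of the total masses.
    count_loss = abs(pred_count - gt_count)

    # Normalize both maps so they become probability distributions.
    p_norm = [x / max(pred_count, 1e-8) for x in p]
    g_norm = [x / max(gt_count, 1e-8) for x in g]

    # Assumed ground cost: squared Euclidean distance between pixels.
    coords = [(k // w, k % w) for k in range(h * w)]
    cost = [[(a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 for b in coords]
            for a in coords]
    ot_loss = sinkhorn_cost(p_norm, g_norm, cost)

    # Total variation distance between the normalized maps.
    tv_loss = 0.5 * sum(abs(a - b) for a, b in zip(p_norm, g_norm))

    total = count_loss + lam_ot * ot_loss + lam_tv * gt_count * tv_loss
    return total, count_loss, ot_loss, tv_loss
```

On a 3x3 map with one annotated person at the center, a prediction that places its unit of mass at the center has zero loss in all three terms, while a prediction that places it in a corner keeps the counting loss at zero but incurs an OT cost of 2 (the squared distance moved) and a TV distance of 1. This shows why the counting loss alone cannot localize people and the transport term is needed.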

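Experimental Results

Research Questions

RQ1: Does distribution matching via optimal transport, using ground truth that is not Gaussian-smoothed, improve crowd counting?
RQ2: How does the OT-based loss compare with Gaussian-smoothed and Bayesian losses in terms of generalization and localization quality?
RQ3: When OT is computed with the Sinkhorn approximation, does the added stabilizer (Total Variation) improve training?
RQ4: What empirical gains does DM-Count achieve on large-scale datasets compared with previous state-of-the-art methods?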

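Key Findings

DM-Count surpasses previous state-of-the-art methods in MAE, RMSE, and NAE across four datasets (UCF-QNRF, NWPU, ShanghaiTech, UCF-CC50).
On NWPU, DM-Count substantially reduces the best published MAE and NAE (e.g., MAE from 105.4 to 88.4; NAE from 0.203 to 0.169).
DM-Count also improves density map quality, reaching higher PSNR and SSIM than pixel-wise and Bayesian losses on key benchmarks.
The theoretical analysis shows that Gaussian smoothing of the ground-truth annotations yields a looser generalization bound than the DM-Count approach.
Ablation studies show that the OT loss is the most influential component, while the TV loss contributes training stability.
DM-Count is more robust to annotation noise than Gaussian-smoothed or Bayesian losses.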
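
As a quick check on the reported numbers, the sketch below computes the three count metrics and the relative error reduction implied by the NWPU figures above. The metric definitions are the standard ones and are assumed here rather than quoted from the paper (in particular, NAE is taken as the mean of per-image absolute error divided by the true count), and the helper names are hypothetical.

```python
import math


def mae(pred, gt):
    """Mean absolute error over per-image counts."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(gt)


def rmse(pred, gt):
    """Root mean squared error over per-image counts."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt))


def nae(pred, gt):
    """Normalized absolute error; assumes every true count is positive."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)


def relative_reduction(old, new):
    """Fraction by which an error metric dropped from old to new."""
    return (old - new) / old


# NWPU figures quoted in the findings above.
mae_drop = relative_reduction(105.4, 88.4)    # about 0.161
nae_drop = relative_reduction(0.203, 0.169)   # about 0.167
```

Both reductions come out near 16 to 17 percent, which is consistent with the abstract's statement that DM-Count reduced the error of the previous best published result by approximately 16%.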


This review was generated by AI and checked by human editors.