QUICK REVIEW

[Paper Review] Learning Complexity-Aware Cascades for Deep Pedestrian Detection

Zhaowei Cai, Mohammad Saberian|arXiv (Cornell University)|Jul 19, 2015

Video Surveillance and Tracking Methods | References: 34 | Cited by: 56

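One-Sentence Summary

This paper proposes CompACT, a boosting algorithm that learns complexity-aware cascaded detectors by optimizing a Lagrangian trade-off between classification accuracy and computational complexity. By pushing high-complexity features, such as the responses of a deep convolutional neural network (CNN), to the later cascade stages, CompACT integrates features of very different cost in a single detector and achieves state-of-the-art pedestrian detection on the Caltech and KITTI datasets at fairly fast speeds.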

ABSTRACT

The design of complexity-aware cascaded detectors, combining features of very different complexities, is considered. A new cascade design procedure is introduced, by formulating cascade learning as the Lagrangian optimization of a risk that accounts for both accuracy and complexity. A boosting algorithm, denoted as complexity aware cascade training (CompACT), is then derived to solve this optimization. CompACT cascades are shown to seek an optimal trade-off between accuracy and complexity by pushing features of higher complexity to the later cascade stages, where only a few difficult candidate patches remain to be classified. This enables the use of features of vastly different complexities in a single detector. In result, the feature pool can be expanded to features previously impractical for cascade design, such as the responses of a deep convolutional neural network (CNN). This is demonstrated through the design of a pedestrian detector with a pool of features whose complexities span orders of magnitude. The resulting cascade generalizes the combination of a CNN with an object proposal mechanism: rather than a pre-processing stage, CompACT cascades seamlessly integrate CNNs in their stages. This enables state of the art performance on the Caltech and KITTI datasets, at fairly fast speeds.
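
The early-rejection idea in the abstract can be illustrated with a small sketch. It is not the paper's detector: the `Stage` tuple, the scalar "windows", the score functions, the costs, and the thresholds are all hypothetical values chosen only to show why placing a costly feature late in the cascade reduces total computation.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stage description: (score function, feature cost, threshold).
# The threshold applies to the cumulative score after this stage.
Stage = Tuple[Callable[[float], float], float, float]


def run_cascade(windows: Sequence[float],
                stages: List[Stage]) -> Tuple[List[float], float]:
    """Return the windows that pass every stage and the total cost spent."""
    total_cost = 0.0
    survivors: List[float] = []
    for w in windows:
        score = 0.0
        rejected = False
        for score_fn, cost, threshold in stages:
            total_cost += cost       # pay for computing this stage's feature
            score += score_fn(w)     # accumulate the stage response
            if score < threshold:    # early rejection: stop evaluating
                rejected = True
                break
        if not rejected:
            survivors.append(w)
    return survivors, total_cost


# Toy candidate windows, represented by a single made-up number each.
windows = [0.1, 0.2, 0.3, 0.6, 0.9]

# Cheap feature (cost 1) first, costly feature (cost 100) last.
cheap_first = run_cascade(windows, [
    (lambda w: w, 1.0, 0.5),
    (lambda w: 2 * w, 100.0, 2.4),
])

# Same features in the opposite order, with matching cumulative thresholds.
costly_first = run_cascade(windows, [
    (lambda w: 2 * w, 100.0, 1.0),
    (lambda w: w, 1.0, 2.4),
])

print(cheap_first)   # same survivors as below, lower total cost
print(costly_first)
```

In this toy setting both orderings keep the same window, but the cheap-first cascade only evaluates the costly feature on the two windows that survive the first stage.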

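Research Motivation and Objectives

Address the challenge of integrating high-complexity deep learning features into cascaded detectors, which are typically restricted to low-complexity features.
Overcome the limitations of existing cascade designs, which assume uniform feature complexity and do not explicitly optimize the accuracy-complexity trade-off.
Develop a unified framework that seamlessly combines hand-crafted features and deep convolutional neural networks in a single cascade architecture.
Demonstrate that CompACT achieves state-of-the-art pedestrian detection performance while maintaining high inference speed.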

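Proposed Method

Formulate cascade learning as a Lagrangian optimization problem that jointly minimizes a classification risk and a complexity risk.
Introduce a complexity measure that quantifies the computational cost of features, giving explicit control over the accuracy-speed trade-off.
Derive a boosting algorithm, CompACT, that learns the cascade by iteratively selecting the feature that most reduces the Lagrangian objective.
Place high-complexity features, such as deep convolutional neural networks, in the later cascade stages, where only a few difficult examples remain to be classified.
Support features of vastly different complexity, for example Haar wavelets and deep convolutional neural networks, within a single cascade.
Adopt a hybrid architecture in which the final stage can be a deep convolutional neural network, either applied as a post-NMS classifier or embedded in the cascade.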
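
The selection step described above can be sketched as a toy boosting loop. This is a simplified stand-in, not the CompACT algorithm itself. It assumes decision stumps as weak learners, uses weighted misclassification error in place of the paper's classification risk, and charges a flat, made-up cost per feature instead of the paper's complexity risk. The names `pick_weak_learner`, `train`, `costs`, and `eta` (the Lagrange multiplier) are hypothetical.

```python
import math
from typing import List, Tuple

Stump = Tuple[int, float]  # (feature index, threshold)


def stump_predict(x: List[float], stump: Stump) -> int:
    feature, threshold = stump
    return 1 if x[feature] > threshold else -1


def weighted_error(X, y, weights, stump: Stump) -> float:
    # Sum of the weights of the examples the stump misclassifies.
    return sum(w for x, label, w in zip(X, y, weights)
               if stump_predict(x, stump) != label)


def pick_weak_learner(X, y, weights, costs, eta: float) -> Stump:
    """Pick the stump minimizing weighted error + eta * feature cost."""
    best_stump, best_obj = (0, 0.0), math.inf
    for feature, cost in enumerate(costs):
        for threshold in sorted({x[feature] for x in X}):
            stump = (feature, threshold)
            objective = weighted_error(X, y, weights, stump) + eta * cost
            if objective < best_obj:
                best_stump, best_obj = stump, objective
    return best_stump


def train(X, y, costs, eta: float, rounds: int):
    """AdaBoost-style loop with the cost-penalized selection above."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = pick_weak_learner(X, y, weights, costs, eta)
        err = weighted_error(X, y, weights, stump)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((stump, alpha))
        # Up-weight misclassified examples, then renormalize.
        weights = [w * math.exp(-alpha * label * stump_predict(x, stump))
                   for x, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


# Toy data: feature 0 is cheap but noisy, feature 1 is costly but separates
# the classes perfectly. All values are made up for illustration.
X = [[0.1, 0.0], [0.2, 0.1], [0.8, 0.2], [0.3, 0.9], [0.7, 0.8], [0.9, 1.0]]
y = [-1, -1, -1, 1, 1, 1]
costs = [1.0, 100.0]

accuracy_only = train(X, y, costs, eta=0.0, rounds=1)
cost_aware = train(X, y, costs, eta=0.01, rounds=1)
print(accuracy_only[0][0])  # stump chosen when complexity is ignored
print(cost_aware[0][0])     # stump chosen when complexity is penalized
```

With `eta = 0` the loop picks the costly but accurate feature immediately. With a positive `eta` it prefers the cheap feature, which is the kind of trade-off the Lagrangian formulation is meant to control.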

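Experimental Results

Research Questions

RQ1: When feature complexities differ substantially, can a cascaded detector be trained to explicitly balance detection accuracy against computational complexity?
RQ2: Can a convolutional neural network be integrated effectively into a cascaded detector without relying on a separate object proposal stage?
RQ3: Does a complexity-aware cascade design outperform traditional two-stage approaches (such as proposals + CNN) in both accuracy and speed?
RQ4: Can the proposed method achieve state-of-the-art pedestrian detection performance while maintaining real-time inference speed?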

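Key Findings

CompACT achieves state-of-the-art performance on the Caltech pedestrian detection benchmark, improving mean average precision (mAP) by up to 11 percentage points over previous methods.
The CompACT-Deep cascade reaches 70.69% mAP on the KITTI 'easy' split, more than 8 percentage points above the previous state of the art.
Despite using a deep convolutional neural network, the CompACT-Deep cascade runs at 1 second per image on KITTI, significantly faster than competing methods such as R-CNN and FilteredICF.
A CompACT cascade with a small convolutional neural network reaches 65.35% mAP on the KITTI 'easy' split, outperforming pAUCEnsT and FilteredICF while running much faster.
Embedding the convolutional neural network in the cascade (before NMS) yields higher accuracy than applying it after NMS, even though the computation is reduced by roughly 10x.
By integrating convolutional neural networks end to end within the cascade, the method generalizes the "object proposal + CNN" paradigm and removes the need for a separate proposal stage.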

This review was generated by AI and reviewed by human editors.