QUICK REVIEW

[Paper Review] Effective Pedestrian Detection Using Center-symmetric Local Binary/Trinary Patterns

Yongbin Zheng, Chunhua Shen|arXiv (Cornell University)|Sep 5, 2010

Video Surveillance and Tracking Methods | References: 33 | Cited by: 26

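One-sentence summary

This paper proposes dense and pyramid center-symmetric local binary/ternary patterns (CS-LBP/LTP) for pedestrian detection, which exploit gradient and salient texture information to achieve high performance at low computational cost. On the INRIA dataset, pyramid CS-LTP features with a histogram intersection kernel SVM outperform the HOG and PHOG baselines, and combining the pyramid CS-LBP feature with PHOG yields state-of-the-art detection accuracy.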

ABSTRACT

Accurately detecting pedestrians in images plays a critically important role in many computer vision applications. Extraction of effective features is the key to this task. Promising features should be discriminative, robust to various variations and easy to compute. In this work, we present novel features, termed dense center-symmetric local binary patterns (CS-LBP) and pyramid center-symmetric local binary/ternary patterns (CS-LBP/LTP), for pedestrian detection. The standard LBP proposed by Ojala et al. mainly captures the texture information. The proposed CS-LBP feature, in contrast, captures the gradient information and some texture information. Moreover, the proposed dense CS-LBP and the pyramid CS-LBP/LTP are easy to implement and computationally efficient, which is desirable for real-time applications. Experiments on the INRIA pedestrian dataset show that the dense CS-LBP feature with linear support vector machines (SVMs) is comparable with the histograms of oriented gradients (HOG) feature with linear SVMs, and the pyramid CS-LBP/LTP features outperform both HOG features with linear SVMs and the state-of-the-art pyramid HOG (PHOG) feature with the histogram intersection kernel SVMs. We also demonstrate that the combination of our pyramid CS-LBP feature and the PHOG feature could significantly improve the detection performance, producing state-of-the-art accuracy on the INRIA pedestrian dataset.

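Research motivation and objectives

Address the need for pedestrian detection features that are discriminative, robust, and cheap to compute under challenging conditions such as complex backgrounds and pose variation.
Overcome the limitations of standard LBP, which captures too much texture detail and yields high-dimensional descriptors that are poorly suited to pedestrian detection.
Develop a feature that captures shape and salient texture information more effectively than HOG-like features, particularly in noisy or cluttered scenes.
Show that center-symmetric LBP and LTP features match or exceed state-of-the-art HOG and PHOG features while being faster to compute.
Investigate whether combining CS-LBP/LTP with PHOG further improves detection accuracy.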

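Proposed method

Proposes dense CS-LBP as a HOG-like feature extraction method that computes center-symmetric patterns over a dense grid of image patches to capture local gradient and texture structure.
Proposes pyramid CS-LBP/LTP as a multi-scale feature, analogous to PHOG, by computing features at several spatial scales and aggregating them into a pyramid descriptor.
Encodes local intensity differences around the center pixel with center-symmetric local binary patterns (CS-LBP), focusing on gradient-like structure and reducing sensitivity to noise.
Extends CS-LBP to center-symmetric local ternary patterns (CS-LTP), which quantize intensity differences into three levels to improve robustness in uniform regions.
Uses linear SVMs for dense CS-LBP and histogram intersection kernel SVMs (HIKSVM) for the pyramid features, evaluating both on the INRIA dataset.
Combines the pyramid uniform CS-LBP feature with the PHOG feature by averaging their kernel matrices, building a fused kernel to improve classification performance.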
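
The encoding and pooling steps listed above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the radius-1, 8-neighbor ring, the threshold values, the 8x8 cell size, the three pyramid levels, the L1 normalization, and all function names are assumptions chosen for illustration. Splitting the ternary code into an upper and a lower binary code follows common LTP practice and may differ from the paper's exact encoding.

```python
import numpy as np

# Offsets (dy, dx) of four neighbors on a radius-1, 8-neighbor ring; each one
# is compared with the neighbor diametrically opposite it (assumed layout).
_HALF_RING = [(0, 1), (1, 1), (1, 0), (1, -1)]
_WEIGHTS = (1 << np.arange(4)).reshape(4, 1, 1)  # bit weights 1, 2, 4, 8


def _pair_differences(image):
    """Return a (4, H-2, W-2) array of center-symmetric intensity differences."""
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    diffs = []
    for dy, dx in _HALF_RING:
        a = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]  # neighbor
        b = img[1 - dy:h - 1 - dy, 1 - dx:w - 1 - dx]  # opposite neighbor
        diffs.append(a - b)
    return np.stack(diffs)


def cs_lbp(image, threshold=0.0):
    """CS-LBP code map: one bit per symmetric pair, so codes lie in [0, 15]."""
    bits = (_pair_differences(image) > threshold).astype(np.int64)
    return (bits * _WEIGHTS).sum(axis=0)


def cs_ltp(image, threshold):
    """CS-LTP as two binary code maps (assumed split): upper and lower."""
    d = _pair_differences(image)
    upper = ((d >= threshold).astype(np.int64) * _WEIGHTS).sum(axis=0)
    lower = ((d <= -threshold).astype(np.int64) * _WEIGHTS).sum(axis=0)
    return upper, lower


def _normalized_hist(block, bins):
    """L1-normalized histogram of the codes in one block."""
    hist = np.bincount(block.ravel(), minlength=bins).astype(np.float64)
    return hist / max(hist.sum(), 1.0)


def dense_histogram(codes, cell=8, bins=16):
    """Concatenate code histograms over non-overlapping cell x cell blocks."""
    h, w = codes.shape
    feats = [
        _normalized_hist(codes[y:y + cell, x:x + cell], bins)
        for y in range(0, h - cell + 1, cell)
        for x in range(0, w - cell + 1, cell)
    ]
    return np.concatenate(feats)


def pyramid_histogram(codes, levels=3, bins=16):
    """Concatenate histograms over 1x1, 2x2, 4x4, ... grids (PHOG-style)."""
    feats = []
    for level in range(levels):
        n = 2 ** level
        for band in np.array_split(codes, n, axis=0):
            for block in np.array_split(band, n, axis=1):
                feats.append(_normalized_hist(block, bins))
    return np.concatenate(feats)
```

For CS-LTP, the same histogram functions could be applied to the upper and lower code maps separately and the two results concatenated.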

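Experimental results

Research questions

RQ1: Can center-symmetric local binary/ternary patterns (CS-LBP/LTP) serve as effective pedestrian detection features, and do they outperform conventional HOG and PHOG features in accuracy and efficiency?
RQ2: When paired with a linear SVM, how does dense CS-LBP compare with HOG on the INRIA pedestrian detection benchmark?
RQ3: When paired with HIKSVM, do pyramid CS-LBP/LTP features achieve higher detection accuracy than PHOG with the same classifier?
RQ4: Does combining pyramid CS-LBP with PHOG significantly improve detection performance, and if so, by how much?
RQ5: How does the computational complexity of CS-LBP/LTP compare with that of PHOG, and is it suitable for real-time applications?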

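Key findings

Dense CS-LBP features with a linear SVM achieve detection accuracy on the INRIA dataset comparable to HOG features with a linear SVM.
Pyramid CS-LTP features with HIKSVM outperform both PHOG features with HIKSVM and HOG features with a linear SVM on the INRIA dataset.
Pyramid uniform CS-LBP features perform slightly worse than PHOG but still outperform the HOG detector with a linear SVM.
Combining pyramid uniform CS-LBP with PHOG by averaging their kernel matrices raises the detection rate by about 6% at 0.25 false positives per image (FPPI) and by about 1.5% at 0.5–1 FPPI.
The PHOG + pyramid uniform CS-LBP detector achieves state-of-the-art performance on the INRIA dataset, with a higher detection rate at every FPPI level.
The proposed features are computationally efficient and easy to implement, which makes them suitable for real-time pedestrian detection.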
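
The kernel-averaging combination reported in the findings can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the function names are hypothetical, the two kernels are weighted equally, and the inputs are assumed to be non-negative histogram features with one sample per row.

```python
import numpy as np


def histogram_intersection_kernel(a, b):
    """K[i, j] = sum_k min(a[i, k], b[j, k]) for non-negative histogram rows."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Broadcast to (n_a, n_b, n_bins), take the bin-wise minimum, sum over bins.
    return np.minimum(a[:, None, :], b[None, :, :]).sum(axis=2)


def averaged_kernel(cslbp_rows, cslbp_cols, phog_rows, phog_cols):
    """Equal-weight average of the kernels of two feature types (assumed)."""
    k_cslbp = histogram_intersection_kernel(cslbp_rows, cslbp_cols)
    k_phog = histogram_intersection_kernel(phog_rows, phog_cols)
    return 0.5 * (k_cslbp + k_phog)
```

A matrix built this way could be passed to any SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`, using a train-by-train matrix for fitting and a test-by-train matrix for prediction.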


This review was generated by AI and checked by human editors.