QUICK REVIEW

[Paper Review] 'Skimming-Perusal' Tracking: A Framework for Real-Time and Robust Long-term Tracking

Bin Yan, Haojie Zhao|arXiv (Cornell University)|Sep 4, 2019

Video Surveillance and Tracking Methods | References: 40 | Cited by: 34

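One-Sentence Summary

Introduces a real-time long-term tracking framework built from two offline-trained components: a perusal module for local tracking (a SiameseRPN regressor plus an offline-trained verifier) and a skimming module for fast image-wide re-detection. It achieves state-of-the-art results on VOT2018LT and OxUvA while running in real time.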

ABSTRACT

Compared with traditional short-term tracking, long-term tracking poses more challenges and is much closer to realistic applications. However, few works have been done and their performance have also been limited. In this work, we present a novel robust and real-time long-term tracking framework based on the proposed skimming and perusal modules. The perusal module consists of an effective bounding box regressor to generate a series of candidate proposals and a robust target verifier to infer the optimal candidate with its confidence score. Based on this score, our tracker determines whether the tracked object being present or absent, and then chooses the tracking strategies of local search or global search respectively in the next frame. To speed up the image-wide global search, a novel skimming module is designed to efficiently choose the most possible regions from a large number of sliding windows. Numerous experimental results on the VOT-2018 long-term and OxUvA long-term benchmarks demonstrate that the proposed method achieves the best performance and runs in real-time. The source codes are available at https://github.com/iiau-tracker/SPLT.

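Motivation and Objectives

Bridge the gap between short-term tracking and real-world long-term tracking through robust detection of target presence or absence and fast re-detection.
Develop a two-module framework (perusal for local tracking, skimming for fast global search) built from offline-trained components.
Achieve real-time performance on long-term benchmarks without sacrificing accuracy.
Provide a practical deep-network baseline for long-term tracking.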

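Proposed Method

The perusal module pairs an offline-trained SiameseRPN regressor, which generates candidate bounding boxes within a local search region, with an offline-trained verifier that scores the candidates and selects the best one.
The verifier uses a deep feature embedding trained with a triplet loss and computes a confidence score based on cosine similarity.
The skimming module learns a binary classifier that quickly predicts whether the target is present in each of a large number of sliding-window regions and keeps the top K candidates for further processing.
The tracker switches dynamically between local and global search based on the verifier's confidence, using a threshold theta (0.65 in the experiments).
A cascade training strategy improves the verifier with hard examples mined from misclassifications.
The implementation uses MobileNetV1 for feature extraction in the regressor and the skimming module and ResNet50 as the verifier backbone, with 127x127 template images and 300x300 search images.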
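
The scoring and switching steps described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the squared-Euclidean triplet loss with margin 0.2, and the `>=` comparison against theta are all assumptions made for the example, and the embeddings are toy lists rather than network outputs. Only the threshold value 0.65 comes from the summary.

```python
import math

THETA = 0.65  # presence threshold reported in the summary


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (plain lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss on squared Euclidean distances.

    The distance and margin are assumptions; the summary only says that
    the verifier embedding is trained with a triplet loss.
    """
    d_pos = sum((x - y) ** 2 for x, y in zip(anchor, positive))
    d_neg = sum((x - y) ** 2 for x, y in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)


def top_k(scores, k):
    """Indices of the k highest skimming scores, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def verify(template_emb, candidate_embs, theta=THETA):
    """Score every candidate against the template and pick the best one.

    Returns (best_index, confidence, present). Treating a confidence
    equal to theta as "present" is an assumption.
    """
    confs = [cosine_similarity(template_emb, c) for c in candidate_embs]
    best = max(range(len(confs)), key=lambda i: confs[i])
    return best, confs[best], confs[best] >= theta


def next_search_mode(present):
    """Local search while the target is present, global search otherwise."""
    return "local" if present else "global"
```

For example, `verify([1.0, 0.0], [[0.0, 1.0], [1.0, 2.0]])` selects the second candidate with a confidence of about 0.447, which is below theta, so `next_search_mode` would return `"global"` and the next frame would go through the skimming module's top-K windows.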

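Experimental Results

Research Questions

RQ1: How can robust long-term tracking with real-time performance be achieved using offline-trained components?
RQ2: Can the two-module Skimming-Perusal framework improve the speed and accuracy of re-detection on long-term benchmarks?
RQ3: Which training strategies (including cascade training) effectively improve the robustness of the verifier?
RQ4: How does dynamic switching between local and global search affect accuracy and speed in long-term tracking?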

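Key Findings

The SPLT tracker achieves leading F-score and recall on VOT2018LT and runs at 25.7 fps in the experiments.
On VOT2018LT, it outperforms MBMD and DaSiam_LT in both accuracy metrics and re-detection ability (its re-detection success rate reaches 100%).
On the OxUvA long-term benchmark, SPLT obtains the highest MaxGM and improves markedly on MBMD and SiamFC+R in MaxGM, TPR, and TNR.
The dedicated skimming module substantially speeds up image-wide re-detection and improves robustness by filtering out distractors.
Ablation experiments show that adding the verifier and skimming modules yields notable gains in F-score and speed over the local-search baseline.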
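
For readers unfamiliar with the metrics named above, the sketch below shows how they are commonly computed. It assumes the F-score is the harmonic mean of tracking precision and recall, and that MaxGM follows the OxUvA benchmark definition as generally stated: the maximum over p in [0, 1] of sqrt(((1 - p) * TPR) * ((1 - p) * TNR + p)). The function names and the grid search over p are choices made for this example, and the summary itself does not give these formulas.

```python
import math


def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def max_gm(tpr, tnr, steps=1000):
    """Maximum geometric mean over p in [0, 1], found by grid search.

    Assumed definition: with probability p a "present" prediction is
    flipped to "absent", which scales TPR by (1 - p) and raises TNR to
    (1 - p) * TNR + p.
    """
    best = 0.0
    for i in range(steps + 1):
        p = i / steps
        gm = math.sqrt(((1 - p) * tpr) * ((1 - p) * tnr + p))
        best = max(best, gm)
    return best
```

Under this definition a tracker that never reports absence (TNR = 0) still gets a nonzero MaxGM: with an illustrative TPR of 0.8, the maximum is reached at p = 0.5 and equals sqrt(0.2), about 0.447. These inputs are made up for the example and are not results from the paper.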

This review was generated by AI and reviewed by a human editor.