QUICK REVIEW

[Paper Review] GradNet: Gradient-Guided Network for Visual Object Tracking

Peixia Li, Boyu Chen | arXiv (Cornell University) | Sep 15, 2019

Video Surveillance and Tracking Methods | References: 42 | Cited by: 49

One-Sentence Summary

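GradNet introduces gradient-based template updating for Siamese-based visual tracking: it uses gradient information online to update the template and pairs this with a template generalization training method, improving accuracy while maintaining real-time speed.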

ABSTRACT

The fully-convolutional siamese network based on template matching has shown great potential in visual tracking. During testing, the template is fixed with the initial target feature and the performance totally relies on the general matching ability of the siamese network. However, this manner cannot capture the temporal variations of targets or background clutter. In this work, we propose a novel gradient-guided network to exploit the discriminative information in gradients and update the template in the siamese network through feed-forward and backward operations. Our algorithm performs feed-forward and backward operations to exploit the discriminative information in gradients and capture the core attention of the target. To be specific, the algorithm can utilize the information from the gradient to update the template in the current frame. In addition, a template generalization training method is proposed to better use gradient information and avoid overfitting. To our knowledge, this work is the first attempt to exploit the information in the gradient for template update in siamese-based trackers. Extensive experiments on recent benchmarks demonstrate that our method achieves better performance than other state-of-the-art trackers.

Research Motivation and Objectives

Motivate online adaptation for Siamese-based trackers to handle appearance changes and background clutter.
Leverage gradient information to update the tracking template in the current frame.
Develop a lightweight, end-to-end network (GradNet) that adapts templates with one backward pass.
Prevent online overfitting and improve generalization through a template generalization training method.

Proposed Method

Two-branch architecture: a search-region feature extractor and an update branch that generates a new template from gradients.
Initial embedding of the target feature to form an initial template via a sub-network U1.
Compute the gradient of the loss with respect to the target feature and process it through a second sub-network U2 to produce a gradient-based update.
Update the template by combining the gradient-driven update with the initial target feature, then re-derive an optimal template for final scoring.
Train the update branch with second-order gradients and a template generalization strategy that uses cross-video search regions to avoid overfitting.
During online tracking, update the template every few frames and fuse it with the initial template to balance adaptation and stability.
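
The steps listed above can be illustrated with a toy one-dimensional sketch. This is not the authors' implementation: GradNet's sub-networks U1 and U2 are learned convolutional layers, whereas here U1 is assumed to be the identity and U2 is replaced by a fixed negative step size, so the update reduces to a single gradient step. The function names, the `step` and `weight` parameters, the logistic loss, and the linear fusion rule are all assumptions made for illustration.

```python
import math

def cross_correlate(template, search):
    """Slide the template over the search features and return the response map."""
    k = len(template)
    return [sum(t * s for t, s in zip(template, search[i:i + k]))
            for i in range(len(search) - k + 1)]

def logistic_loss(scores, labels):
    """Mean logistic loss of a response map; labels are +1 (target) or -1."""
    return sum(math.log1p(math.exp(-y * v))
               for v, y in zip(scores, labels)) / len(scores)

def template_gradient(template, search, scores, labels):
    """Backward pass: gradient of the mean logistic loss w.r.t. the template."""
    k, n = len(template), len(scores)
    grad = [0.0] * k
    for i, (v, y) in enumerate(zip(scores, labels)):
        dv = -y / (1.0 + math.exp(y * v)) / n  # d(loss)/d(score_i)
        for j in range(k):
            grad[j] += dv * search[i + j]      # chain rule through the correlation
    return grad

def gradient_guided_update(template, search, labels, step=1.0):
    """Forward pass, backward pass, then a second forward pass with the new template."""
    scores = cross_correlate(template, search)                   # first forward pass
    loss_before = logistic_loss(scores, labels)
    grad = template_gradient(template, search, scores, labels)   # single backward pass
    # Assumption: a fixed step stands in for the learned sub-network U2.
    updated = [t - step * g for t, g in zip(template, grad)]
    loss_after = logistic_loss(cross_correlate(updated, search), labels)  # second forward
    return updated, loss_before, loss_after

def fuse(initial, updated, weight=0.5):
    """Hypothetical linear blend of the initial and updated templates."""
    return [(1 - weight) * a + weight * b for a, b in zip(initial, updated)]

# Toy data: the target sits at window index 1 of the search features.
search = [0.1, 0.9, 1.0, 0.8, 0.1, 0.2]
labels = [-1, 1, -1, -1]
initial = [0.5, 0.5, 0.5]

updated, loss_before, loss_after = gradient_guided_update(initial, search, labels)
tracking_template = fuse(initial, updated, weight=0.5)
print(loss_before, loss_after, tracking_template)
```

In this sketch the loss on the current frame drops after the update, and blending with the initial template keeps the result from drifting too far from the first-frame appearance. The cross-video template generalization training and the second-order gradients used to train the update branch are not modeled here.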

Experimental Results

Research Questions

RQ1: Can gradient information be exploited to update the Siamese tracker template in real time?
RQ2: Does a template generalization training strategy reduce overfitting and improve generalization across diverse video domains?
RQ3: How does GradNet compare to online-update trackers and pure offline Siamese trackers in accuracy and speed?

Key Findings

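GradNet tracks in real time on a standard GPU/CPU setup, reaching 80 frames per second.
Gradient-based updating improves precision and success-rate metrics over the SiameseFC baseline.
Template generalization training reduces overfitting and pushes the update branch to rely more on discriminative gradients than on appearance alone.
A single backward pass plus two forward passes is enough to update the template effectively, balancing speed and accuracy.
Ablation studies show that each component (gradient use, template generalization, online updating) contributes to the performance gain.
On four benchmarks, GradNet performs competitively with or better than state-of-the-art real-time trackers.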


This review was generated by AI and reviewed by human editors.