QUICK REVIEW

[Paper Review] Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking

Feng Li, Tian Cheng|arXiv (Cornell University)|Mar 23, 2018

Video Surveillance and Tracking Methods|References: 2|Citations: 70

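One-Line Summary

STRCF adds temporal regularization to SRDCF and solves the resulting problem with ADMM. On several benchmarks it improves on SRDCF in both accuracy and speed, reaching real-time tracking.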

ABSTRACT

Discriminative Correlation Filters (DCF) are efficient in visual tracking but suffer from unwanted boundary effects. Spatially Regularized DCF (SRDCF) has been suggested to resolve this issue by enforcing a spatial penalty on DCF coefficients, which improves tracking performance at the price of increased complexity. To tackle online updating, SRDCF formulates its model on multiple training images, which makes it even harder to improve efficiency. In this work, motivated by the online Passive-Aggressive (PA) algorithm, we introduce temporal regularization to SRDCF with a single sample, resulting in our spatial-temporal regularized correlation filters (STRCF). The STRCF formulation can not only serve as a reasonable approximation to SRDCF with multiple training samples, but also provide a more robust appearance model than SRDCF in the case of large appearance variations. Besides, it can be efficiently solved via the alternating direction method of multipliers (ADMM). By incorporating both temporal and spatial regularization, our STRCF can handle boundary effects without much loss in efficiency and achieve superior performance over SRDCF in terms of accuracy and speed. Experiments are conducted on three benchmark datasets: OTB-2015, Temple-Color, and VOT-2016. Compared with SRDCF, STRCF with hand-crafted features provides a 5 times speedup and achieves a gain of 5.4% and 3.6% AUC score on OTB-2015 and Temple-Color, respectively. Moreover, STRCF combined with CNN features also performs favorably against state-of-the-art CNN-based trackers and achieves an AUC score of 68.3% on OTB-2015.

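Motivation and Objectives

Address the boundary effects of discriminative correlation filters (DCF) in visual tracking.
Propose spatial-temporal regularized correlation filters (STRCF), which use temporal regularization to update the model from a single frame.
Develop an efficient ADMM-based solver whose subproblems have closed-form solutions.
Show that STRCF provides a robust appearance model under large appearance variations while maintaining real-time speed.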

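Proposed Method

Adds a temporal regularization term mu/2 * ||f - f_{t-1}||^2 to SRDCF, yielding STRCF (Eq. (2)).
Solves the convex STRCF objective with ADMM, introducing an auxiliary variable g and updating the variables alternately.
Solves the f-subproblem efficiently, pixel by pixel in the frequency domain, using Parseval's theorem and the Sherman–Morrison formula (Eqs. (9)–(12)).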
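The alternating updates described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes circular convolution, a fixed penalty parameter gamma, a scaled multiplier h, and a spatial weight map w supplied by the caller. The function names, the symbols gamma, h, and w, and all default values are hypothetical choices made for this sketch and do not come from the review.

```python
import numpy as np


def solve_f_subproblem(x_hat, y_hat, f_prev_hat, g_hat, h_hat, mu, gamma):
    """Closed-form f-update, solved independently at every frequency.

    Shapes: x_hat, f_prev_hat, g_hat, h_hat are (D, H, W); y_hat is (H, W).
    At each frequency the system is (conj(x) x^T + (mu + gamma) I) f = q,
    a rank-one update of a scaled identity, so Sherman-Morrison applies.
    """
    a = mu + gamma
    q = np.conj(x_hat) * y_hat + mu * f_prev_hat + gamma * (g_hat - h_hat)
    x_dot_q = np.sum(x_hat * q, axis=0)            # x^T q, shape (H, W)
    x_energy = np.sum(np.abs(x_hat) ** 2, axis=0)  # x^T conj(x), shape (H, W)
    return (q - np.conj(x_hat) * (x_dot_q / (a + x_energy))) / a


def strcf_admm(x, y, w, f_prev, mu=1.0, gamma=2.0, n_iter=50):
    """ADMM for one frame. x, f_prev: (D, H, W); y, w: (H, W). Returns (f, g)."""
    x_hat = np.fft.fft2(x)            # fft2 transforms the last two axes
    y_hat = np.fft.fft2(y)
    f_prev_hat = np.fft.fft2(f_prev)
    g = f_prev.copy()                 # auxiliary variable, constrained to equal f
    h = np.zeros_like(f_prev)         # scaled Lagrange multiplier (assumed form)
    for _ in range(n_iter):
        f_hat = solve_f_subproblem(x_hat, y_hat, f_prev_hat,
                                   np.fft.fft2(g), np.fft.fft2(h), mu, gamma)
        f = np.real(np.fft.ifft2(f_hat))
        g = gamma * (f + h) / (w ** 2 + gamma)  # element-wise, spatial domain
        h = h + f - g                           # multiplier update
    return f, g


def strcf_objective(x, y, w, f, f_prev, mu):
    """Data term + spatial regularizer + temporal term mu/2 * ||f - f_prev||^2."""
    response = np.real(np.fft.ifft2(
        np.sum(np.fft.fft2(x) * np.fft.fft2(f), axis=0)))
    return (0.5 * np.sum((response - y) ** 2)
            + 0.5 * np.sum((w * f) ** 2)
            + 0.5 * mu * np.sum((f - f_prev) ** 2))
```

Calling strcf_admm(x, y, w, f_prev) with a D-channel feature map x, a desired response y, and the previous filter f_prev returns the new filter f together with the auxiliary variable g; the two should agree closely once the iterations have converged. Keeping gamma fixed is a simplification made here so that the scaled multiplier stays consistent across iterations.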

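Experimental Results

Research Questions

RQ1 Can STRCF approximate an SRDCF model trained on multiple training images while being more efficient?
RQ2 Does incorporating temporal regularization improve robustness to appearance changes and occlusion compared with SRDCF?
RQ3 How does the temporal regularization parameter mu affect tracking performance?
RQ4 Can STRCF achieve real-time performance with both hand-crafted and deep features while maintaining competitive accuracy?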

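Key Findings

STRCF improves mean overlap precision (OP) over SRDCF by about 5.7% on OTB-2015 and Temple-Color.
With hand-crafted features, STRCF runs in real time (about 30 FPS): STRCF (HOG) reaches 31.5 FPS and STRCF (HOGCN) 24.3 FPS.
Temporal regularization gives STRCF a more robust model update, with gains over the SRDCF-family trackers of up to 14.5% on the out-of-view (OV) attribute and up to 5.7% on the occlusion (OCC) attribute.
DeepSTRCF (STRCF with CNN features) achieves a mean OP of 84.2% on OTB-2015, 7.4% higher than DeepSRDCF.
On VOT-2016, STRCF achieves an expected average overlap (EAO) of 0.279 and DeepSTRCF 0.313; DeepSTRCF ranks high in EAO among the CNN-enhanced trackers.
On Temple-Color, STRCF is competitive with ECO-HC, and DeepSTRCF achieves the best performance among the results reported on that dataset.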

This review was generated by AI and checked by a human editor.