QUICK REVIEW

[Paper Review] Learning regression and verification networks for long-term visual tracking

Yunhua Zhang, Dong Wang|arXiv (Cornell University)|Sep 12, 2018

Video Surveillance and Tracking Methods | References: 5 | Citations: 74

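One-line summary

Introduces a long-term tracking framework that combines an offline-trained regression network with an online-updated verification network to perform local search, absent-present decisions, and image-wide re-detection. Achieves state-of-the-art results on the VOT2018 LTB35 and OxUvA long-term benchmarks.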

ABSTRACT

Compared with short-term tracking, the long-term tracking task requires determining the tracked object is present or absent, and then estimating the accurate bounding box if present or conducting image-wide re-detection if absent. Until now, few attempts have been done although this task is much closer to designing practical tracking systems. In this work, we propose a novel long-term tracking framework based on deep regression and verification networks. The offline-trained regression model is designed using the object-aware feature fusion and region proposal networks to generate a series of candidates and estimate their similarity scores effectively. The verification network evaluates these candidates to output the optimal one as the tracked object with its classification score, which is online updated to adapt to the appearance variations based on newly reliable observations. The similarity and classification scores are combined to obtain a final confidence value, based on which our tracker can determine the absence of the target accurately and conduct image-wide re-detection to capture the target successfully when it reappears. Extensive experiments show that our tracker achieves the best performance on the VOT2018 long-term challenge and state-of-the-art results on the OxUvA long-term dataset.

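Motivation and objectives

Address the gap in long-term tracking, where the target repeatedly appears, disappears, and reappears.
Develop an offline-trained regression network that generates candidate bounding boxes with similarity scores.
Incorporate an online-updated verification network that identifies the true target among the candidates.
Enable confidence-driven switching between local search and image-wide re-detection.
Demonstrate strong performance on the VOT2018 LTB35 and OxUvA long-term datasets.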

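Proposed method

Uses an offline-trained regression network (R) that generates and scores candidate bounding boxes using object-aware feature fusion and a Region Proposal Network (RPN).
Fuses search-region features with template features to form the RPN input for bounding-box regression and similarity scoring.
Incorporates an online-updated verification network (V) that classifies candidates as foreground or background and refines the final tracking result.
Combines the regression and verification scores into a final per-frame confidence, which drives the presence/absence decision and triggers re-detection when needed.
Switches dynamically between local search and image-wide re-detection based on the confidence score.
R is trained offline with an SSD-style combined loss (a cross-entropy matching loss plus a Smooth L1 localization loss); V is trained online with MDNet-style fine-tuning.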
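
The per-frame decision described in this list can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `Candidate`, `combine_confidence`, and `track_frame`, the averaging rule, and the 0.5 threshold are all assumptions made for the example. The source states only that the similarity and classification scores are combined into a confidence value that drives the presence/absence decision.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Candidate:
    box: Box
    similarity: float  # similarity score from the regression network R, in [0, 1]


def combine_confidence(similarity: float, verification: float) -> float:
    """Hypothetical combination rule: plain average of the two scores."""
    return 0.5 * (similarity + verification)


def track_frame(
    candidates: List[Candidate],
    verify: Callable[[Box], float],
    present_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> Tuple[Optional[Box], float, str]:
    """Return (box or None, confidence, search mode for the next frame)."""
    if not candidates:
        # Nothing proposed: treat the target as absent and search the whole image.
        return None, 0.0, "global"

    # The verification network V picks the candidate it rates highest.
    verified = [(verify(c.box), c) for c in candidates]
    v_score, best = max(verified, key=lambda pair: pair[0])

    # Combine V's classification score with R's similarity score.
    confidence = combine_confidence(best.similarity, v_score)

    if confidence >= present_threshold:
        return best.box, confidence, "local"   # target present: keep local search
    return None, confidence, "global"          # target absent: image-wide re-detection


if __name__ == "__main__":
    demo = [Candidate((10, 10, 50, 50), 0.9), Candidate((200, 40, 50, 50), 0.4)]
    # Stand-in for V: prefers boxes on the left side of the image.
    print(track_frame(demo, lambda box: 0.8 if box[0] < 100 else 0.1))
```

In a full tracker, the returned mode would decide whether R receives a local search window around the last box or the whole image in the next frame, and V would be fine-tuned only on frames where the confidence is high.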

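Experimental results

Research questions

RQ1: How can a regression network and a verification network be integrated to handle the presence/absence decision in long-term tracking?
RQ2: Can the offline regression model propose candidates robustly while the online verification model adapts to appearance changes?
RQ3: Does confidence-based switching between local search and image-wide re-detection improve long-term tracking performance?
RQ4: How does object-aware feature fusion affect candidate proposal and regression accuracy?
RQ5: How does the proposed method perform on standard long-term benchmarks (VOT2018 LTB35, OxUvA)?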

Key findings

Achieves the highest F-score, precision, and recall among the evaluated trackers on VOT-2018 LTB35 (F-score 0.610, Pr 0.634, Re 0.588).
Reports a 100% re-detection success rate on VOT-2018 LTB35 across all sequences, including the one-frame case listed in the provided table.
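Achieves a MaxGM score of 0.544, with TPR 0.609 and TNR 0.485, on the OxUvA long-term dataset (open challenge).
Ablations show that adding verification substantially improves long-term performance over regression alone, and that both concatenation and multiplication in the feature fusion are beneficial.
A Siamese arrangement of the feature extractors tends to degrade performance compared with separate online/offline branches, suggesting that the inputs need to be processed separately.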
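
As a quick consistency check on the reported numbers, the snippet below recomputes the two summary metrics from their components. It rests on two assumptions about the benchmark definitions, neither of which is spelled out in this review: that the VOT long-term F-score is the harmonic mean of the reported precision and recall, and that OxUvA's MaxGM is the maximum over p in [0, 1] of sqrt(((1 - p) * TPR) * ((1 - p) * TNR + p)). The function names are illustrative.

```python
import math


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def max_gm(tpr: float, tnr: float, steps: int = 10001) -> float:
    """Grid search over p in [0, 1] for the assumed MaxGM definition."""
    best = 0.0
    for i in range(steps):
        p = i / (steps - 1)
        gm = math.sqrt(((1 - p) * tpr) * ((1 - p) * tnr + p))
        best = max(best, gm)
    return best


if __name__ == "__main__":
    print(f"F-score: {f_score(0.634, 0.588):.3f}")
    print(f"MaxGM:   {max_gm(0.609, 0.485):.3f}")
```

Under these assumed definitions, both values round to the figures listed above (0.610 and 0.544), so the reported precision/recall and TPR/TNR pairs are consistent with the headline scores.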

This review was generated by AI and checked by a human editor.