QUICK REVIEW

[Paper Review] GaitSet: Regarding Gait as a Set for Cross-View Gait Recognition

Hanqing Chao, Yiwei He|arXiv (Cornell University)|Nov 15, 2018

Gait Recognition and Analysis|References: 26|Citations: 59

ONE-LINE SUMMARY

GaitSet treats a gait as a set of silhouettes and, using permutation-invariant Set Pooling and Horizontal Pyramid Mapping, achieves state-of-the-art cross-view gait recognition.

ABSTRACT

As a unique biometric feature that can be recognized at a distance, gait has broad applications in crime prevention, forensic identification and social security. To portray a gait, existing gait recognition methods utilize either a gait template, where temporal information is hard to preserve, or a gait sequence, which must keep unnecessary sequential constraints and thus loses the flexibility of gait recognition. In this paper we present a novel perspective, where a gait is regarded as a set consisting of independent frames. We propose a new network named GaitSet to learn identity information from the set. Based on the set perspective, our method is immune to permutation of frames, and can naturally integrate frames from different videos which have been filmed under different scenarios, such as diverse viewing angles, different clothes/carrying conditions. Experiments show that under normal walking conditions, our single-model method achieves an average rank-1 accuracy of 95.0% on the CASIA-B gait dataset and an 87.1% accuracy on the OU-MVLP gait dataset. These results represent new state-of-the-art recognition accuracy. On various complex scenarios, our model exhibits a significant level of robustness. It achieves accuracies of 87.2% and 70.4% on CASIA-B under bag-carrying and coat-wearing walking conditions, respectively. These outperform the existing best methods by a large margin. The method presented can also achieve a satisfactory accuracy with a small number of frames in a test sample, e.g., 82.5% on CASIA-B with only 7 frames. The source code has been released at https://github.com/AbnerHqC/GaitSet.

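MOTIVATION AND OBJECTIVES

Achieve gait recognition that is robust to changes in view and walking condition without relying on sequential constraints or a single template.
Propose a permutation-invariant, set-based framework that learns from a set of silhouettes.
Develop a mechanism that preserves temporal and spatial information by aggregating high-level features.
Demonstrate robustness and scalability on large-scale datasets and diverse walking conditions.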

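PROPOSED METHOD

Represent a gait as a set of silhouettes rather than as a sequence or a single template.
Use a CNN to extract frame-level features from each silhouette independently.
Apply Set Pooling to aggregate the frame-level features into a set-level representation in a permutation-invariant way.
Combine attention-enhanced pooling with several statistical aggregations (max, mean, median) to form a robust set feature.
Use Horizontal Pyramid Mapping (HPM), which pools horizontal strips at multiple scales, to map the set feature into a discriminative space.
Optionally, integrate features from multiple convolutional layers through the Multilayer Global Pipeline (MGP) to exploit multi-layer information.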
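
As a rough illustration of Set Pooling and Horizontal Pyramid Mapping, the sketch below pools a toy set of feature maps element-wise and then pools horizontal strips at several scales. It is not the authors' implementation. The function names, the toy sizes (5 frames, 8x4 maps, one channel), the scales (1, 2, 4), and the strip pooling rule (global max plus global mean) are all assumptions made for this example, and the CNN, the attention branch, and any learned per-strip projection are omitted. It uses only the Python standard library.

```python
import random
from statistics import mean, median

def set_pool(frames, reduce=max):
    """Collapse a set of H x W feature maps into one H x W map, element-wise.

    `reduce` can be max, mean, or median. Each of these ignores the order of
    its inputs, so the result does not depend on the order of `frames`.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [reduce([frame[r][c] for frame in frames]) for c in range(width)]
        for r in range(height)
    ]

def horizontal_pyramid(feature_map, scales=(1, 2, 4)):
    """Split the map into 1, 2, and 4 horizontal strips and pool each strip."""
    height = len(feature_map)
    pooled = []
    for n_strips in scales:
        rows_per_strip = height // n_strips
        for i in range(n_strips):
            strip = feature_map[i * rows_per_strip:(i + 1) * rows_per_strip]
            values = [v for row in strip for v in row]
            # Assumed strip pooling for this sketch: global max plus global mean.
            pooled.append(max(values) + mean(values))
    return pooled

# Toy stand-in for CNN output: 5 frames, each an 8 x 4 single-channel map.
rng = random.Random(0)
frames = [[[rng.random() for _ in range(4)] for _ in range(8)] for _ in range(5)]

# The same frames in a different order, to check permutation invariance.
shuffled = frames[:]
rng.shuffle(shuffled)

set_feature = set_pool(frames, reduce=max)
strip_features = horizontal_pyramid(set_feature)  # 1 + 2 + 4 = 7 values
```

Because the aggregation is element-wise over the set, `set_pool(frames)` and `set_pool(shuffled)` return identical maps, which is the property that lets frames from different videos be merged into one input.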

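EXPERIMENTAL RESULTS

Research Questions

RQ1 Can recognition work effectively from an unordered set of silhouettes, without a template or a sequence?
RQ2 How does permutation-invariant Set Pooling affect recognition accuracy in cross-view and cross-condition scenarios?
RQ3 How much do multi-scale Horizontal Pyramid Mapping and multi-layer feature integration contribute to discriminability?
RQ4 How does the method scale to large datasets and varied viewing conditions?
RQ5 Can the model maintain high accuracy with a limited number of silhouettes, or when silhouettes from different views and conditions are combined?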

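Key Findings

GaitSet achieves an average rank-1 accuracy of 95.0% on CASIA-B under normal walking and 87.1% on OU-MVLP, outperforming prior methods.
On CASIA-B it reaches 87.2% and 70.4% under the bag-carrying and coat-wearing conditions, respectively, outperforming existing methods.
With only 7 frames, GaitSet reaches 82.5% accuracy on CASIA-B, showing robustness to short inputs.
Ablations show that set-based input substantially outperforms GEI templates, with improvements exceeding 10% on the NM subset and 25% on the CL subset.
Multi-view input (2 views) generally improves accuracy, showing the model's ability to integrate information across views.
The method scales efficiently; for example, the 133,780 sequences of OU-MVLP were evaluated in about 7 minutes on 8 GPUs.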
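
For readers unfamiliar with the metric, rank-1 accuracy counts a probe as correct when its nearest gallery sample carries the same identity. The sketch below computes it on made-up 2-D embeddings. The function name, the data, and the choice of Euclidean distance are assumptions for illustration only, not the evaluation protocol or results of the paper.

```python
import math

def rank1_accuracy(probe, gallery):
    """probe, gallery: lists of (identity, embedding) pairs."""
    hits = 0
    for probe_id, probe_vec in probe:
        # Nearest gallery sample under Euclidean distance (assumed metric).
        nearest_id, _ = min(gallery, key=lambda item: math.dist(probe_vec, item[1]))
        hits += nearest_id == probe_id
    return hits / len(probe)

# Made-up embeddings: three gallery identities and four probe samples.
gallery = [("A", [0.0, 0.0]), ("B", [1.0, 1.0]), ("C", [3.0, 0.0])]
probe = [
    ("A", [0.1, -0.1]),
    ("B", [0.9, 1.2]),
    ("C", [1.2, 0.8]),  # lies closest to B, so it counts as a miss
    ("C", [2.8, 0.1]),
]

accuracy = rank1_accuracy(probe, gallery)  # 3 of 4 probes match -> 0.75
```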

This review was generated by AI and checked by a human editor.