QUICK REVIEW

[Paper Review] GaitSet: Regarding Gait as a Set for Cross-View Gait Recognition

Hanqing Chao, Yiwei He | arXiv (Cornell University) | 2018-11-15

Gait Recognition and Analysis | References: 26 | Citations: 59

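One-Line Summary

GaitSet treats gait as an unordered set of silhouettes and uses permutation-invariant Set Pooling and Horizontal Pyramid Mapping to achieve state-of-the-art cross-view gait recognition while remaining robust to changes in view, clothing, and carrying conditions.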

ABSTRACT

As a unique biometric feature that can be recognized at a distance, gait has broad applications in crime prevention, forensic identification and social security. To portray a gait, existing gait recognition methods utilize either a gait template, where temporal information is hard to preserve, or a gait sequence, which must keep unnecessary sequential constraints and thus loses the flexibility of gait recognition. In this paper we present a novel perspective, where a gait is regarded as a set consisting of independent frames. We propose a new network named GaitSet to learn identity information from the set. Based on the set perspective, our method is immune to permutation of frames, and can naturally integrate frames from different videos which have been filmed under different scenarios, such as diverse viewing angles, different clothes/carrying conditions. Experiments show that under normal walking conditions, our single-model method achieves an average rank-1 accuracy of 95.0% on the CASIA-B gait dataset and an 87.1% accuracy on the OU-MVLP gait dataset. These results represent new state-of-the-art recognition accuracy. On various complex scenarios, our model exhibits a significant level of robustness. It achieves accuracies of 87.2% and 70.4% on CASIA-B under bag-carrying and coat-wearing walking conditions, respectively. These outperform the existing best methods by a large margin. The method presented can also achieve a satisfactory accuracy with a small number of frames in a test sample, e.g., 82.5% on CASIA-B with only 7 frames. The source code has been released at https://github.com/AbnerHqC/GaitSet.

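Motivation and Objectives

Enable gait recognition that is robust to changes in view and walking condition without relying on sequential constraints or a single template.
Propose a permutation-invariant, set-based framework for learning from sets of silhouettes.
Develop a mechanism that preserves temporal and spatial information through high-level feature aggregation.
Demonstrate robustness and scalability on large-scale datasets and across varied walking conditions.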

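Proposed Method

Represent a gait as a set of silhouettes rather than as a sequence or a single template.
Use a CNN to extract frame-level features from each silhouette independently.
Apply Set Pooling to aggregate the frame-level features into a set-level representation in a permutation-invariant way.
Form robust set features by combining attention-enhanced pooling with multi-statistic aggregation (max/mean/median).
Use Horizontal Pyramid Mapping (HPM), which pools over horizontal strips at multiple scales, to map the set feature into a discriminative space.
Use the Multilayer Global Pipeline (MGP) to selectively fuse features from several convolutional layers and capture multi-level information.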
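
The Set Pooling and HPM steps above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names (`set_pool`, `to_rows`, `horizontal_pyramid_pool`), the toy feature values, the scales `(1, 2, 4)`, and the choice to reduce each strip with max plus mean are all assumptions made for illustration. The CNN, the attention-enhanced pooling, any learned layers, and the MGP fusion are omitted.

```python
from statistics import mean, median

# Reducers applied independently to each feature dimension across frames.
REDUCERS = {"max": max, "mean": mean, "median": median}


def set_pool(frame_features, stat="max"):
    """Aggregate per-frame feature vectors into one set-level vector."""
    reducer = REDUCERS[stat]
    # zip(*frames) yields, for each dimension, its values over all frames,
    # so the result does not depend on the order of the frames.
    return [reducer(values) for values in zip(*frame_features)]


def to_rows(vector, width):
    """Reshape a flat vector into the rows of a (height x width) feature map."""
    return [vector[i:i + width] for i in range(0, len(vector), width)]


def horizontal_pyramid_pool(feature_map, scales=(1, 2, 4)):
    """Pool horizontal strips of a feature map at several scales.

    At scale s the map is cut into s equal-height strips, and each strip is
    reduced to max + mean (an assumption of this sketch). The map height
    must be divisible by every scale.
    """
    height = len(feature_map)
    pooled = []
    for s in scales:
        strip_height = height // s
        for i in range(s):
            rows = feature_map[i * strip_height:(i + 1) * strip_height]
            values = [v for row in rows for v in row]
            pooled.append(max(values) + mean(values))
    return pooled


# Three toy "frames", each a flattened 4x2 feature map (hypothetical values).
frames = [
    [0.1, 0.4, 0.3, 0.2, 0.9, 0.5, 0.0, 0.7],
    [0.6, 0.2, 0.8, 0.1, 0.3, 0.5, 0.4, 0.2],
    [0.5, 0.5, 0.1, 0.9, 0.2, 0.6, 0.3, 0.3],
]

set_feature = set_pool(frames, "max")
# Reordering the frames leaves the set-level feature unchanged.
assert set_feature == set_pool(frames[::-1], "max")

strips = horizontal_pyramid_pool(to_rows(set_feature, width=2))
print(len(strips))  # 1 + 2 + 4 = 7 strip features
```

Because every reducer works on the values of one dimension regardless of their order, frames taken from different videos or views can simply be appended to the same list before pooling.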

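Experimental Results

Research Questions

RQ1: Can gait be recognized effectively from a randomly ordered set of silhouettes rather than from a template or a sequence?
RQ2: How does permutation-invariant Set Pooling affect recognition accuracy in cross-view and cross-condition scenarios?
RQ3: How do multi-scale horizontal pyramid mapping and multi-layer information fusion affect discriminative power?
RQ4: How does the method scale to large datasets and diverse viewing conditions?
RQ5: Can high accuracy be maintained with a limited number of silhouettes, and when silhouettes from different views or conditions are combined?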

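Key Results

GaitSet achieves high rank-1 accuracy under standard settings on CASIA-B (95.0% on average under normal walking) and OU-MVLP (87.1%), outperforming prior methods.
On CASIA-B it reaches 87.2% under the bag-carrying condition and 70.4% under the coat-wearing condition, again outperforming prior methods.
GaitSet reaches 82.5% accuracy on CASIA-B with only 7 frames, showing that it stays robust when input is limited.
Ablation experiments show that set-based input substantially outperforms GEI templates, with gains of more than 10 percentage points on the NM subset and more than 25 percentage points on the CL subset.
Multi-view input (two views) generally raises accuracy, demonstrating the model's ability to fuse information across views.
The method scales efficiently: for example, it evaluates the 133,780 sequences of OU-MVLP in about 7 minutes on 8 GPUs.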
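
The rank-1 accuracy figures above can be read as nearest-neighbor matching between probe and gallery features. The following is a minimal sketch of that metric only, assuming Euclidean distance and a hypothetical helper `rank1_accuracy` with toy two-dimensional features. The review does not state the distance measure or the exact evaluation protocol, so treat those details as assumptions.

```python
import math


def rank1_accuracy(probe, gallery):
    """Fraction of probes whose nearest gallery feature has the same identity.

    probe, gallery: lists of (identity, feature_vector) pairs.
    Euclidean distance is an assumption of this sketch.
    """
    hits = 0
    for probe_id, probe_feature in probe:
        nearest_id, _ = min(
            gallery, key=lambda item: math.dist(probe_feature, item[1])
        )
        hits += nearest_id == probe_id
    return hits / len(probe)


# Hypothetical gallery with one feature per identity.
gallery = [("A", [0.0, 0.0]), ("B", [1.0, 1.0]), ("C", [0.0, 1.0])]
# Four probes; the third one ("C") lies closest to identity "B" and is a miss.
probe = [
    ("A", [0.1, 0.0]),
    ("B", [0.9, 1.0]),
    ("C", [0.8, 0.9]),
    ("A", [0.1, 0.2]),
]

print(rank1_accuracy(probe, gallery))  # 3 of 4 probes match: 0.75
```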


This review was generated by AI and reviewed by a human editor.