QUICK REVIEW

[Paper Review] VGGFace2: A dataset for recognising faces across pose and age

Qiong Cao, Li Shen|arXiv (Cornell University)|2017. 10. 23.

Face recognition and analysis | References: 22 | Citations: 194

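One-line summary

Introduces VGGFace2, a large-scale face dataset covering wide variation in pose and age, and shows that CNNs trained on it achieve state-of-the-art results on the IJB benchmarks.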

ABSTRACT

In this paper, we introduce a new large-scale face dataset named VGGFace2. The dataset contains 3.31 million images of 9131 subjects, with an average of 362.6 images for each subject. Images are downloaded from Google Image Search and have large variations in pose, age, illumination, ethnicity and profession (e.g. actors, athletes, politicians). The dataset was collected with three goals in mind: (i) to have both a large number of identities and also a large number of images for each identity; (ii) to cover a large range of pose, age and ethnicity; and (iii) to minimize the label noise. We describe how the dataset was collected, in particular the automated and manual filtering stages to ensure a high accuracy for the images of each identity. To assess face recognition performance using the new dataset, we train ResNet-50 (with and without Squeeze-and-Excitation blocks) Convolutional Neural Networks on VGGFace2, on MS-Celeb-1M, and on their union, and show that training on VGGFace2 leads to improved recognition performance over pose and age. Finally, using the models trained on these datasets, we demonstrate state-of-the-art performance on all the IARPA Janus face recognition benchmarks, e.g. IJB-A, IJB-B and IJB-C, exceeding the previous state-of-the-art by a large margin. Datasets and models are publicly available.

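Motivation and goals

Create a large-scale face dataset with wide variation in pose, age, ethnicity, and profession while minimizing label noise.
Describe a robust dataset construction pipeline that includes both automated and manual filtering stages.
Show that models trained on VGGFace2 achieve state-of-the-art results on the IJB benchmarks and also perform well on recognition across pose and age.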

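Proposed method

Collect 3.31 million images of 9131 identities from Google Image Search, emphasizing pose and age variation.
Apply multi-stage filtering (automatic classification, near-duplicate removal, and manual review) to reduce label noise.
Annotate pose (yaw/pitch/roll) and apparent age using pre-trained classifiers.
Train ResNet-50 and SE-ResNet-50 models on VGGFace2, on MS-Celeb-1M, and on their union, and evaluate them on the IJB-A/B/C benchmarks.
Provide pose and age templates for evaluating cross-pose and cross-age recognition.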
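The template-based evaluation mentioned above can be illustrated with a minimal sketch. It assumes that a template is formed by averaging the L2-normalized descriptors of a subject's images and that two templates are compared by cosine similarity; this summary does not spell out that procedure, so treat it as one common choice rather than the authors' exact protocol. The function names (l2_normalize, build_template, template_similarity) and the toy 3-D descriptors are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def build_template(descriptors):
    """Average the L2-normalized descriptors, then re-normalize the mean.

    Assumption: this averaging scheme is illustrative, not confirmed by the review.
    """
    if not descriptors:
        raise ValueError("a template needs at least one descriptor")
    normalized = [l2_normalize(d) for d in descriptors]
    dim = len(normalized[0])
    mean = [sum(d[i] for d in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def template_similarity(template_a, template_b):
    """Cosine similarity of two unit-length templates (a plain dot product)."""
    return sum(a * b for a, b in zip(template_a, template_b))


# Toy descriptors standing in for CNN features of the same subject.
frontal = build_template([[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]])
profile = build_template([[0.6, 0.7, 0.1], [0.5, 0.8, 0.0]])
print(round(template_similarity(frontal, profile), 3))
```

In a real evaluation the descriptors would come from a trained network, and the score would be computed for pairs of templates that differ in pose or in age.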

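Experimental results

Research questions

RQ1: How do pose and age variations within the same identity affect face recognition performance?
RQ2: Does pre-training on a noisy dataset (MS-Celeb-1M) and then fine-tuning on VGGFace2 improve generalization?
RQ3: How do models trained on VGGFace2 perform on the IJB-A/B/C benchmarks compared with models trained on other datasets?
RQ4: How do pose and age variations affect recognition between templates that differ in pose or age?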

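Key results

Training on VGGFace2 gives a top-1 error of 3.9% on the VGGFace2 test set, compared with 10.6% for training on VGGFace and 5.6% for MS1M.
Models trained on VGGFace2 outperform those trained on MS1M and VGGFace on the IJB-A verification and identification metrics.
SE-ResNet-50 and SENet variants trained on VGGFace2 achieved state-of-the-art results across several protocols on the IJB-A, IJB-B, and IJB-C benchmarks.
The pose and age templates show that recognition is easier when poses are similar and that matching across ages remains difficult; VGGFace2-trained models produced higher similarity scores than the other models.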
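For readers unfamiliar with the top-1 error metric quoted above, the following is a minimal sketch of how it is usually computed: the fraction of test images whose highest-scoring class is not the true identity. The function name top1_error and the toy scores are hypothetical and are not taken from the paper.

```python
def top1_error(scores, labels):
    """Fraction of samples whose highest-scoring class differs from the label.

    scores: one list of per-class scores per test image.
    labels: the true class index for each test image.
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal in length")
    wrong = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted != label:
            wrong += 1
    return wrong / len(labels)


# Toy example: four images, two identities, one misclassification.
toy_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
toy_labels = [1, 0, 0, 0]
print(top1_error(toy_scores, toy_labels))
```

Under this definition, the 3.9% figure means that about 39 of every 1000 test images were assigned to the wrong identity.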


This review was generated by AI and checked by a human editor.