[Paper Review] One-shot Face Recognition by Promoting Underrepresented Classes
The paper tackles imbalanced training data in large-scale face recognition by adding a Classification Vector-Centered Cosine Similarity (CCS) loss to learn robust features, plus an Underrepresented-Classes Promotion (UP) loss to boost one-shot classes during multinomial logistic regression.
In this paper, we study the problem of training large-scale face identification model with imbalanced training data. This problem naturally exists in many real scenarios including large-scale celebrity recognition, movie actor annotation, etc. Our solution contains two components. First, we build a face feature extraction model, and improve its performance, especially for the persons with very limited training samples, by introducing a regularizer to the cross entropy loss for the multi-nomial logistic regression (MLR) learning. This regularizer encourages the directions of the face features from the same class to be close to the direction of their corresponding classification weight vector in the logistic regression. Second, we build a multi-class classifier using MLR on top of the learned face feature extraction model. Since the standard MLR has poor generalization capability for the one-shot classes even if these classes have been oversampled, we propose a novel supervision signal called underrepresented-classes promotion loss, which aligns the norms of the weight vectors of the one-shot classes (a.k.a. underrepresented-classes) to those of the normal classes. In addition to the original cross entropy loss, this new loss term effectively promotes the underrepresented classes in the learned model and leads to a remarkable improvement in face recognition performance. We test our solution on the MS-Celeb-1M low-shot learning benchmark task. Our solution recognizes 94.89% of the test images at the precision of 99% for the one-shot classes. To the best of our knowledge, this is the best performance among all the published methods using this benchmark task with the same setup, including all the participants in the recent MS-Celeb-1M challenge at ICCV 2017.
Motivation & Objective
- Motivate robust face representation learning when training data is highly imbalanced across identities, including one-shot classes.
- Develop a regularized cross-entropy objective to promote discriminative feature directions.
- Introduce UP loss to align weight norms of one-shot classes with base classes for better generalization.
- Provide a two-phase evaluation: learning a strong feature extractor from a base set and a boosted classifier for base plus low-shot classes.
- Publish a reproducible benchmark with base and low-shot sets to facilitate one-shot face recognition research.
Proposed method
- Use a ResNet-34 feature extractor trained with a loss combining standard cross-entropy and a CCS term that aligns feature directions with corresponding class weight vectors.
- CCS loss computes the cosine similarity between each sample’s feature and its class weight vector to encourage aligned directions.
- Drop the bias term in the Softmax classifier to clarify the geometry of the decision space.
- Train a multinomial logistic regression (MLR) classifier on top of the learned features; standard MLR performance degrades on one-shot classes because their class partitions in the feature space are small.
- Introduce UP loss that penalizes the squared difference between the mean squared norms of base and low-shot class weight vectors, promoting larger, more balanced class partitions for underrepresented classes.
- Compare UP with alternative priors (l2 norm penalty, equal-norm constraint) to evaluate effectiveness.
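The two regularizers described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the function names (`ccs_term`, `up_term`), the negative sign and averaging of the cosine term, and the toy weight vectors are all assumptions made for this example. In training, each term would be added to the cross-entropy loss with a weighting coefficient that the review does not specify.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def ccs_term(features, weights, labels):
    """Assumed form of the CCS term: negative mean cosine similarity
    between each feature and the weight vector of its own class.
    Minimizing it pulls feature directions toward the class vector."""
    total = 0.0
    for x, y in zip(features, labels):
        w = weights[y]
        total += dot(x, w) / (math.sqrt(dot(x, x)) * math.sqrt(dot(w, w)))
    return -total / len(features)

def up_term(weights, base_ids, lowshot_ids):
    """Squared difference between the mean squared weight norm of the
    low-shot classes and that of the base classes."""
    def mean_sq_norm(ids):
        return sum(dot(weights[k], weights[k]) for k in ids) / len(ids)
    return (mean_sq_norm(lowshot_ids) - mean_sq_norm(base_ids)) ** 2

# Toy setup (hypothetical numbers): classes 0 and 1 are base classes,
# class 2 is a one-shot class whose weight vector has a small norm.
W = [[3.0, 0.0], [0.0, 4.0], [1.0, 0.0]]
feats = [[2.0, 0.0], [0.0, 1.0]]

ccs = ccs_term(feats, W, labels=[0, 1])           # features perfectly aligned
up = up_term(W, base_ids=[0, 1], lowshot_ids=[2])  # large norm gap
```

In the toy setup the features point exactly along their class vectors, so the CCS term reaches its minimum, while the UP term is large because the one-shot class has a much smaller squared norm (1) than the base-class average (12.5).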
Experimental results
Research questions
- RQ1: Can an angular regularizer (CCS) improve discriminative feature learning for imbalanced face datasets?
- RQ2: Does promoting underrepresented-class weight norms (UP loss) improve one-shot class recognition without harming base-class performance?
- RQ3: How do CCS and UP interact with standard cross-entropy to enhance one-shot learning on a large-scale face benchmark?
- RQ4: What is the impact of imbalanced training data on MLR weight norms and class decision boundaries in face recognition?
- RQ5: How does the proposed two-phase approach perform on the MS-Celeb-1M low-shot benchmark compared with baselines?
Key findings
- With CCS, LFW verification accuracy improves to 99.71% (vs. 99.28% with the CCS loss variant on the baseline), indicating stronger feature discrimination.
- Using UP in combination with CCS yields 94.89% coverage at 99% precision and 83.60% at 99.9% precision on low-shot test images, outperforming all listed alternatives.
- Without the UP term, low-shot coverage at 99% precision drops to 25.65% under standard MLR, highlighting the one-shot learning challenge.
- Table 4 shows that CCS+UP achieves the best one-shot performance among reported methods on the MS-Celeb-1M low-shot benchmark in the same setup: 94.89% coverage at 99% precision and 83.60% coverage at 99.9% precision.
- Base-set performance remains high (Top-1 99.8% in their setup) while improving low-shot recognition.
- The proposed approach establishes a reproducible benchmark and demonstrates significant gains over baseline and several alternative regularization strategies.
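The coverage-at-precision figures quoted above can be read as follows: predictions are ranked by confidence, and coverage is the largest fraction of test images that can be accepted while the accepted set still meets the target precision. The sketch below illustrates that reading in plain Python. The function name, the tie handling, and the toy inputs are assumptions made for this example; the benchmark's official evaluation script may differ in detail.

```python
def coverage_at_precision(confidences, correct, target_precision):
    """Largest fraction of samples that can be accepted (highest
    confidence first) while precision on the accepted set stays at or
    above target_precision. Returns 0.0 if no prefix qualifies."""
    # Rank sample indices from most to least confident.
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    best = 0.0
    hits = 0
    for rank, i in enumerate(order, start=1):
        if correct[i]:
            hits += 1
        # Precision over the top-`rank` predictions.
        if hits / rank >= target_precision:
            best = rank / len(order)
    return best

# Toy example (hypothetical data): five predictions, three of them correct.
conf = [0.9, 0.8, 0.7, 0.6, 0.5]
ok = [True, True, False, False, True]

strict = coverage_at_precision(conf, ok, 0.75)  # only the top two qualify
loose = coverage_at_precision(conf, ok, 0.5)    # every prediction can be kept
```

A stricter precision target forces a higher confidence threshold and therefore lower coverage, which is consistent with the reported drop from 94.89% at 99% precision to 83.60% at 99.9% precision.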
This review was created by AI and reviewed by human editors.