QUICK REVIEW

[Paper Review] Multi-Task Convolutional Neural Network for Face Recognition.

Xi Yin, Xiaoming Liu|arXiv (Cornell University)|Feb 15, 2017

Face recognition and analysis|34 references|13 citations

TL;DR

This paper proposes a multi-task CNN for face recognition that learns identity recognition as the main task jointly with pose, illumination, and expression estimation as side tasks. With dynamic loss weighting and pose-directed feature learning, the model reports performance comparable to or better than the state of the art on LFW, CFP, and IJB-A. The authors describe it as the first work to use the full Multi-PIE dataset for face recognition.

ABSTRACT

This paper explores multi-task learning (MTL) for face recognition. We answer the questions of how and why MTL can improve the face recognition performance. First, we propose a multi-task Convolutional Neural Network (CNN) for face recognition where identity recognition is the main task and pose, illumination, and expression estimations are the side tasks. Second, we develop a dynamic-weighting scheme to automatically assign the loss weight to each side task. Third, we propose a pose-directed multi-task CNN by grouping different poses to learn pose-specific identity features, simultaneously across all poses. We observe that the side tasks serve as regularizations to disentangle the variations from the learnt identity features. Extensive experiments on the entire Multi-PIE dataset demonstrate the effectiveness of the proposed approach. To the best of our knowledge, this is the first work using all data in Multi-PIE for face recognition. Our approach is also applicable to in-the-wild datasets for pose-invariant face recognition and we achieve comparable or better performance than state of the art on LFW, CFP, and IJB-A.

Motivation & Objective

To investigate how multi-task learning can improve face recognition performance by leveraging auxiliary tasks.
To address variation in pose, illumination, and expression, which degrades face recognition performance.
To develop a dynamic loss weighting scheme that automatically balances the contribution of side tasks during training.
To design a pose-directed architecture that learns pose-specific identity features across all poses.
To demonstrate the effectiveness of the approach on both controlled (Multi-PIE) and in-the-wild (LFW, CFP, IJB-A) datasets.

Proposed method

A multi-task CNN is designed with identity recognition as the main task and pose, illumination, and expression estimation as side tasks.
A dynamic-weighting scheme automatically assigns the loss weight of each side task during training, so the weights do not have to be fixed by hand.
The pose-directed multi-task CNN groups different poses to learn pose-specific identity representations, improving robustness to pose variation.
The side tasks act as regularizers that disentangle identity-relevant features from variations in pose, illumination, and expression.
The model is trained on the entire Multi-PIE dataset rather than a subset of it.
The framework also applies to in-the-wild data and is evaluated on the LFW, CFP, and IJB-A benchmarks.
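
The loss structure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the dynamic weights are computed, so the sketch assumes they come from a softmax over learnable scores. The names `side_scores`, `side_budget`, `POSE_GROUPS`, and the yaw boundaries are hypothetical, and hard routing by yaw angle stands in for whatever pose-grouping mechanism the paper actually uses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample given raw class logits and the true class index."""
    return -math.log(softmax(logits)[label])


def multitask_loss(identity_loss, side_losses, side_scores, side_budget=0.1):
    """Main-task loss plus dynamically weighted side-task losses.

    Assumption: side_scores are learnable scalars, one per side task
    (pose, illumination, expression). A softmax turns them into weights
    that sum to 1. side_budget is a hypothetical hyperparameter that caps
    the total influence of the side tasks on the combined loss.
    """
    weights = softmax(side_scores)
    side_term = sum(w * loss for w, loss in zip(weights, side_losses))
    return identity_loss + side_budget * side_term, weights


# Hypothetical pose groups by yaw in degrees: (name, lower bound, upper bound).
POSE_GROUPS = [
    ("left", -90.0, -30.0),
    ("frontal", -30.0, 30.0),
    ("right", 30.0, 90.0),
]


def pose_group(yaw_degrees):
    """Return the pose group whose identity branch would handle this yaw."""
    for name, low, high in POSE_GROUPS:
        if low <= yaw_degrees < high:
            return name
    # Clamp out-of-range angles to the nearest extreme group.
    return POSE_GROUPS[-1][0] if yaw_degrees >= 90.0 else POSE_GROUPS[0][0]


if __name__ == "__main__":
    identity = cross_entropy([2.0, 0.5, -1.0], 0)
    sides = [
        cross_entropy([0.2, 1.1, -0.3], 1),  # pose
        cross_entropy([0.0, 0.4], 1),        # illumination
        cross_entropy([1.5, -0.5], 0),       # expression
    ]
    total, weights = multitask_loss(identity, sides, [0.0, 0.0, 0.0])
    print(total, weights, pose_group(45.0))
```

With equal scores every side task receives the same weight. In training, the scores would be updated along with the network parameters, so a side task's share of the loss can shift over time. The identity term is never rescaled, which keeps it the main task.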

Experimental results

Research questions

RQ1: How can multi-task learning improve face recognition performance by leveraging auxiliary tasks?
RQ2: What is the optimal way to balance multiple loss functions in a multi-task learning setting for face recognition?
RQ3: Can pose-specific feature learning enhance identity representation robustness across varying poses?
RQ4: To what extent do side tasks like pose and illumination estimation act as regularizers in disentangling identity features?
RQ5: Can the proposed method generalize to in-the-wild datasets and achieve state-of-the-art performance?

Key findings

The proposed multi-task CNN achieves performance comparable to or better than the state of the art on the LFW, CFP, and IJB-A benchmarks.
Extensive experiments on the entire Multi-PIE dataset demonstrate the effectiveness of the approach, which the authors state is the first to use all Multi-PIE data for face recognition.
The dynamic loss weighting scheme effectively balances the contributions of side tasks, improving training stability and performance.
Pose-directed feature learning enables the model to learn pose-specific identity features across diverse poses, supporting pose-invariant recognition.
Side tasks such as pose and illumination estimation serve as effective regularizers, reducing overfitting and improving feature disentanglement.
The model demonstrates strong transferability to in-the-wild settings, confirming its practical applicability beyond controlled environments.

This review was created by AI and reviewed by human editors.