QUICK REVIEW

[Paper Digest] Age and Gender Prediction From Face Images Using Attentional Convolutional Network

AmirAli Abdolrashidi, Mehdi Minaei|arXiv (Cornell University)|Oct 8, 2020

Face recognition and analysis | References: 25 | Cited by: 27

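One-Sentence Summary

This paper proposes an ensemble deep learning framework that combines an attention mechanism with residual convolutional networks to jointly predict age and gender from face images. By using multi-task learning, attention that focuses on salient facial regions (such as wrinkles and facial contours), and injection of the predicted gender into the age branch, the model achieves state-of-the-art performance on the UTKFace dataset, with 91.3% age-group accuracy and 96.5% gender accuracy.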

ABSTRACT

Automatic prediction of age and gender from face images has drawn a lot of attention recently, due to its wide applications in various facial analysis problems. However, due to the large intra-class variation of face images (such as variation in lighting, pose, scale, occlusion), the existing models are still behind the desired accuracy level, which is necessary for the use of these models in real-world applications. In this work, we propose a deep learning framework, based on the ensemble of attentional and residual convolutional networks, to predict the gender and age group of facial images with a high accuracy rate. Using an attention mechanism enables our model to focus on the important and informative parts of the face, which can help it to make a more accurate prediction. We train our model in a multi-task learning fashion, augment the feature embedding of the age classifier with the predicted gender, and show that doing so can further increase the accuracy of age prediction. Our model is trained on a popular face age and gender dataset, and achieved promising results. Through visualization of the attention maps of the trained model, we show that our model has learned to become sensitive to the right regions of the face.

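Motivation and Objectives

Address the limits that large intra-class variation in face images (such as lighting, pose, and occlusion) places on age and gender prediction accuracy.
Improve prediction performance by using an attention mechanism to focus on the most informative facial regions.
Improve age prediction accuracy by injecting the predicted gender into the age branch as a conditioning input.
Build a multi-task learning framework that jointly optimizes age and gender prediction.
Improve model interpretability through attention-map visualization, identifying the salient facial features used in prediction.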

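Proposed Method

Use an attentional convolutional network (Attn-CNN) to focus dynamically on key facial regions such as the eyes, wrinkles, and facial contours.
Use a residual network (ResNet) as a complementary backbone to strengthen feature representation learning.
Apply an ensemble strategy that averages the predicted probabilities of the Attn-CNN and the ResNet to make the final classification decision.
Apply multi-task learning, using shared convolutional features to predict age and gender simultaneously.
Augment the age prediction branch with an embedding of the predicted gender to improve age estimation.
Train the model end to end on the UTKFace dataset, using cross-entropy loss for the classification tasks and a mean-absolute-difference loss for age-bucket prediction.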
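The ensemble and gender-conditioning steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the three-bucket example, the logit values, and the choice to concatenate gender probabilities onto the age features are all assumptions made for illustration, and plain Python lists stand in for network outputs.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_average(probs_a, probs_b):
    """Average two models' class probabilities element-wise."""
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must predict the same classes")
    return [(a + b) / 2.0 for a, b in zip(probs_a, probs_b)]


def augment_with_gender(age_features, gender_probs):
    """Append predicted gender probabilities to the age feature vector.

    Assumption: simple concatenation; the paper may encode gender differently.
    """
    return list(age_features) + list(gender_probs)


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical age-bucket logits from the two networks (3 buckets).
attn_age_probs = softmax([2.0, 0.5, 0.1])
resnet_age_probs = softmax([1.5, 1.0, 0.2])

# Final decision: average the probabilities, then take the top class.
ensemble_age_probs = ensemble_average(attn_age_probs, resnet_age_probs)
predicted_age_bucket = argmax(ensemble_age_probs)

# Hypothetical gender logits and age-branch features.
gender_probs = softmax([0.3, 2.2])
age_branch_input = augment_with_gender([0.12, -0.40, 0.88], gender_probs)
```

Averaging probabilities rather than logits keeps the ensemble output a valid distribution, and concatenation is the simplest way to let the age classifier condition on the gender estimate.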

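Experimental Results

Research Questions

RQ1: Can an attention mechanism improve age and gender prediction by focusing on the most informative facial regions?
RQ2: Does jointly learning age and gender prediction through multi-task learning outperform single-task learning?
RQ3: Does injecting the predicted gender into the age prediction branch as a conditioning signal further improve age estimation accuracy?
RQ4: To what extent does an ensemble of the attention network and the residual network outperform either model alone?
RQ5: Can attention maps provide a meaningful visual explanation of the model's decision process for age and gender prediction?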

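Key Findings

The ensemble model reaches 91.3% accuracy on age-range classification and 96.5% accuracy on gender classification. It outperforms the standalone Attn-CNN (74.2% and 55.2%) on both tasks and the standalone ResNet (90.0% and 96.5%) on age, while matching ResNet on gender.
The average age-bucket absolute difference (AABD) drops to 0.11, indicating that age-group estimates are highly precise.
The model's attention maps clearly highlight salient features such as wrinkles, eye contours, and facial edges, confirming that the model has learned to attend to relevant regions.
The confusion matrix shows that most predictions fall on the main diagonal; the highest error rate occurs where images from the 30–40 age group are misclassified as the 20–30 group.
The probability distribution of gender predictions shows high confidence, with most scores concentrated at the extremes (close to 0 or 1), indicating low uncertainty.
Integrating the predicted gender into the age branch further improved age prediction accuracy, demonstrating the value of cross-task supervision.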
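The metrics reported above can be computed as in the following sketch. The AABD definition used here (mean absolute difference between predicted and true bucket indices) is an assumption inferred from the metric's name, and the labels are made-up toy data, not results from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted bucket equals the true bucket."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def average_bucket_abs_diff(y_true, y_pred):
    """Mean absolute difference between bucket indices (assumed AABD)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true bucket, columns index the predicted bucket."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Toy example with four age buckets (indices 0..3); values are illustrative.
true_buckets = [0, 1, 2, 3, 3]
pred_buckets = [0, 1, 1, 3, 2]

acc = accuracy(true_buckets, pred_buckets)                  # 3 of 5 correct
aabd = average_bucket_abs_diff(true_buckets, pred_buckets)  # (0+0+1+0+1) / 5
cm = confusion_matrix(true_buckets, pred_buckets, num_classes=4)
```

Unlike accuracy, a bucket-distance metric penalizes a prediction two buckets away more than an adjacent-bucket miss, which is why the two are reported together. Off-diagonal entries of `cm` show which neighboring age groups are confused.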


This digest was generated by AI and reviewed by human editors.