QUICK REVIEW

[Paper Review] Frame-level Prediction of Facial Expressions, Valence, Arousal and Action Units for Mobile Devices

Andrey V. Savchenko|arXiv (Cornell University)|Mar 25, 2022

Emotion and Mood Recognition | Cited by 23

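One-sentence summary

The paper proposes a lightweight frame-level facial emotion analysis model that uses an EfficientNet pre-trained on AffectNet to predict expressions, valence/arousal, and action units, which makes it suitable for on-device mobile processing; it shows competitive results in the ABAW3 challenge on Aff-Wild2.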

ABSTRACT

In this paper, we consider the problem of real-time video-based facial emotion analytics, namely, facial expression recognition, prediction of valence and arousal and detection of action unit points. We propose the novel frame-level emotion recognition algorithm by extracting facial features with the single EfficientNet model pre-trained on AffectNet. As a result, our approach may be implemented even for video analytics on mobile devices. Experimental results for the large scale Aff-Wild2 database from the third Affective Behavior Analysis in-the-wild (ABAW) Competition demonstrate that our simple model is significantly better when compared to the VggFace baseline. In particular, our method is characterized by 0.15-0.2 higher performance measures for validation sets in uni-task Expression Classification, Valence-Arousal Estimation and Expression Classification. Due to simplicity, our approach may be considered as a new baseline for all four sub-challenges.

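Research motivation and goals

Advance real-time, on-device facial emotion analysis on mobile and embedded systems.
Develop a single lightweight CNN-based pipeline that can perform multiple affective tasks without using ensembles.
Leverage pre-trained facial representations to generalize across datasets and reduce computational cost.
Show that simple MLP heads on top of EfficientNet features can achieve strong performance in both uni-task and multi-task settings.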

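Proposed method

Pre-train a lightweight CNN on large-scale face recognition data (VGGFace2) to learn general facial features.
Fine-tune the CNN on AffectNet so that it extracts emotional features for eight basic expressions.
Extract a frame-level embedding and expression scores for each video frame from the fine-tuned network.
Train shallow MLP-based classifiers/regressors (one per task) using the embeddings and/or scores as features.
Optionally smooth the frame-level predictions with a mean or median filter over a sliding window of frames to improve stability.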
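
The feature and head steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the author's implementation: the data are synthetic stand-ins for CNN outputs, and the function names (build_features, train_head, predict), the hidden size, the learning rate, and the feature standardization are all hypothetical choices made for the example.

```python
import numpy as np


def build_features(embeddings, scores=None):
    """Use embeddings alone, or concatenate them with expression scores per frame."""
    if scores is None:
        return embeddings
    return np.concatenate([embeddings, scores], axis=1)


def forward(params, x):
    """One hidden ReLU layer followed by a linear output layer (logits)."""
    w1, b1, w2, b2 = params
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return hidden, hidden @ w2 + b2


def train_head(x, y, n_classes, hidden=32, lr=0.2, epochs=300, seed=0):
    """Fit the shallow MLP with full-batch gradient descent on cross-entropy."""
    rng = np.random.default_rng(seed)
    params = [
        rng.normal(0.0, 0.1, (x.shape[1], hidden)), np.zeros(hidden),
        rng.normal(0.0, 0.1, (hidden, n_classes)), np.zeros(n_classes),
    ]
    onehot = np.eye(n_classes)[y]
    for _ in range(epochs):
        h, logits = forward(params, x)
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        g_logits = (probs - onehot) / len(x)  # gradient of the mean loss
        g_hidden = (g_logits @ params[2].T) * (h > 0)
        grads = [x.T @ g_hidden, g_hidden.sum(axis=0),
                 h.T @ g_logits, g_logits.sum(axis=0)]
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


def predict(params, x):
    """Return the index of the highest-scoring class for each frame."""
    return forward(params, x)[1].argmax(axis=1)


# Synthetic stand-ins for per-frame outputs of the fine-tuned CNN (not real data).
rng = np.random.default_rng(0)
n_frames, emb_dim, n_classes = 300, 8, 3
labels = rng.integers(0, n_classes, n_frames)
centers = rng.normal(0.0, 1.0, (n_classes, emb_dim))
embeddings = centers[labels] + rng.normal(0.0, 0.3, (n_frames, emb_dim))
scores = np.eye(n_classes)[labels] + rng.normal(0.0, 0.5, (n_frames, n_classes))

features = build_features(embeddings, scores)
features = (features - features.mean(axis=0)) / features.std(axis=0)
head = train_head(features, labels, n_classes)
accuracy = (predict(head, features) == labels).mean()
print(f"training accuracy on synthetic frames: {accuracy:.2f}")
```

In a real pipeline the embeddings and scores would come from the fine-tuned CNN, one head would be trained per task, and a regression head for valence/arousal would replace the softmax output with a continuous one.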

Experimental results

Research questions

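RQ1: In a frame-level, on-device setting, can a single lightweight EfficientNet-based model effectively cover the four ABAW3 sub-challenges (FER, AU detection, valence-arousal estimation, and multi-task learning)?
RQ2: Are embeddings generally more predictive than expression scores as features, and does concatenating the two benefit multi-task performance?
RQ3: How much does smoothing affect the frame-level predictions for each task?
RQ4: In multi-task learning for emotion prediction, how do simple MLP heads compare with more complex multi-task networks?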

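Key findings

A single EfficientNet-based feature extractor with simple MLP heads can outperform the baseline on the ABAW3 tasks.
Embeddings generally perform better than expression scores for expression/AU prediction, and concatenating embeddings with scores yields strong performance.
Frame-level smoothing with a larger window (e.g., k=15) brings notable improvements on valence/arousal and AU metrics.
The proposed method achieves competitive results, with average improvements over the baseline on several metrics for the expression, AU, and VA tasks.
In multi-task learning, EfficientNet-B0 with a simple logistic regression head performs best overall on validation/test metrics, and the method ranks among the top challenge submissions.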
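
The smoothing finding above can be illustrated with a short standard-library sketch of a sliding-window mean or median filter over per-frame predictions. The centered window and the truncation at the start and end of the video are assumptions made for this example; the summary does not say how the paper handles window placement or boundaries, and the function name smooth is hypothetical.

```python
from statistics import mean, median


def smooth(values, k=15, mode="mean"):
    """Smooth per-frame predictions with a centered sliding window of up to k frames.

    Assumption: near the start and end of the video the window is truncated
    to the frames that actually exist.
    """
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    reducers = {"mean": mean, "median": median}
    if mode not in reducers:
        raise ValueError("mode must be 'mean' or 'median'")
    reduce = reducers[mode]
    half = k // 2
    return [
        reduce(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# A single-frame spike in an otherwise flat valence trace (made-up numbers).
trace = [0, 0, 9, 0, 0]
mean_smoothed = smooth(trace, k=3, mode="mean")
median_smoothed = smooth(trace, k=3, mode="median")
```

The median variant suppresses an isolated outlier frame entirely, while the mean variant spreads it over neighboring frames; which one works better per task is an empirical question.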

This review was generated by AI and reviewed by human editors.