QUICK REVIEW

[Paper Review] End-to-end Multimodal Emotion and Gender Recognition with Dynamic Weights of Joint Loss

Myungsu Chae, Taeho Kim|arXiv (Cornell University)|Sep 4, 2018
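Emotion and Mood Recognition|Cited by 3
ONE-SENTENCE SUMMARY

This paper proposes a dynamic weighting strategy for the joint loss in end-to-end multimodal emotion and gender recognition, which adaptively balances the task-specific losses during training to improve overall model performance. The method achieves a lower joint loss and generalizes better than static weighting.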

ABSTRACT

Multi-task learning is a method for improving the generalizability of multiple tasks. In order to perform multiple classification tasks with one neural network model, the losses of each task should be combined. Previous studies have mostly focused on multiple prediction tasks using joint loss with static weights for training models, choosing the weights between tasks without making sufficient considerations by setting them uniformly or empirically. In this study, we propose a method to calculate joint loss using dynamic weights to improve the total performance, instead of the individual performance, of tasks. We apply this method to design an end-to-end multimodal emotion and gender recognition model using audio and video data. This approach provides proper weights for the loss of each task when the training process ends. In our experiments, emotion and gender recognition with the proposed method yielded a lower joint loss, which is computed as the negative log-likelihood, than using static weights for joint loss. Moreover, our proposed model has better generalizability than other models. To the best of our knowledge, this research is the first to demonstrate the strength of using dynamic weights for joint loss for maximizing overall performance in emotion and gender recognition tasks.

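MOTIVATION AND OBJECTIVES

  • To address the limitations of static loss weighting in multi-task learning for emotion and gender recognition.
  • To improve overall model performance by dynamically adjusting the loss weights during training.
  • To improve generalization in multimodal (audio and video) emotion and gender recognition.
  • To demonstrate the effectiveness of dynamic loss weighting in joint learning settings.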

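PROPOSED METHOD

  • The method uses a single neural network architecture that jointly predicts emotion and gender from audio and video inputs.
  • It applies a dynamic loss weighting mechanism that adjusts each task loss's contribution as training progresses.
  • The joint loss is computed as a weighted sum of the per-task losses, with the weights updated dynamically to balance task optimization.
  • The dynamic weights are derived to minimize the total joint loss, which is defined as the negative log-likelihood of the joint predictions.
  • The model is trained end to end on multimodal data, so feature learning and loss optimization take place in one unified framework.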
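
The weighted-sum structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the summary does not spell out the paper's weight update rule, so `dynamic_weights` below uses a hypothetical stand-in (weights proportional to each task's current loss), and the class probabilities are made-up numbers. All function names are invented for this sketch.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class under predicted probabilities."""
    return -math.log(probs[label])


def dynamic_weights(task_losses):
    """Hypothetical rule: weight each task in proportion to its current loss.

    Illustrative stand-in only, NOT the update rule from the paper,
    which this summary does not specify.
    """
    total = sum(task_losses)
    if total == 0.0:
        # Fall back to uniform weights when every task loss is zero.
        return [1.0 / len(task_losses)] * len(task_losses)
    return [loss / total for loss in task_losses]


def joint_loss(task_losses, weights):
    """Weighted sum of the per-task losses."""
    return sum(w * l for w, l in zip(weights, task_losses))


# Toy predictions for one sample (made-up numbers).
emotion_probs = [0.1, 0.6, 0.3]  # three hypothetical emotion classes
gender_probs = [0.9, 0.1]        # two gender classes
emotion_loss = cross_entropy(emotion_probs, label=1)
gender_loss = cross_entropy(gender_probs, label=0)

losses = [emotion_loss, gender_loss]
weights = dynamic_weights(losses)              # recomputed at every training step
static_joint = joint_loss(losses, [0.5, 0.5])  # uniform static baseline
dynamic_joint = joint_loss(losses, weights)
print(weights, static_joint, dynamic_joint)
```

In an actual training loop, weights like these would typically be recomputed at each step and treated as constants during backpropagation. Whether the paper does this is not stated in the summary.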

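EXPERIMENTAL RESULTS

Research Questions

  • RQ1: Can dynamic loss weighting improve the overall performance of a multimodal emotion and gender recognition model?
  • RQ2: How does dynamic loss weighting compare with static weighting in terms of joint loss and generalization?
  • RQ3: Does the proposed method improve the model's robustness and performance on the emotion and gender classification tasks?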

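Key Findings

  • The proposed method achieves a lower joint loss than models trained with static loss weights.
  • The model generalizes better than baseline models that use static weighting.
  • Dynamic loss weighting balances task optimization effectively, improving overall performance rather than the performance of any single task.
  • To the authors' knowledge, this is the first study to apply dynamic loss weighting to joint optimization in emotion and gender recognition.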


This review was generated by AI and reviewed by human editors.