QUICK REVIEW

[Paper Review] Multimodal Emotion Recognition Using Deep Canonical Correlation Analysis

Wei Liu, Jielin Qiu|arXiv (Cornell University)|Aug 13, 2019

Emotion and Mood Recognition | References: 63 | Cited by: 68

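One-Sentence Summary

This paper introduces deep canonical correlation analysis (DCCA) to coordinate the representations of multimodal signals and reports state-of-the-art emotion recognition accuracy on five datasets.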

ABSTRACT

Multimodal signals are more powerful than unimodal data for emotion recognition since they can represent emotions more comprehensively. In this paper, we introduce deep canonical correlation analysis (DCCA) to multimodal emotion recognition. The basic idea behind DCCA is to transform each modality separately and coordinate different modalities into a hyperspace by using specified canonical correlation analysis constraints. We evaluate the performance of DCCA on five multimodal datasets: the SEED, SEED-IV, SEED-V, DEAP, and DREAMER datasets. Our experimental results demonstrate that DCCA achieves state-of-the-art recognition accuracy rates on all five datasets: 94.58% on the SEED dataset, 87.45% on the SEED-IV dataset, 84.33% and 85.62% for two binary classification tasks and 88.51% for a four-category classification task on the DEAP dataset, 83.08% on the SEED-V dataset, and 88.99%, 90.57%, and 90.67% for three binary classification tasks on the DREAMER dataset. We also compare the noise robustness of DCCA with that of existing methods when adding various amounts of noise to the SEED-V dataset. The experimental results indicate that DCCA has greater robustness. By visualizing feature distributions with t-SNE and calculating the mutual information between different modalities before and after using DCCA, we find that the features transformed by DCCA from different modalities are more homogeneous and discriminative across emotions.

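Motivation and Objectives

Establish why multimodal signals enable more reliable emotion recognition than unimodal data.
Propose a coordinated representation framework (DCCA) that learns modality-specific nonlinear transformations under CCA constraints.
Show how DCCA fuses modalities through adjustable weights, and evaluate its robustness to noise.
Evaluate DCCA on five benchmark datasets to demonstrate discriminative, robust emotion representations.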

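Proposed Method

Transform each modality with its own deep neural network, producing outputs O1 and O2.
Optimize the network parameters W1 and W2 under the DCCA objective by maximizing the correlation between O1 and O2.
Fuse the transformed features for classification with a weighted sum, O = α1 O1 + α2 O2.
Train an SVM classifier on the fused DCCA features.
Compute the gradients for DCCA training using regularized covariance estimates and backpropagation.
Optionally analyze the transformed features with MINE (mutual information neural estimation) to estimate the mutual information between modalities.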
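
The correlation objective and the weighted-sum fusion above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `cca_correlation` and `fuse`, the ridge value `reg=1e-4`, the constraint α2 = 1 − α1, and the synthetic data are all assumptions made for the example. In actual DCCA training, the negative of this correlation would serve as the loss that is backpropagated through both networks.

```python
import numpy as np


def cca_correlation(O1, O2, reg=1e-4):
    """Sum of canonical correlations between two views.

    O1: (n_samples, d1) and O2: (n_samples, d2) are the network outputs.
    reg is a ridge term added to each covariance estimate (assumed value).
    """
    n = O1.shape[0]
    # Center each view.
    H1 = O1 - O1.mean(axis=0)
    H2 = O2 - O2.mean(axis=0)
    # Regularized covariance estimates keep the matrices invertible.
    S11 = H1.T @ H1 / (n - 1) + reg * np.eye(H1.shape[1])
    S22 = H2.T @ H2 / (n - 1) + reg * np.eye(H2.shape[1])
    S12 = H1.T @ H2 / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root via eigendecomposition (S is symmetric PD).
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    T = inv_sqrt(S11) @ S12 @ inv_sqrt(S22)
    # The singular values of T are the canonical correlations.
    return float(np.linalg.svd(T, compute_uv=False).sum())


def fuse(O1, O2, alpha1=0.5):
    """Weighted-sum fusion O = alpha1*O1 + alpha2*O2, assuming alpha2 = 1 - alpha1."""
    return alpha1 * O1 + (1.0 - alpha1) * O2


# Synthetic example: two noisy linear views of one shared 2-D latent signal.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
O1 = z @ np.array([[1.0, 0.5], [-0.5, 1.0]]) + 0.05 * rng.normal(size=(500, 2))
O2 = z @ np.array([[0.8, -0.3], [0.2, 1.1]]) + 0.05 * rng.normal(size=(500, 2))

corr = cca_correlation(O1, O2)     # approaches 2 when the views are well coordinated
fused = fuse(O1, O2, alpha1=0.7)   # features that would be passed to the classifier
```

Fusing by weighted sum requires both networks to emit features of the same dimension, which is why the two synthetic views here share the shape (500, 2).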

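Experimental Results

Research Questions

RQ1: Can DCCA learn coordinated and discriminative nonlinear representations of multimodal emotion data?
RQ2: How does DCCA perform on benchmark datasets compared with other fusion strategies?
RQ3: Is DCCA robust to noise that may affect one or more modalities?
RQ4: Can varying the modality weights improve fusion performance on a specific task or dataset?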

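Key Findings

DCCA achieves state-of-the-art recognition accuracy on SEED (94.58%), SEED-IV (87.45%), DEAP (84.33% and 85.62% on two binary tasks, 88.51% on the four-category task), SEED-V (83.08%), and DREAMER (88.99%, 90.57%, and 90.67% on three binary tasks).
When various amounts of noise are added to the SEED-V dataset, DCCA is more robust than existing methods.
t-SNE visualization and mutual information analysis show that, after DCCA, features from different modalities are more homogeneous and more discriminative across emotions.
DCCA supports flexible fusion: adjusting the modality weights lets each modality contribute differently to the fused features.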
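
To make the last finding concrete, the sketch below sweeps the fusion weight and scores each fused feature set on held-out data. It is a hypothetical illustration on synthetic data, not a reproduction of the paper's experiments: the nearest-centroid classifier is a lightweight stand-in for the SVM used in the paper, and the names `sweep_weights` and `nearest_centroid_accuracy`, the constraint α2 = 1 − α1, and the weight grid are assumptions.

```python
import numpy as np


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Stand-in classifier (not the paper's SVM): predict the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == y_test).mean())


def sweep_weights(O1, O2, y, train_idx, test_idx, alphas):
    """Held-out accuracy of O = a1*O1 + (1 - a1)*O2 for each candidate a1."""
    results = {}
    for a1 in alphas:
        fused = a1 * O1 + (1.0 - a1) * O2
        results[float(a1)] = nearest_centroid_accuracy(
            fused[train_idx], y[train_idx], fused[test_idx], y[test_idx]
        )
    return results


# Synthetic two-view data: view 1 is cleaner than view 2 (illustrative only).
rng = np.random.default_rng(0)
n, d, k = 300, 4, 3
y = rng.integers(0, k, size=n)
means = rng.normal(size=(k, d))
O1 = means[y] + 0.5 * rng.normal(size=(n, d))
O2 = means[y] + 1.5 * rng.normal(size=(n, d))

idx = rng.permutation(n)
train_idx, test_idx = idx[:200], idx[200:]

alphas = np.linspace(0.0, 1.0, 11)
results = sweep_weights(O1, O2, y, train_idx, test_idx, alphas)
best_alpha = max(results, key=results.get)
print(f"best alpha1 = {best_alpha:.1f}, accuracy = {results[best_alpha]:.3f}")
```

In practice the weight would be chosen on a validation split rather than the test split, so that the reported accuracy is not biased by the selection.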

This review was generated by AI and reviewed by a human editor.