QUICK REVIEW

[Paper Review] Multimodal Emotion Recognition Using Deep Canonical Correlation Analysis

Wei Liu, Jielin Qiu | arXiv (Cornell University) | 2019-08-13

Emotion and Mood Recognition | References: 63 | Citations: 68

One-Line Summary

The paper introduces Deep Canonical Correlation Analysis (DCCA) to coordinate the representations of multimodal signals and demonstrates state-of-the-art emotion recognition on five datasets.

ABSTRACT

Multimodal signals are more powerful than unimodal data for emotion recognition since they can represent emotions more comprehensively. In this paper, we introduce deep canonical correlation analysis (DCCA) to multimodal emotion recognition. The basic idea behind DCCA is to transform each modality separately and coordinate different modalities into a hyperspace by using specified canonical correlation analysis constraints. We evaluate the performance of DCCA on five multimodal datasets: the SEED, SEED-IV, SEED-V, DEAP, and DREAMER datasets. Our experimental results demonstrate that DCCA achieves state-of-the-art recognition accuracy rates on all five datasets: 94.58% on the SEED dataset, 87.45% on the SEED-IV dataset, 84.33% and 85.62% for two binary classification tasks and 88.51% for a four-category classification task on the DEAP dataset, 83.08% on the SEED-V dataset, and 88.99%, 90.57%, and 90.67% for three binary classification tasks on the DREAMER dataset. We also compare the noise robustness of DCCA with that of existing methods when adding various amounts of noise to the SEED-V dataset. The experimental results indicate that DCCA has greater robustness. By visualizing feature distributions with t-SNE and calculating the mutual information between different modalities before and after using DCCA, we find that the features transformed by DCCA from different modalities are more homogeneous and discriminative across emotions.

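Motivation and Objectives

Motivate and enable reliable emotion recognition from multimodal data, going beyond what unimodal data can offer.
Propose a coordinated-representation framework (DCCA) that learns modality-specific nonlinear transformations under CCA constraints.
Show how DCCA fuses modalities by adjusting their weights, and evaluate its robustness to noise.
Evaluate DCCA on five benchmark datasets to demonstrate discriminative and robust emotion representations.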

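Proposed Method

Each modality is transformed by its own deep neural network, producing the outputs O1 and O2.
The network parameters W1 and W2 are optimized by maximizing the correlation between O1 and O2 under the DCCA objective.
The transformed features are fused by a weighted sum, O = α1 O1 + α2 O2, for classification.
An SVM classifier is trained on the fused DCCA features.
Gradients for DCCA training are computed with regularized covariance estimates and backpropagation.
Optionally, MINE is used to estimate the mutual information between modalities in order to analyze the transformed features.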
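
The correlation objective and the weighted fusion described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the common DCCA formulation in which the total correlation is the sum of the singular values of Σ11^(-1/2) Σ12 Σ22^(-1/2), and the function names, the regularization value `reg`, and the default fusion weights are all hypothetical. It only evaluates the objective on already-transformed features; training the two networks by backpropagation is not shown.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T


def total_correlation(o1, o2, reg=1e-4):
    """Sum of canonical correlations between two views (rows are samples).

    `reg` is an assumed ridge term added to each within-view covariance,
    standing in for the regularized covariance estimate.
    """
    n = o1.shape[0]
    o1c = o1 - o1.mean(axis=0)  # center each view
    o2c = o2 - o2.mean(axis=0)
    s11 = o1c.T @ o1c / (n - 1) + reg * np.eye(o1.shape[1])
    s22 = o2c.T @ o2c / (n - 1) + reg * np.eye(o2.shape[1])
    s12 = o1c.T @ o2c / (n - 1)
    t = inv_sqrt(s11) @ s12 @ inv_sqrt(s22)
    # The singular values of t are the (regularized) canonical correlations.
    return np.linalg.svd(t, compute_uv=False).sum()


def fuse(o1, o2, alpha1=0.5, alpha2=0.5):
    """Weighted-sum fusion O = alpha1 * O1 + alpha2 * O2."""
    return alpha1 * o1 + alpha2 * o2
```

In a training loop, the negative of `total_correlation` would serve as the loss for the two networks, and the output of `fuse` would be passed to the classifier.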

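Experimental Results

Research Questions

RQ1: Can DCCA learn coordinated and discriminative nonlinear representations of multimodal emotion data?
RQ2: How does DCCA perform on the benchmark datasets compared with other fusion strategies?
RQ3: Is DCCA robust to noise that affects one or more modalities?
RQ4: Does adjusting the modality weights for a specific task or dataset improve fusion performance?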

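Key Findings

DCCA achieves state-of-the-art recognition accuracy on SEED (94.58%), SEED-IV (87.45%), DEAP (84.33% and 85.62% on two binary tasks, 88.51% on the four-category task), SEED-V (83.08%), and DREAMER (88.99%, 90.57%, and 90.67% on three binary tasks).
On SEED-V, DCCA is more robust than existing methods across the tested noise levels.
t-SNE visualization and mutual information analysis show that features transformed by DCCA are more homogeneous across modalities and more discriminative across emotions.
DCCA allows flexible fusion: adjusting the modality weights lets each modality contribute differently to the fused features.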
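
To make the last point concrete, the sketch below sweeps the fusion weight on held-out data and keeps the best-scoring value. Everything here is an assumption for illustration: the review does not say how the weights were chosen, the constraint α2 = 1 − α1 and the grid of 11 values are hypothetical, and a simple nearest-centroid classifier is swapped in for the SVM so that the example depends only on NumPy.

```python
import numpy as np


def nearest_centroid_accuracy(train_x, train_y, test_x, test_y):
    """Accuracy of a nearest-centroid classifier (a stand-in for the SVM)."""
    classes = np.unique(train_y)
    centroids = np.stack([train_x[train_y == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every test sample to every centroid.
    dists = ((test_x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    predictions = classes[dists.argmin(axis=1)]
    return float((predictions == test_y).mean())


def sweep_fusion_weight(o1_train, o2_train, y_train,
                        o1_val, o2_val, y_val, steps=11):
    """Try alpha1 on an even grid in [0, 1], with alpha2 = 1 - alpha1."""
    best_alpha, best_acc = None, -1.0
    for alpha1 in np.linspace(0.0, 1.0, steps):
        alpha2 = 1.0 - alpha1
        fused_train = alpha1 * o1_train + alpha2 * o2_train
        fused_val = alpha1 * o1_val + alpha2 * o2_val
        acc = nearest_centroid_accuracy(fused_train, y_train, fused_val, y_val)
        if acc > best_acc:  # keep the first weight that reaches the best score
            best_alpha, best_acc = float(alpha1), acc
    return best_alpha, best_acc
```

The returned `best_alpha` indicates how much the first modality should contribute under these assumptions; a value near 0 or 1 would suggest that one modality dominates for that task.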


This review was generated by AI and reviewed by a human editor.