Tracing Intricate Cues in Dialogue: Joint Graph Structure and Sentiment Dynamics for Multimodal Emotion Recognition

Jiang Li, Xiaoping Wang, Zhigang Zeng


Abstract

Multimodal emotion recognition in conversation (MERC) has garnered substantial research attention recently. Existing MERC methods face several challenges: (1) they fail to fully harness direct inter-modal cues, possibly leading to less-than-thorough cross-modal modeling; (2) they concurrently extract information from the same and different modalities at each network layer, potentially triggering conflicts from the fusion of multi-source data; (3) they lack the agility required to detect dynamic sentimental changes, perhaps resulting in inaccurate classification of utterances with abrupt sentiment shifts. To address these issues, a novel approach named GraphSmile is proposed for tracking intricate emotional cues in multimodal dialogues. GraphSmile comprises two key components, i.e., GSF and SDP modules. GSF ingeniously leverages graph structures to alternately assimilate inter-modal and intra-modal emotional dependencies layer by layer, adequately capturing cross-modal cues while effectively circumventing fusion conflicts. SDP is an auxiliary task to explicitly delineate the sentiment dynamics between utterances, promoting the model's ability to distinguish sentimental discrepancies. Furthermore, GraphSmile is effortlessly applied to multimodal sentiment analysis in conversation (MSAC), forging a unified multimodal affective model capable of executing MERC and MSAC tasks. Empirical results on multiple benchmarks demonstrate that GraphSmile can handle complex emotional and sentimental patterns, significantly outperforming baseline models.
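Since the abstract describes GSF and SDP only at a high level, the following is a minimal PyTorch sketch of the two ideas: graph layers that alternate between an inter-modal and an intra-modal edge set from one layer to the next, plus an auxiliary head that classifies the sentiment shift between adjacent utterances. All module names, the graph construction, and the label counts (num_emotions, num_shifts) are illustrative assumptions, not the authors' implementation; the official code in lijfrank-open/GraphSmile is the authoritative reference.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphSmileSketch(nn.Module):
    """Toy illustration of the GSF/SDP ideas: alternate inter-modal and
    intra-modal graph propagation layer by layer, then jointly predict
    each utterance's emotion and the sentiment shift between adjacent
    utterances. Not the paper's architecture."""

    def __init__(self, dim=128, num_layers=4, num_emotions=7, num_shifts=3):
        super().__init__()
        # One transform per layer; even layers use inter-modal edges,
        # odd layers use intra-modal edges (the alternating scheme).
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_layers))
        self.emo_head = nn.Linear(3 * dim, num_emotions)    # fused text/audio/visual
        self.shift_head = nn.Linear(6 * dim, num_shifts)    # pair of adjacent utterances

    def propagate(self, x, adj, layer):
        # One graph step: degree-normalized neighborhood averaging + residual.
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        return F.relu(layer((adj @ x) / deg) + x)

    def forward(self, feats, adj_inter, adj_intra):
        # feats: (num_utterances * 3, dim) stacked text/audio/visual nodes.
        x = feats
        for i, layer in enumerate(self.layers):
            adj = adj_inter if i % 2 == 0 else adj_intra    # alternate edge sets
            x = self.propagate(x, adj, layer)
        n = x.size(0) // 3
        # Concatenate the three modality views of each utterance.
        fused = torch.cat([x[:n], x[n:2 * n], x[2 * n:]], dim=-1)
        emo_logits = self.emo_head(fused)
        # SDP-style auxiliary output: one shift label per adjacent pair.
        shift_logits = self.shift_head(torch.cat([fused[:-1], fused[1:]], dim=-1))
        return emo_logits, shift_logits

if __name__ == "__main__":
    n_utt, dim = 5, 128
    model = GraphSmileSketch(dim=dim)
    feats = torch.randn(n_utt * 3, dim)            # stacked modality nodes
    adj_inter = torch.ones(n_utt * 3, n_utt * 3)   # placeholder inter-modal edges
    adj_intra = torch.eye(n_utt * 3)               # placeholder intra-modal edges
    emo, shift = model(feats, adj_inter, adj_intra)
    print(emo.shape, shift.shape)                  # (5, 7) and (4, 3)
```

The placeholder adjacency matrices stand in for whatever cross-modal and within-modal edge construction the paper actually uses; the point of the sketch is only that the two edge sets are consumed on alternating layers rather than fused simultaneously, which is how the abstract says GSF avoids multi-source fusion conflicts.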

Code Repositories

lijfrank-open/GraphSmile (official implementation, PyTorch)

Benchmarks

Benchmark                                        Methodology   Accuracy   Weighted-F1
emotion-recognition-in-conversation-on           GraphSmile    72.77      72.81
emotion-recognition-in-conversation-on-7         GraphSmile    86.53      86.52
emotion-recognition-in-conversation-on-cmu-2     GraphSmile    46.82      44.93
emotion-recognition-in-conversation-on-cmu-3     GraphSmile    67.73      66.73
emotion-recognition-in-conversation-on-meld      GraphSmile    67.70      66.71
emotion-recognition-in-conversation-on-meld-1    GraphSmile    74.44      74.31
multimodal-emotion-recognition-on-cmu-mosei-1    GraphSmile    46.82      44.93
multimodal-emotion-recognition-on-cmu-mosei-2    GraphSmile    67.73      66.73
multimodal-emotion-recognition-on-iemocap        GraphSmile    72.77      72.81
multimodal-emotion-recognition-on-iemocap-4      GraphSmile    86.53      86.52
multimodal-emotion-recognition-on-meld           GraphSmile    67.70      66.71
multimodal-emotion-recognition-on-meld-1         GraphSmile    74.44      74.31
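The Weighted-F1 figures above are the support-weighted average of per-class F1 scores, the conventional metric for class-imbalanced conversation datasets such as MELD. A standard way to compute it (a generic sketch using scikit-learn, not the paper's evaluation code) is:

```python
from sklearn.metrics import accuracy_score, f1_score

# Illustrative per-utterance emotion labels, not real model output.
y_true = [0, 1, 2, 1, 0, 2, 2]
y_pred = [0, 1, 2, 0, 0, 2, 1]

print("Accuracy:   ", 100 * accuracy_score(y_true, y_pred))
print("Weighted-F1:", 100 * f1_score(y_true, y_pred, average="weighted"))
```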
