3 months ago

Speech Emotion Recognition with Multi-Task Learning

Xingyu Cai, Jiahong Yuan, Renjie Zheng, Liang Huang, Kenneth Church

Abstract

Speech emotion recognition (SER) classifies speech into emotion categories such as happy, angry, sad, and neutral. Recently, deep learning has been applied to the SER task. This paper proposes a multi-task learning (MTL) framework that simultaneously performs speech-to-text recognition and emotion classification, using an end-to-end deep neural model based on wav2vec-2.0. Experiments on the IEMOCAP benchmark show that the proposed method achieves state-of-the-art performance on the SER task. In addition, an ablation study establishes the effectiveness of the proposed MTL framework.
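The idea of training one model on two tasks at once can be illustrated with a minimal sketch of a combined objective. The abstract does not state how the two losses are combined, so the weighted sum below, the `asr_weight` parameter, the example logits, and the placeholder speech-to-text loss value are all assumptions made for illustration and are not the authors' implementation.

```python
import math

EMOTIONS = ["happy", "angry", "sad", "neutral"]


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy for one utterance-level emotion prediction."""
    # Subtract the max logit before exponentiating for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum_exp - logits[target_index]


def multi_task_loss(emotion_loss, asr_loss, asr_weight=0.1):
    """Weighted sum of the two task losses.

    asr_weight is a hypothetical hyperparameter; the source does not give
    the form of the combined objective or any weight value.
    """
    return emotion_loss + asr_weight * asr_loss


# Hypothetical emotion logits produced from a shared speech encoder.
emotion_logits = [2.0, 0.5, 0.1, 1.0]
emotion_loss = softmax_cross_entropy(emotion_logits, EMOTIONS.index("happy"))

# Placeholder for the speech-to-text loss; a real model would compute it
# from the transcript and the encoder outputs.
asr_loss = 3.2

total_loss = multi_task_loss(emotion_loss, asr_loss, asr_weight=0.1)
print(f"emotion loss: {emotion_loss:.4f}, total loss: {total_loss:.4f}")
```

Setting `asr_weight` to 0 reduces the objective to emotion classification alone, which is the kind of comparison an ablation of the auxiliary task would make.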

Benchmarks

Benchmark: speech-emotion-recognition-on-iemocap
Methodology: SER with MTL
Metrics: F1: -, UA CV: 0.7815
