3 months ago

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Ziyang Ma, Zhisheng Zheng, Jiaxin Ye, Jinchao Li, Zhifu Gao, Shiliang Zhang, Xie Chen

Abstract

We propose emotion2vec, a universal speech emotion representation model. emotion2vec is pre-trained on open-source unlabeled emotion data through self-supervised online distillation, combining an utterance-level loss and a frame-level loss during pre-training. With only linear layers trained for the speech emotion recognition task, emotion2vec outperforms state-of-the-art pre-trained universal models and emotion specialist models on the mainstream IEMOCAP dataset. In addition, emotion2vec shows consistent improvements across speech emotion recognition datasets in 10 different languages. emotion2vec also shows excellent results on other emotion tasks, such as song emotion recognition, emotion prediction in conversation, and sentiment analysis. Comparison experiments, ablation experiments, and visualization comprehensively demonstrate the universal capability of the proposed emotion2vec. To the best of our knowledge, emotion2vec is the first universal representation model for various emotion-related tasks, filling a gap in the field.

Code Repositories

ddlBoJack/emotion2vec (Official, PyTorch)

Benchmarks

Benchmark                            Methodology         Unweighted Accuracy (UA)   Weighted Accuracy (WA)   Weighted F1
speech-emotion-recognition-on-resd   emotion2vec+base    79.8                       79.4                     79.4
speech-emotion-recognition-on-resd   emotion2vec+large   69.1                       69.5                     68.8
speech-emotion-recognition-on-resd   emotion2vec         65.04                      64.75                    64.53
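
The three metrics reported above are not defined on this page. The sketch below assumes the definitions conventional in speech emotion recognition work: UA is the mean of per-class recalls, WA is overall accuracy, and Weighted F1 is the per-class F1 averaged by class support. The function name `ser_metrics` and the toy labels are hypothetical, chosen only for illustration, and results are returned as fractions rather than percentages.

```python
from collections import Counter


def ser_metrics(y_true, y_pred):
    """Return (UA, WA, weighted F1) as fractions in [0, 1].

    Assumed definitions (not stated on the source page):
      UA          = unweighted mean of per-class recall
      WA          = overall accuracy (correct / total)
      weighted F1 = per-class F1 averaged with class-support weights
    Only classes that occur in y_true are scored.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    labels = sorted(set(y_true))
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    recalls = {c: hits[c] / support[c] for c in labels}
    ua = sum(recalls.values()) / len(labels)
    wa = sum(hits.values()) / n

    weighted_f1 = 0.0
    for c in labels:
        precision = hits[c] / predicted[c] if predicted[c] else 0.0
        recall = recalls[c]
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weighted_f1 += f1 * support[c] / n

    return ua, wa, weighted_f1


# Toy example with an imbalanced two-class label set (hypothetical data).
y_true = ["ang", "ang", "ang", "ang", "hap", "hap"]
y_pred = ["ang", "ang", "ang", "hap", "hap", "ang"]
ua, wa, wf1 = ser_metrics(y_true, y_pred)
print(f"UA={ua:.4f}  WA={wa:.4f}  weighted F1={wf1:.4f}")
```

Under these definitions UA and WA coincide only when the classes are balanced or the per-class recalls are equal, which is one reason emotion benchmarks commonly report both.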
