4 months ago

Yes

Abstract

Human language expression is based on the subjective construal of a situation rather than on objective truth conditions, which means that speakers' personalities and emotions, after cognitive processing, have an important influence on conversation. However, most existing datasets for conversational AI ignore human personalities and emotions, or consider only part of them. It remains difficult for dialogue systems to understand speakers' personalities and emotions, even though large-scale pre-trained language models are widely used. To consider both personalities and emotions in conversation generation, we propose CPED, a large-scale Chinese personalized and emotional dialogue dataset, which consists of multi-source knowledge related to empathy and personal characteristics. This knowledge covers gender, Big Five personality traits, 13 emotions, 19 dialogue acts and 10 scenes. CPED contains more than 12K dialogues of 392 speakers from 40 TV shows. We release the textual dataset with audio features and video features in accordance with copyright claims, privacy issues, and the terms of service of video platforms. We provide a detailed description of the CPED construction process and introduce three tasks for conversational AI: personality recognition, emotion recognition in conversations, and personalized and emotional conversation generation. Finally, we provide baseline systems for these tasks and examine the role of speakers' personalities and emotions in conversation. Our motivation is to propose a dataset to be widely adopted by the NLP community as a new open benchmark for conversational AI research. The full dataset is available at https://github.com/scutcyr/CPED.

Code Repositories

scutcyr/CPED
Official
pytorch
Mentioned in GitHub

Benchmarks

Benchmark: emotion-recognition-in-conversation-on-cped

| Method | Accuracy of Sentiment | Macro-F1 of Sentiment |
| --- | --- | --- |
| BERT+AVG+MLP | 51.50 | 48.02 |

Benchmark: personality-recognition-in-conversation-on-1

| Method | Accuracy (%) | Accuracy of Agreeableness | Accuracy of Conscientiousness | Accuracy of Extraversion | Accuracy of Neuroticism | Accuracy of Openness | Macro-F1 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BERT$_{ssenet}^{c}$ | 67.25 | 85.89 | 63.48 | 78.21 | 53.27 | 55.42 | 74.08 |
| BERT$^{s}$ | 67.23 | 85.76 | 63.60 | 78.08 | 50.75 | 57.93 | 72.93 |
| BERT$_{senet}^{c}$ | 66.02 | 81.99 | 61.59 | 77.71 | 53.4 | 55.42 | 71.89 |
| BERT$^{c}$ | 66.32 | 80.98 | 63.35 | 78.08 | 55.29 | 53.90 | 72.69 |

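The accuracy and Macro-F1 figures listed above are standard classification metrics. The page does not say how they were computed for CPED, so the following is only a minimal sketch of the usual definitions, assuming one gold label and one predicted label per example. The function names and the toy labels are hypothetical and are not taken from the CPED repository.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        # Count true positives, false positives and false negatives for this class.
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example with made-up labels (not CPED data).
gold = ["high", "high", "low", "low"]
pred = ["high", "low", "low", "low"]
print(accuracy(gold, pred))
print(macro_f1(gold, pred))
```

Because Macro-F1 weights every class equally, it can diverge noticeably from accuracy when the label distribution is imbalanced.
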
Benchmark: personalized-and-emotional-conversation-on

| Method | Average Embedding | BLEU | Distinct-1 | Distinct-2 | Greedy Embedding | PPL | bertscore |
| --- | --- | --- | --- | --- | --- | --- | --- |
| {emo+da}-GPT w/o emo | 0.5564 | 0.1252 | 0.0451 | 0.2746 | 0.4964 | 22.84 | 0.5666 |
| GPT-{per+emo} | 0.5617 | 0.1403 | 0.0602 | 0.3388 | 0.5026 | 17.70 | 0.5719 |
| {emo+da}-GPT | 0.5552 | 0.1304 | 0.0476 | 0.2785 | 0.4962 | 21.60 | 0.5674 |
| GPT-{per} | 0.5606 | 0.1372 | 0.0592 | 0.3363 | 0.5009 | 18.08 | 0.5715 |
| GPT-{da} | 0.5610 | 0.1372 | 0.0605 | 0.3389 | 0.5017 | 17.72 | 0.5703 |
| GPT | 0.5509 | 0.1171 | 0.0482 | 0.2738 | 0.4922 | 20.07 | 0.5629 |
| {emo+da}-GPT w/o da | 0.5556 | 0.1272 | 0.0473 | 0.2790 | 0.4962 | 22.09 | 0.5669 |
| GPT-{per+emo+da} | 0.5608 | 0.1382 | 0.0601 | 0.3404 | 0.5012 | 17.80 | 0.5722 |
| GPT-{emo} | 0.5588 | 0.1342 | 0.0614 | 0.3430 | 0.4996 | 17.48 | 0.5709 |
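
Distinct-1 and Distinct-2 in the generation results above are diversity metrics: the number of unique n-grams divided by the total number of n-grams in the generated responses. The page does not state how the responses were tokenized or whether the ratio was computed per response or over the whole corpus, so this sketch assumes pre-tokenized responses and a corpus-level ratio. The function name `distinct_n` and the toy responses are hypothetical.

```python
def distinct_n(responses, n):
    """Corpus-level Distinct-n: unique n-grams / total n-grams.

    `responses` is a list of token lists. Tokenization (characters vs. words)
    is an assumption left to the caller.
    """
    total = 0
    unique = set()
    for tokens in responses:
        # Collect every contiguous n-gram in this response.
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0


# Toy example with made-up responses (not CPED data).
generated = [["我", "很", "好"], ["我", "很", "开心"]]
print(distinct_n(generated, 1))
print(distinct_n(generated, 2))
```

Higher values mean less repetition across generated responses. Under this definition the score tends to fall as the amount of generated text grows, so scores are only comparable between systems evaluated on the same test set.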
