StepEval Audio Paralinguistic Understanding Evaluation Dataset
License: Apache 2.0
StepEval Audio Paralinguistic is an evaluation dataset for paralinguistic understanding in audio, released by the StepFun AI team in 2025. The related paper is "Step-Audio 2 Technical Report". The dataset is intended to evaluate how well AI models understand paralinguistic information in speech, such as gender, age, tone, and emotion.
The dataset consists of 550 audio samples, evenly distributed across 11 task dimensions (50 samples per task): gender, age, timbre, emotion, pitch, rhythm, speed, speaking style, vocal activity, scenario, and event type. The first eight tasks are based on Chinese audio clips sampled from 400 public podcasts. The last three tasks each use 50 audio samples from an existing corpus: VocalSound for vocal sounds, CochlScene for environmental scenes, and AudioSet for events. All samples are shorter than 30 seconds, uniformly resampled to 24 kHz, and annotated by a professional team.
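The composition described above can be checked with a little arithmetic. The following is a minimal Python sketch, not the dataset's official loader: the task identifiers, constant names, and the `clip_is_valid` helper are hypothetical and chosen only for illustration, and the real field names in the released files may differ.

```python
# Task layout as described in the text. Identifier spellings are
# illustrative assumptions, not the dataset's actual field names.
SPEECH_TASKS = [
    "gender", "age", "timbre", "emotion",
    "pitch", "rhythm", "speed", "speaking_style",
]

# The last three tasks and the public corpus each one draws from.
SOUND_TASKS = {
    "vocal_activity": "VocalSound",
    "scenario": "CochlScene",
    "event_type": "AudioSet",
}

SAMPLES_PER_TASK = 50           # 550 samples / 11 tasks
TARGET_SAMPLE_RATE_HZ = 24_000  # all clips are resampled to 24 kHz
MAX_DURATION_S = 30.0           # all clips are shorter than 30 seconds


def expected_counts():
    """Return the expected number of samples for every task."""
    counts = {task: SAMPLES_PER_TASK for task in SPEECH_TASKS}
    counts.update({task: SAMPLES_PER_TASK for task in SOUND_TASKS})
    return counts


def clip_is_valid(sample_rate_hz, num_frames):
    """Check a clip against the stated sample rate and length limits."""
    duration_s = num_frames / sample_rate_hz
    return sample_rate_hz == TARGET_SAMPLE_RATE_HZ and duration_s < MAX_DURATION_S


counts = expected_counts()
total = sum(counts.values())
speech_total = sum(counts[task] for task in SPEECH_TASKS)
print(f"{len(counts)} tasks, {total} samples, {speech_total} from podcast speech")
```

A check like `clip_is_valid` could be run over each file's header after download to confirm that a local copy matches the stated 24 kHz, under-30-second constraints.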