Date

9 months ago

Size

440.73 MB

Organization

Paper URL

2507.16632

License

Apache 2.0

Tags

Audio and Speech Processing

StepEval Audio Paralinguistic is an evaluation dataset for paralinguistic audio understanding, released by the StepFun AI team in 2025 alongside the paper "Step-Audio 2 Technical Report". It is designed to evaluate how well AI models understand paralinguistic information in speech, such as gender, age, tone, and emotion. The dataset consists of 550 speech samples, evenly distributed across 11 task dimensions (50 samples per task): gender, age, timbre, emotion, pitch, rhythm, speed, speaking style, vocal activity, scenario, and event type. The first eight tasks are based on Chinese audio clips sampled from 400 public podcasts, while the last three tasks use 50 audio samples each from VocalSound (vocal sounds), CochlScene (environmental scenes), and AudioSet (events). All samples are under 30 seconds long, uniformly resampled to 24 kHz, and annotated by a professional team.
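The composition described above can be summarized as a small sanity-check sketch. The task names, per-task count, 30-second limit, and 24 kHz rate come from the description; the task-to-source mapping is inferred from the labels in parentheses, and the function names `expected_counts` and `meets_audio_spec` are hypothetical helpers, not part of the dataset or any official tooling.

```python
# Tasks built from Chinese podcast speech, as listed in the description.
SPEECH_TASKS = [
    "gender", "age", "timbre", "emotion",
    "pitch", "rhythm", "speed", "speaking style",
]

# Assumed mapping of the last three tasks to their source corpora,
# inferred from the parenthetical labels in the description.
SOUND_TASKS = {
    "vocal activity": "VocalSound",
    "scenario": "CochlScene",
    "event type": "AudioSet",
}

SAMPLES_PER_TASK = 50  # 550 samples spread evenly over 11 tasks


def expected_counts():
    """Return the expected number of samples for each of the 11 tasks."""
    counts = {task: SAMPLES_PER_TASK for task in SPEECH_TASKS}
    counts.update({task: SAMPLES_PER_TASK for task in SOUND_TASKS})
    return counts


def meets_audio_spec(sample_rate_hz, num_frames,
                     max_seconds=30.0, target_rate_hz=24_000):
    """Check a clip against the stated spec: 24 kHz and under 30 seconds."""
    duration_s = num_frames / sample_rate_hz
    return sample_rate_hz == target_rate_hz and duration_s < max_seconds


if __name__ == "__main__":
    counts = expected_counts()
    print(len(counts), "tasks,", sum(counts.values()), "samples in total")
    print("10 s clip at 24 kHz passes:", meets_audio_spec(24_000, 240_000))
```

A check like this is only a quick guard when loading the data: it confirms that the per-task split adds up to the stated total and flags clips whose sample rate or duration departs from the description.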

StepEval-Audio-Paralinguistic.torrent

Seeding: 1, Downloading: 0, Completed: 24, Total Downloads: 144

StepEval-Audio-Paralinguistic/
- README.md
  1.77 KB
- README.txt
  3.54 KB

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at support@hyper.ai for prompt review and removal.