AISHELL-4 Multi-channel Chinese Conference Speech Database

Date: 2 years ago

Size: 48.38 GB

Organization: AISHELL

AISHELL-4 is a large-scale, real-recorded Mandarin speech dataset collected with an 8-channel circular microphone array for speech processing in conference scenarios. The dataset consists of 211 recorded conference sessions, each containing 4 to 8 speakers, with a total duration of 120 hours. It aims to bridge advanced research on multi-speaker processing and practical application scenarios in three aspects. Because the meetings are real recordings, AISHELL-4 provides realistic acoustics and the rich natural characteristics of conversational speech, such as short pauses, speech overlap, rapid speaker turns, and noise. AISHELL-4 also provides accurate transcriptions and speaker voice activity annotations for each meeting. This enables researchers to explore different aspects of conference processing, from individual tasks such as speech front-end processing, speech recognition, and speaker diarization, to multimodal modeling and joint optimization of related tasks. The research team also released a PyTorch-based training and evaluation framework as a baseline system to promote reproducible research in this field.
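Since the recordings come from an 8-channel array, a common first step for front-end work is to split a multi-channel recording into one sample stream per microphone. The following is a minimal sketch of that step using only the Python standard library. It assumes 16-bit PCM WAV input, which this page does not confirm as the dataset's actual audio format; `split_channels` and `make_synthetic_wav` are hypothetical names introduced for illustration, and the demo runs on synthetic data rather than on real AISHELL-4 files.

```python
import array
import io
import struct
import sys
import wave

NUM_CHANNELS = 8  # from the description above: 8-channel circular array


def split_channels(wav_file, expected_channels=NUM_CHANNELS):
    """Return one array of int16 samples per channel.

    Assumption: the input is 16-bit PCM WAV (not confirmed for AISHELL-4).
    `wav_file` may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        if wav.getnchannels() != expected_channels:
            raise ValueError(
                f"expected {expected_channels} channels, got {wav.getnchannels()}"
            )
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = wav.readframes(wav.getnframes())

    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    # Frames are interleaved: ch0, ch1, ..., ch7, ch0, ch1, ...
    return [samples[ch::expected_channels] for ch in range(expected_channels)]


def make_synthetic_wav(num_frames, channels=NUM_CHANNELS, rate=16000):
    """Hypothetical helper: build an in-memory WAV for the demo.

    The sample at (frame, ch) is ch * 100 + frame, so every channel is
    easy to recognize after de-interleaving. The sample rate is arbitrary.
    """
    values = [ch * 100 + frame for frame in range(num_frames) for ch in range(channels)]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<" + "h" * len(values), *values))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    per_channel = split_channels(make_synthetic_wav(4))
    for index, channel in enumerate(per_channel):
        print(index, list(channel))
```

For real recordings, a dedicated audio library would usually be more practical; the point here is only the interleaved-to-per-channel layout.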

AISHELL-4.torrent
Seeding: 1 | Downloading: 0 | Completed: 260 | Total downloads: 523
  • AISHELL-4/
    • README.md (1.68 KB)
    • README.txt (3.36 KB)
    • data/
      • AISHELL-4.zip (48.38 GB)
