Free Spoken Digit Dataset (FSDD) Digit Recognition Audio Dataset

Date: a year ago

Size: 15.67 MB

Publish URL: github.com

License: CC BY-SA 4.0

The Free Spoken Digit Dataset (FSDD) is a simple audio/speech dataset consisting of recordings of spoken digits, stored as wav files with a sampling rate of 8 kHz. The recordings have been trimmed to minimize silence at the beginning and end. The dataset is open, meaning it is expected to grow over time as new recordings are contributed.

The FSDD dataset currently includes (as of July 2024):

  • 6 different speakers
  • 3,000 recordings (50 of each digit per speaker)
  • English pronunciations

The files in the dataset are named in the format {digitLabel}_{speakerName}_{index}.wav. For example, the file 7_jackson_32.wav is the recording with index 32 of the digit 7 by the speaker jackson.
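The naming format can be split back into its three fields with a short parser. The following is a minimal Python sketch, not part of the dataset's own tooling: the names `FSDD_NAME`, `Recording`, and `parse_fsdd_filename` are hypothetical, and the pattern assumes that the digit label is a single digit from 0 to 9 and that speaker names contain no underscores.

```python
import re
from typing import NamedTuple

# Assumed pattern for {digitLabel}_{speakerName}_{index}.wav:
# one digit, a speaker name without underscores, and a numeric index.
FSDD_NAME = re.compile(r"^(?P<digit>\d)_(?P<speaker>[^_]+)_(?P<index>\d+)\.wav$")


class Recording(NamedTuple):
    digit: int    # the spoken digit, used as the class label
    speaker: str  # the speaker's name as written in the file name
    index: int    # the recording index for this digit and speaker


def parse_fsdd_filename(filename: str) -> Recording:
    """Split an FSDD-style file name into (digit, speaker, index)."""
    match = FSDD_NAME.match(filename)
    if match is None:
        raise ValueError(f"not an FSDD-style file name: {filename!r}")
    return Recording(int(match["digit"]), match["speaker"], int(match["index"]))


print(parse_fsdd_filename("7_jackson_32.wav"))
# Recording(digit=7, speaker='jackson', index=32)
```

Because the label is carried in the file name, a parser like this is enough to build a labeled list of recordings without a separate metadata file.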

The FSDD dataset is available for academic research, and the community is encouraged to contribute new recordings. All contributed recordings should be mono 8 kHz wav files, trimmed to minimize silence at the beginning and end.
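Before contributing a recording, the mono and 8 kHz requirements can be checked with Python's standard `wave` module. This is a minimal sketch under stated assumptions: `check_wav_format` is a hypothetical helper, not part of the dataset's tooling, and it does not check whether silence has been trimmed.

```python
import wave
from typing import List

EXPECTED_RATE_HZ = 8000  # 8 kHz sampling rate stated for the dataset
EXPECTED_CHANNELS = 1    # mono


def check_wav_format(path: str) -> List[str]:
    """Return a list of format problems; an empty list means the file matches."""
    problems = []
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
    if channels != EXPECTED_CHANNELS:
        problems.append(f"expected mono, found {channels} channels")
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"expected {EXPECTED_RATE_HZ} Hz, found {rate} Hz")
    return problems
```

Calling `check_wav_format` on a candidate file returns an empty list when the file is mono at 8 kHz, and otherwise one message per mismatch.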

FSDD.torrent
Seeding: 1 | Downloading: 0 | Completed: 166 | Total downloads: 349
  • FSDD/
    • README.md (1.6 KB)
    • README.txt (3.2 KB)
    • data/
      • free-spoken-digit-dataset-master.zip (15.67 MB)
