Mel-frequency Cepstrum MFCCs

Date

a year ago

Mel-frequency cepstral coefficients (MFCCs) are a widely used feature representation in sound processing, especially in speech recognition and speaker recognition. They were proposed by Davis and Mermelstein in 1980. They are based on a linear cosine transform of the log energy spectrum computed on the nonlinear mel scale of frequency.
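
As an illustration of that cosine transform, the coefficients are commonly written in the following form. The symbols are chosen here for illustration and are not taken from this article: $E_m$ is the energy at the output of the $m$-th mel filter, $M$ is the number of filters, and $c_n$ is the $n$-th coefficient.

$$
c_n = \sum_{m=1}^{M} \log(E_m)\,\cos\!\left[\frac{\pi n}{M}\left(m - \tfrac{1}{2}\right)\right], \qquad n = 1, 2, \dots, N
$$

With $N = 12$ this gives the 12 coefficients that are typically kept per frame.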

MFCCs are the coefficients that together make up a mel-frequency cepstrum (MFC). They are derived from a cepstral representation of an audio segment. The difference from the ordinary cepstrum is that the frequency bands are equally spaced on the mel scale, which approximates the response of the human auditory system more closely than the linearly spaced frequency bands of the ordinary cepstrum. This frequency warping can give a better representation of sound in some applications, for example audio compression. The calculation of MFCCs can be roughly divided into these steps: reading the audio file, pre-emphasis, framing, windowing, Fourier transform, passing the spectrum through a mel filter bank to obtain the mel spectrum, and cepstral analysis of the mel spectrum (taking the logarithm and then a discrete cosine transform). MFCCs usually comprise 12 coefficients, which are combined with the frame energy to give a 13-dimensional feature vector describing each frame of speech.
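
The steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (`hz_to_mel`, `mel_to_hz`, `mel_filterbank`, `mfcc`) and all default parameters (25 ms frames, 10 ms step, 512-point FFT, 26 filters, pre-emphasis factor 0.97, Hamming window) are choices made for this example and are not specified in the article. The mel conversion uses one common convention, m = 2595 log10(1 + f/700). File reading is skipped; the signal is assumed to be already loaded as a 1-D array at least one frame long.

```python
import numpy as np

def hz_to_mel(f):
    # One common mel-scale convention (an assumption of this sketch).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, centre):      # rising slope
            fbank[i - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling slope
            fbank[i - 1, k] = (right - k) / (right - centre)
    return fbank

def mfcc(signal, sample_rate, frame_len=0.025, frame_step=0.010,
         n_fft=512, n_filters=26, n_ceps=12, pre_emph=0.97):
    # Pre-emphasis: first-order high-pass filter.
    emphasized = np.append(signal[0], signal[1:] - pre_emph * signal[:-1])

    # Framing: overlapping frames; trailing samples that do not fill
    # a whole frame are dropped.
    flen = int(round(frame_len * sample_rate))
    fstep = int(round(frame_step * sample_rate))
    n_frames = 1 + (len(emphasized) - flen) // fstep
    idx = np.arange(flen)[None, :] + fstep * np.arange(n_frames)[:, None]
    frames = emphasized[idx]

    # Windowing: taper each frame with a Hamming window.
    frames = frames * np.hamming(flen)

    # Fourier transform: power spectrum of each (zero-padded) frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # Mel spectrum: filter-bank energies, floored to avoid log(0).
    eps = np.finfo(float).eps
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_mel = np.log(np.maximum(power @ fbank.T, eps))

    # Cepstral analysis: DCT-II of the log mel energies, keeping
    # coefficients 1..n_ceps (the zeroth coefficient is skipped).
    n = np.arange(n_filters)
    k = np.arange(1, n_ceps + 1)
    basis = np.cos(np.pi / n_filters * (n[None, :] + 0.5) * k[:, None])
    ceps = log_mel @ basis.T

    # Combine the log frame energy with the 12 coefficients -> 13 dimensions.
    log_energy = np.log(np.maximum(power.sum(axis=1), eps))
    return np.column_stack([log_energy, ceps])

# Usage: one second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
features = mfcc(tone, sr)
print(features.shape)
```

With these defaults each row of `features` is one frame, holding the log frame energy followed by 12 cepstral coefficients. Production libraries typically add further steps such as liftering or delta features, which are left out here for brevity.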
