3 months ago

Conditional Diffusion Probabilistic Model for Speech Enhancement

Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, Yu Tsao

Abstract

Speech enhancement is a critical component of many user-oriented audio applications, yet current systems still suffer from distorted and unnatural outputs. While generative models have shown strong potential in speech synthesis, they are still lagging behind in speech enhancement. This work leverages recent advances in diffusion probabilistic models, and proposes a novel speech enhancement algorithm that incorporates characteristics of the observed noisy speech signal into the diffusion and reverse processes. More specifically, we propose a generalized formulation of the diffusion probabilistic model named conditional diffusion probabilistic model that, in its reverse process, can adapt to non-Gaussian real noises in the estimated speech signal. In our experiments, we demonstrate strong performance of the proposed approach compared to representative generative models, and investigate the generalization capability of our models to other datasets with noise characteristics unseen during training.
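
The idea of letting the observed noisy signal enter the diffusion process can be illustrated with a minimal sketch. In the code below, the mean of each diffusion step interpolates between the clean signal `x0` and the noisy observation `y`, so that later steps drift toward the noisy speech rather than toward pure Gaussian noise. The abstract does not specify the schedule, the interpolation weight `m`, the variance `delta`, or any function names; all of these are assumptions chosen for illustration, not the authors' implementation.

```python
import numpy as np

def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.035):
    """Build an illustrative noise schedule (all values are assumptions)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    # Assumed interpolation weight: near 0 at the first step, growing toward 1.
    m = np.clip(np.sqrt((1.0 - alpha_bar) / np.sqrt(alpha_bar)), 0.0, 1.0)
    # Assumed variance of the Gaussian part; non-negative for m in [0, 1]
    # under the definition of m above.
    delta = (1.0 - alpha_bar) - m**2 * alpha_bar
    return alpha_bar, m, delta

def conditional_diffuse(x0, y, t, alpha_bar, m, delta, rng):
    """Sample x_t given clean speech x0 and noisy speech y at step index t."""
    # The mean mixes clean and noisy speech instead of using x0 alone.
    mean = np.sqrt(alpha_bar[t]) * ((1.0 - m[t]) * x0 + m[t] * y)
    noise = rng.standard_normal(x0.shape)
    return mean + np.sqrt(delta[t]) * noise

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = np.sin(np.linspace(0.0, 20.0, 1600))      # stand-in for clean speech
    y = x0 + 0.3 * rng.standard_normal(x0.shape)   # stand-in for noisy speech
    alpha_bar, m, delta = make_schedule()
    x_t = conditional_diffuse(x0, y, len(m) - 1, alpha_bar, m, delta, rng)
    print(x_t.shape, float(m[0]), float(m[-1]))
```

Setting `m` to zero everywhere reduces this sketch to the usual unconditional diffusion step, which is one way to see the conditional formulation as a generalization.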

Benchmarks

Benchmark: speech-enhancement-on-ears-wham
Methodology: CDiffuSE
Metrics:
DNSMOS: 2.87
ESTOI: 0.53
PESQ-WB: 1.60
POLQA: 1.81
SI-SDR: 8.35
SIGMOS: 2.08
