High Fidelity Speech Enhancement with Band-split RNN

Jianwei Yu Yi Luo Hangting Chen Rongzhi Gu Chao Weng

Abstract

Despite rapid progress in speech enhancement (SE) research, enhancing the quality of desired speech in environments with strong noise and interfering speakers remains challenging. In this paper, we extend the recently proposed band-split RNN (BSRNN) model to full-band SE and personalized SE (PSE) tasks. To mitigate the effects of unstable high-frequency components in full-band speech, we apply bi-directional band-level modeling to the low-frequency subbands and uni-directional band-level modeling to the high-frequency subbands. For the PSE task, we incorporate a speaker enrollment module into BSRNN to utilize target speaker information. Moreover, we use a MetricGAN discriminator (MGD) and a multi-resolution spectrogram discriminator (MRSD) to improve perceptual quality metrics. Experimental results show that our system outperforms various top-ranking SE systems, achieves state-of-the-art (SOTA) results on the DNS-2020 test set, and ranks among the top 3 in the DNS-2023 challenge.
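The band-split idea in the abstract can be illustrated with a minimal sketch. It only shows how frequency bins might be partitioned into subbands and how each subband could be tagged for bi-directional or uni-directional band-level modeling. The function names, the band widths, and the cutoff bin are hypothetical choices made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import List, Tuple


def split_bands(num_bins: int, band_widths: List[int]) -> List[Tuple[int, int]]:
    """Partition frequency bins [0, num_bins) into contiguous (start, end) subbands.

    Bins left over after the listed widths are grouped into one final band.
    """
    bands = []
    start = 0
    for width in band_widths:
        if start >= num_bins:
            break
        end = min(start + width, num_bins)
        bands.append((start, end))
        start = end
    if start < num_bins:
        bands.append((start, num_bins))
    return bands


def band_directions(bands: List[Tuple[int, int]], cutoff_bin: int) -> List[str]:
    """Tag bands lying entirely below cutoff_bin as 'bi', all others as 'uni'."""
    return ["bi" if end <= cutoff_bin else "uni" for (_, end) in bands]


# Hypothetical toy setup: 16 frequency bins, narrow bands at low frequencies.
bands = split_bands(16, [2, 2, 4, 4])
directions = band_directions(bands, cutoff_bin=8)
# bands      is [(0, 2), (2, 4), (4, 8), (8, 12), (12, 16)]
# directions is ['bi', 'bi', 'bi', 'uni', 'uni']
```

In a full model, each tag would select the kind of recurrent layer that runs across the band dimension. Here the tags only make the low-frequency and high-frequency split explicit.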

Code Repositories

sungwon23/bsrnn (PyTorch, mentioned in GitHub)

Benchmarks

Benchmark: speech-enhancement-on-deep-noise-suppression

Methodology       PESQ-NB   PESQ-WB   SI-SDR-WB   STOI
BSRNN             3.79      3.32      -           98
BSRNN-16k         3.87      3.45      21.1        98.3
BSRNN-S           -         3.42      21.3        -
BSRNN-S + MGD     3.85      -         21.4        98.4
BSRNN-S + MRSD    3.89      3.53      21.4        98.4

A dash marks a metric that is not listed for that method.
