Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses
Shengkui Zhao, Trung Hieu Nguyen, Bin Ma

Abstract
The deep complex U-Net and the convolutional recurrent network (CRN) achieve state-of-the-art performance for monaural speech enhancement. Both are encoder-decoder structures with skip connections, and both rely heavily on the representation power of their complex-valued convolutional layers. In this paper, we propose a complex convolutional block attention module (CCBAM) that boosts the representation power of complex-valued convolutional layers by constructing more informative features. The CCBAM is a lightweight, general module that can be easily integrated into any complex-valued convolutional layer. We integrate the CCBAM with the deep complex U-Net and the CRN to improve their speech enhancement performance. We further propose a mixed loss function that jointly optimizes the complex models in both the time-frequency (TF) domain and the time domain. By combining the CCBAM and the mixed loss, we form a new end-to-end (E2E) complex speech enhancement framework. Ablation experiments and objective evaluations show the superior performance of the proposed approaches (https://github.com/modelscope/ClearerVoice-Studio).
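The idea of a loss that mixes a time-domain term with a TF-domain term can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes negative SI-SNR as the time-domain term and mean squared error on the real and imaginary STFT parts as the TF-domain term. The weight `alpha`, the frame sizes, and all function names are hypothetical choices made for this example.

```python
import cmath
import math


def stft(signal, frame_len=8, hop=4):
    """Naive STFT with a Hann window; returns a list of frames of complex bins."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        # Direct DFT of the windowed frame, keeping the non-negative frequencies.
        bins = [sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len))
                for k in range(frame_len // 2 + 1)]
        frames.append(bins)
    return frames


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB (higher is better)."""
    mean_e = sum(estimate) / len(estimate)
    mean_t = sum(target) / len(target)
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target; the remainder counts as noise.
    scale = sum(a * b for a, b in zip(e, t)) / (sum(b * b for b in t) + eps)
    proj = [scale * b for b in t]
    noise = [a - p for a, p in zip(e, proj)]
    ratio = (sum(p * p for p in proj) + eps) / (sum(n * n for n in noise) + eps)
    return 10 * math.log10(ratio)


def spectral_mse(estimate, target):
    """Mean squared error over real and imaginary parts of the two STFTs."""
    total, count = 0.0, 0
    for frame_e, frame_t in zip(stft(estimate), stft(target)):
        for bin_e, bin_t in zip(frame_e, frame_t):
            total += (bin_e.real - bin_t.real) ** 2 + (bin_e.imag - bin_t.imag) ** 2
            count += 1
    return total / count


def mixed_loss(estimate, target, alpha=0.5):
    """Assumed weighting: alpha * (-SI-SNR) + (1 - alpha) * spectral MSE."""
    return alpha * (-si_snr(estimate, target)) + (1 - alpha) * spectral_mse(estimate, target)


# Toy signals: a clean sinusoid and a copy corrupted by a second sinusoid.
clean = [math.sin(2 * math.pi * 0.1 * n) for n in range(32)]
noisy = [c + 0.3 * math.sin(2 * math.pi * 0.37 * n + 1.0)
         for n, c in enumerate(clean)]
print(mixed_loss(noisy, clean), mixed_loss(clean, clean))
```

In a training setup the same two terms would be computed on tensors with automatic differentiation; the pure-Python version here only shows how the two domains contribute to one scalar objective.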
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| speech-enhancement-on-deep-noise-suppression | FRCRN | PESQ-WB: 3.23 |
| speech-enhancement-on-demand | D2Former | PESQ (wb): 3.43; Para. (M): 0.86 |
| speech-enhancement-on-interspeech-2020-deep | DCCRN-M | PESQ-NB: 3.15 |
| speech-enhancement-on-interspeech-2020-deep | DCCRN | PESQ-NB: 3.04 |
| speech-enhancement-on-interspeech-2020-deep | DCCRN-MC | PESQ-NB: 3.21 |
| speech-enhancement-on-wsj0-demand-rnnoise | DCCRN-M | PESQ-NB: 3.28 |
| speech-enhancement-on-wsj0-demand-rnnoise | DCUNet | PESQ-NB: 3.25 |
| speech-enhancement-on-wsj0-demand-rnnoise | DCUNet-MC | PESQ-NB: 3.44 |