MANNER: Multi-view Attention Network for Noise Erasure

Hyun Joon Park, Byung Ha Kang, Wooseok Shin, Jin Sob Kim, Sung Won Han

Abstract

In speech enhancement, time-domain methods struggle to achieve both high performance and efficiency. Dual-path models have recently been adopted to represent long sequential features, but their representations remain limited and their memory efficiency is poor. In this study, we propose the Multi-view Attention Network for Noise ERasure (MANNER), which consists of a convolutional encoder-decoder with a multi-view attention block and is applied to time-domain signals. MANNER efficiently extracts three different representations from noisy speech and estimates high-quality clean speech. We evaluated MANNER on the VoiceBank-DEMAND dataset using five objective speech quality metrics. Experimental results show that MANNER achieves state-of-the-art performance while processing noisy speech efficiently.
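
The abstract mentions dual-path models without describing how they handle long sequences. As a rough illustration only, the sketch below assumes the common dual-path scheme: the sequence is split into overlapping chunks, then viewed once within each chunk (local) and once across chunks (global). The function names and the chunk_size and hop values are hypothetical, and this is not MANNER's implementation.

```python
# Illustrative only: how a dual-path model might view one long feature sequence.
# chunk_size and hop are hypothetical values, not taken from the paper.

def split_into_chunks(frames, chunk_size, hop):
    """Split a sequence into overlapping chunks, zero-padding the last one."""
    if chunk_size <= 0 or hop <= 0:
        raise ValueError("chunk_size and hop must be positive")
    chunks = []
    start = 0
    while True:
        chunk = list(frames[start:start + chunk_size])
        chunk += [0.0] * (chunk_size - len(chunk))  # pad a short final chunk
        chunks.append(chunk)
        if start + chunk_size >= len(frames):
            break
        start += hop
    return chunks


def local_view(chunks):
    """Intra-chunk view: each row is a short run of neighbouring frames."""
    return chunks


def global_view(chunks):
    """Inter-chunk view: each row gathers one position across all chunks."""
    return [list(column) for column in zip(*chunks)]


frames = [float(i) for i in range(10)]
chunks = split_into_chunks(frames, chunk_size=4, hop=2)
```

Each row of local_view(chunks) covers only chunk_size neighbouring frames, while each row of global_view(chunks) spans the whole sequence at a stride of hop. A dual-path model alternates sequence modelling over these two views, so every step works on short rows. The chunked tensor still has to be held in memory, which is one plausible reading of the memory cost the abstract refers to.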

Code Repositories

winddori2002/MANNER (official, PyTorch)

Benchmarks

Benchmark: speech-enhancement-on-demand
Methodology: MANNER
Metrics:
CBAK: 3.65
COVL: 3.91
CSIG: 4.53
PESQ (wb): 3.21
STOI: 95
