Abstract

In the field of speech enhancement, time-domain methods have difficulty achieving both high performance and efficiency. Recently, dual-path models have been adopted to represent long sequential features, but they still have limited representations and poor memory efficiency. In this study, we propose the Multi-view Attention Network for Noise ERasure (MANNER), which consists of a convolutional encoder-decoder with a multi-view attention block and is applied to time-domain signals. MANNER efficiently extracts three different representations from noisy speech and estimates high-quality clean speech. We evaluated MANNER on the VoiceBank-DEMAND dataset in terms of five objective speech quality metrics. Experimental results show that MANNER achieves state-of-the-art performance while efficiently processing noisy speech.

Hyun Joon Park Byung Ha Kang Wooseok Shin Jin Sob Kim Sung Won Han

MANNER: Multi-view Attention Network for Noise Erasure
