4 months ago

xLSTM-SENet: xLSTM for Single-Channel Speech Enhancement

Kühne Nikolai Lund ; Østergaard Jan ; Jensen Jesper ; Tan Zheng-Hua

Abstract

While attention-based architectures, such as Conformers, excel in speech enhancement, they face challenges such as scalability with respect to input sequence length. In contrast, the recently proposed Extended Long Short-Term Memory (xLSTM) architecture offers linear scalability. However, xLSTM-based models remain unexplored for speech enhancement. This paper introduces xLSTM-SENet, the first xLSTM-based single-channel speech enhancement system. A comparative analysis reveals that xLSTM, and notably even LSTM, can match or outperform state-of-the-art Mamba- and Conformer-based systems across various model sizes in speech enhancement on the VoiceBank+DEMAND dataset. Through ablation studies, we identify key architectural design choices, such as exponential gating and bidirectionality, contributing to its effectiveness. Our best xLSTM-based model, xLSTM-SENet2, outperforms state-of-the-art Mamba- and Conformer-based systems of similar complexity on the VoiceBank+DEMAND dataset.
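The abstract names exponential gating, bidirectionality, and linear scaling in sequence length as the relevant properties. The following is a minimal scalar sketch of those three ideas, not the authors' implementation: the function names (`exp_gated_scan`, `bidirectional_scan`), the weights, the missing hidden-to-hidden recurrence, and the choice to pair forward and backward outputs are all assumptions made for illustration. The stabilizer state `m` follows the log-space trick commonly used with exponential gates so that the gate values never overflow.

```python
import math


def log_sigmoid(a):
    """Numerically stable log(sigmoid(a))."""
    return min(a, 0.0) - math.log1p(math.exp(-abs(a)))


def exp_gated_scan(xs, w_i=0.5, w_f=1.0, w_z=1.0, w_o=1.0):
    """Scalar recurrent cell with an exponential input gate.

    Hypothetical weights; one pass over xs, so cost is linear in len(xs).
    """
    c = 0.0  # cell state
    n = 0.0  # normalizer state
    m = 0.0  # stabilizer: running maximum in log space
    hs = []
    for x in xs:
        log_i = w_i * x                  # log of the exponential input gate
        log_f = log_sigmoid(w_f * x)     # log of a sigmoid forget gate
        z = math.tanh(w_z * x)           # candidate cell input
        o = math.exp(log_sigmoid(w_o * x))  # output gate in (0, 1]

        # Rescale both gates by the running maximum so each is at most 1.
        m_new = max(log_f + m, log_i)
        i_s = math.exp(log_i - m_new)
        f_s = math.exp(log_f + m - m_new)

        c = f_s * c + i_s * z
        n = f_s * n + i_s
        hs.append(o * c / n)
        m = m_new
    return hs


def bidirectional_scan(xs, **weights):
    """Run the cell forward and backward and pair the outputs per step."""
    fwd = exp_gated_scan(xs, **weights)
    bwd = exp_gated_scan(xs[::-1], **weights)[::-1]
    return list(zip(fwd, bwd))
```

Because `c` and `n` accumulate the same gate weights, `c / n` is a weighted average of the `tanh` candidates, so each output stays within [-1, 1] even for very large inputs. A real model would use vector or matrix states and learned projections; this sketch only shows the gating arithmetic.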

Code Repositories

nikolaikyhne/xlstm-senet
Official
pytorch
Mentioned in GitHub

Benchmarks

Benchmark: speech-enhancement-on-demand
Methodology: xLSTM-SENet2
Metrics:
CBAK: 3.98
COVL: 4.27
CSIG: 4.78
PESQ (wb): 3.53
Parameters (M): 2.27
STOI: 0.96
