SPGM: Prioritizing Local Features for enhanced speech separation performance

Abstract

Dual-path is a popular architecture for speech separation models (e.g., Sepformer). It splits long sequences into overlapping chunks, which its intra- and inter-blocks process to model intra-chunk local features and inter-chunk global relationships, respectively. However, it has been found that inter-blocks, which comprise half of a dual-path model's parameters, contribute minimally to performance. Thus, we propose the Single-Path Global Modulation (SPGM) block to replace inter-blocks. SPGM is named after its structure, which consists of a parameter-free global pooling module followed by a modulation module comprising only 2% of the model's total parameters. The SPGM block allows all transformer layers in the model to be dedicated to local feature modelling, making the overall model single-path. SPGM achieves 22.1 dB SI-SDRi on WSJ0-2Mix and 20.4 dB SI-SDRi on Libri2Mix, exceeding the performance of Sepformer by 0.5 dB and 0.3 dB respectively, and matches the performance of recent SOTA models with up to 8 times fewer parameters. Model and weights are available at huggingface.co/yipjiaqi/spgm
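The chunk-then-modulate idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of mean pooling over all frames, and the per-channel sigmoid gate (with hypothetical `weight` and `bias` parameters) are assumptions made for illustration only. The abstract states only that pooling is parameter-free and that the modulation module is small.

```python
import math

def split_into_chunks(frames, chunk_len, hop):
    """Split a [time][dim] list into overlapping chunks, zero-padding the last one."""
    dim = len(frames[0])
    chunks = []
    for start in range(0, max(len(frames) - chunk_len, 0) + hop, hop):
        chunk = [list(f) for f in frames[start:start + chunk_len]]
        while len(chunk) < chunk_len:
            chunk.append([0.0] * dim)  # padding frames also enter the mean below
        chunks.append(chunk)
    return chunks

def global_pool(chunks):
    """Parameter-free pooling (assumed: mean over every frame of every chunk)."""
    dim = len(chunks[0][0])
    count = sum(len(c) for c in chunks)
    return [sum(f[d] for c in chunks for f in c) / count for d in range(dim)]

def modulate(chunks, context, weight, bias):
    """Assumed modulation: a per-channel sigmoid gate driven by the global context."""
    gate = [1.0 / (1.0 + math.exp(-(w * g + b)))
            for w, g, b in zip(weight, context, bias)]
    return [[[x * s for x, s in zip(f, gate)] for f in c] for c in chunks]

# Toy input: 10 frames with 2 channels each.
frames = [[float(t), 1.0] for t in range(10)]
chunks = split_into_chunks(frames, chunk_len=4, hop=2)   # 50% overlap
context = global_pool(chunks)                            # one global vector
out = modulate(chunks, context, weight=[0.0, 0.0], bias=[0.0, 0.0])
```

In this sketch the only learnable quantities are `weight` and `bias`, which mirrors the abstract's point that global information can be injected with very few parameters while the remaining layers focus on local, intra-chunk modelling.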

Benchmarks

| Benchmark | Methodology | MACs (G) | Number of parameters (M) | SI-SDRi |
| --- | --- | --- | --- | --- |
| speech-separation-on-wsj0-2mix | SPGM + DM | 77 | 26.2 | 22.7 |
| speech-separation-on-wsj0-2mix | SPGM | 77 | 26.2 | 22.1 |
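For readers unfamiliar with the SI-SDRi metric reported above, the following is a minimal sketch of the commonly used definition: scale-invariant SDR of the separated estimate minus that of the unprocessed mixture, both measured against the clean reference. The function names and the small `eps` stabiliser are illustrative assumptions; this is not the evaluation code used to produce the benchmark numbers.

```python
import math

def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length signals."""
    # Remove the mean from both signals, as is common practice.
    est_mean = sum(estimate) / len(estimate)
    ref_mean = sum(reference) / len(reference)
    est = [e - est_mean for e in estimate]
    ref = [r - ref_mean for r in reference]
    # Project the estimate onto the reference to get the target component.
    scale = sum(e * r for e, r in zip(est, ref)) / (sum(r * r for r in ref) + eps)
    target = [scale * r for r in ref]
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))

def si_sdr_improvement(estimate, mixture, reference):
    """SI-SDRi: gain of the estimate over the unprocessed mixture, in dB."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)

# Toy signals: the interferer is orthogonal to the reference.
reference = [1.0, -1.0, 1.0, -1.0]
interferer = [1.0, 1.0, -1.0, -1.0]
mixture = [r + i for r, i in zip(reference, interferer)]
estimate = [r + 0.1 * i for r, i in zip(reference, interferer)]  # 10x less leakage
improvement = si_sdr_improvement(estimate, mixture, reference)   # about 20 dB
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score unchanged, which is what makes the metric scale-invariant.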
