Voice Separation with an Unknown Number of Multiple Speakers

Eliya Nachmani, Yossi Adi, Lior Wolf

Abstract

We present a new method for separating a mixed audio sequence, in which multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while maintaining the speaker in each output channel fixed. A different model is trained for every number of possible speakers, and the model with the largest number of speakers is employed to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.
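
The speaker-count selection step described in the abstract can be illustrated with a minimal sketch. It assumes that the model trained for the largest speaker count is run first, that output channels with very low energy are treated as silent, and that the number of remaining channels picks the model used for the final separation. The function names, the `models` dictionary, and the -40 dB threshold are hypothetical choices made for illustration and are not taken from the paper or from the official repository.

```python
import math


def channel_energy_db(channel):
    """Mean power of one output channel in dB (a floor avoids log of zero)."""
    power = sum(x * x for x in channel) / max(len(channel), 1)
    return 10.0 * math.log10(max(power, 1e-12))


def count_active_speakers(channels, silence_db=-40.0):
    """Count channels whose mean power exceeds the (assumed) silence threshold."""
    return sum(1 for ch in channels if channel_energy_db(ch) > silence_db)


def select_model(models, mixture, silence_db=-40.0):
    """Pick a separation model based on the output of the largest one.

    models: dict mapping a speaker count to a callable that takes the
            mixture and returns a list of separated channels (hypothetical
            interface, one entry per trained model).
    """
    largest = max(models)
    # Run the model trained for the most speakers and count non-silent outputs.
    estimate = count_active_speakers(models[largest](mixture), silence_db)
    # Fall back to the closest available count if no model matches exactly.
    chosen = min(sorted(models), key=lambda c: (abs(c - estimate), c))
    return chosen, models[chosen](mixture)
```

A plain energy threshold is the simplest possible activity test and is used here only to keep the sketch self-contained; a dedicated speech activity detector could take its place.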

Code Repositories

Mack189/gdprnn: mindspore, mentioned in GitHub
muhammad-ahmed-ghani/svoice_demo: pytorch, mentioned in GitHub
facebookresearch/svoice: official, pytorch, mentioned in GitHub

Benchmarks

Benchmark | Methodology | Metrics
speech-separation-on-whamr | VSUNOS | SI-SDRi: 12.2
speech-separation-on-wsj0-2mix | Gated DualPathRNN | SI-SDRi: 20.12
speech-separation-on-wsj0-3mix | Gated DualPathRNN | SI-SDRi: 16.85
speech-separation-on-wsj0-4mix | Gated DualPathRNN | SI-SDRi: 12.88
speech-separation-on-wsj0-5mix | Gated DualPathRNN | SI-SDRi: 10.56
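
The SI-SDRi values above report the improvement in scale-invariant signal-to-distortion ratio, in dB, of a separated output over the unprocessed mixture. The sketch below follows the standard definition of the metric for a single source, assuming zero-mean normalization of both signals; the function names are illustrative, and the search over speaker permutations that multi-speaker evaluation normally requires is omitted.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB between an estimate and a reference signal."""
    n = len(reference)
    mean_e = sum(estimate) / n
    mean_r = sum(reference) / n
    est = [x - mean_e for x in estimate]
    ref = [x - mean_r for x in reference]

    # Project the estimate onto the reference to get the scaled target.
    dot = sum(a * b for a, b in zip(est, ref))
    ref_energy = sum(b * b for b in ref)
    scale = dot / (ref_energy + eps)
    target = [scale * b for b in ref]

    # Everything not explained by the scaled reference counts as distortion.
    noise = [a - t for a, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def si_sdr_improvement(estimate, mixture, reference):
    """SI-SDRi: SI-SDR of the estimate minus SI-SDR of the raw mixture."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score unchanged, which is what makes the metric scale-invariant.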
