5 months ago

SepIt: Approaching a Single Channel Speech Separation Bound

Shahar Lutati; Eliya Nachmani; Lior Wolf

Abstract

We present an upper bound for the single-channel speech separation task, based on an assumption regarding the nature of short segments of speech. Using the bound, we show that while recent methods have made significant progress for a few speakers, there is room for improvement for five and ten speakers. We then introduce a deep neural network, SepIt, that iteratively improves the estimates of the different speakers. At test time, SepIt runs a varying number of iterations per test sample, based on a mutual information criterion that arises from our analysis. In an extensive set of experiments, SepIt outperforms state-of-the-art neural networks for 2, 3, 5, and 10 speakers.

Benchmarks

Benchmark	Methodology	Metrics
speech-separation-on-libri10mix	SepIt	SI-SDRi: 8.2
speech-separation-on-libri5mix	SepIt	SI-SDRi: 13.7
speech-separation-on-wsj0-2mix	SepIt	SI-SDRi: 22.4
speech-separation-on-wsj0-3mix	SepIt	SI-SDRi: 20.1
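
The SI-SDRi figures above are scale-invariant signal-to-distortion ratio improvements, in dB, of the separated output over the unprocessed mixture. As a rough illustration of how such a number is typically computed for one speaker, here is a minimal NumPy sketch of the standard SI-SDR definition. The function names and the `eps` stabilizer are assumptions made for this example, not taken from the SepIt code, and the permutation search over speakers that a full evaluation needs is omitted.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB between two 1-D signals (illustrative helper)."""
    # Remove the mean so the measure ignores DC offsets.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get the scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    # Whatever is left after removing the target counts as distortion.
    noise = estimate - target
    return 10.0 * np.log10((np.dot(target, target) + eps) / (np.dot(noise, noise) + eps))


def si_sdr_improvement(estimate, mixture, reference):
    """SI-SDRi: gain of the separated estimate over the raw mixture, in dB."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score unchanged, which is what makes the metric scale-invariant.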
