Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training
Sung-Feng Huang, Shun-Po Chuang, Da-Rong Liu, Yi-Chen Chen, Gene-Ping Yang, Hung-yi Lee

Abstract
Speech separation has advanced considerably with the highly successful permutation invariant training (PIT) approach, but the frequent label assignment switching that occurs during PIT training remains a problem, limiting convergence speed and achievable performance. In this paper, we propose self-supervised pre-training to stabilize the label assignment when training the speech separation model. Experiments over several types of self-supervised approaches, several typical speech separation models, and two different datasets showed that substantial improvements are achievable if a proper self-supervised approach is chosen.
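To make the label assignment issue concrete, below is a minimal sketch of a PIT-style objective: the loss is computed for every permutation of target sources, and the minimum is taken per utterance. It uses mean-squared error as a stand-in criterion (the paper's actual training loss may differ, e.g. an SI-SDR-based one), and all tensor shapes and names are illustrative assumptions, not the authors' implementation.

```python
import itertools
import torch

def pit_loss(estimates: torch.Tensor, targets: torch.Tensor):
    """Permutation invariant training (PIT) loss sketch.

    estimates, targets: (batch, n_sources, time) tensors.
    Returns the mean of the per-utterance minimum loss over all
    source permutations, plus the chosen permutation indices.
    """
    n_src = estimates.shape[1]
    per_perm_losses = []
    for perm in itertools.permutations(range(n_src)):
        permuted = targets[:, list(perm), :]
        # MSE per batch element under this label assignment
        per_perm_losses.append(((estimates - permuted) ** 2).mean(dim=(1, 2)))
    per_perm_losses = torch.stack(per_perm_losses, dim=1)  # (batch, n_src!)
    min_loss, best_perm = per_perm_losses.min(dim=1)       # best assignment per utterance
    return min_loss.mean(), best_perm
```

The `best_perm` index is the "label assignment" the abstract refers to: when it flips between permutations across training steps for the same utterance, the gradient targets change abruptly, which is the instability that the proposed self-supervised pre-training aims to reduce.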
Benchmarks
| Benchmark | Methodology | SDRi (dB) | SI-SDRi (dB) |
|---|---|---|---|
| speech-separation-on-libri2mix | Conv-TasNet (Libri1Mix speech enhancement pre-trained) | 14.6 | 14.1 |
| speech-separation-on-libri2mix | Conv-TasNet (Libri1Mix speech enhancement multi-task) | 14.1 | 13.7 |
| speech-separation-on-libri2mix | Conv-TasNet | 13.6 | 13.2 |
| speech-separation-on-wsj0-2mix | DPTNet (Libri1Mix speech enhancement pre-trained) | 21.5 | 21.3 |