Hybrid Spectrogram and Waveform Source Separation

Alexandre Défossez

Abstract

Source separation models operate in either the spectrogram or the waveform domain. In this work, we show how to perform end-to-end hybrid source separation, letting the model decide which domain is best suited for each source, and even combining both. The proposed hybrid version of the Demucs architecture won the Music Demixing Challenge 2021 organized by Sony. This architecture also comes with additional improvements, such as compressed residual branches, local attention, and singular value regularization. Overall, a 1.4 dB improvement in Signal-to-Distortion Ratio (SDR) was observed across all sources as measured on the MusDB HQ dataset, an improvement confirmed by human subjective evaluation, with overall quality rated at 2.83 out of 5 (2.36 for the non-hybrid Demucs) and absence of contamination at 3.04 (against 2.37 for the non-hybrid Demucs and 2.44 for the second-ranking model submitted to the competition).
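Since the abstract reports its headline result as an SDR gain in dB, a minimal sketch of the metric may help. It assumes the simplified definition SDR = 10 log10(||s||^2 / ||s - s_hat||^2) for a reference source s and an estimate s_hat; the benchmark figures on this page may have been computed with a different evaluation protocol. The function name `sdr_db`, the `eps` stabilizer, and the toy sine signal are illustrative choices, not taken from the paper.

```python
import numpy as np


def sdr_db(reference: np.ndarray, estimate: np.ndarray, eps: float = 1e-8) -> float:
    """Simplified SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    `eps` is an assumed stabilizer that avoids division by zero on silence.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_power = np.sum(reference ** 2)
    error_power = np.sum((reference - estimate) ** 2)
    return float(10.0 * np.log10((signal_power + eps) / (error_power + eps)))


# Toy example: one second of a 440 Hz sine stands in for a reference source.
rng = np.random.default_rng(0)
t = np.arange(44100) / 44100.0
reference = np.sin(2 * np.pi * 440.0 * t)
noise = rng.standard_normal(t.shape)

# Two hypothetical estimates; the second has half the error amplitude.
estimate_a = reference + 0.10 * noise
estimate_b = reference + 0.05 * noise

sdr_a = sdr_db(reference, estimate_a)
sdr_b = sdr_db(reference, estimate_b)
print(f"SDR a: {sdr_a:.2f} dB, SDR b: {sdr_b:.2f} dB, gain: {sdr_b - sdr_a:.2f} dB")
```

Under this definition, halving the error amplitude raises the SDR by about 6 dB, which gives a sense of scale for the 1.4 dB improvement quoted above.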

Code Repositories

facebookresearch/demucs (official implementation, PyTorch)

Benchmarks

SDR values are in dB.

Benchmark                              Methodology    SDR (avg)  SDR (bass)  SDR (drums)  SDR (other)  SDR (vocals)
music-source-separation-on-musdb18     Hybrid Demucs  7.72       8.67        8.58         5.59         8.04
music-source-separation-on-musdb18-hq  Hybrid Demucs  7.68       8.76        8.24         5.59         8.13
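The reported average SDR appears to be the unweighted mean of the four per-source scores. The sketch below checks that assumption against the figures listed above; the dictionary keys and variable names are illustrative only.

```python
from statistics import mean

# Per-source SDR values (dB) as listed in the benchmark entries above.
per_source_sdr = {
    "musdb18": {"bass": 8.67, "drums": 8.58, "other": 5.59, "vocals": 8.04},
    "musdb18-hq": {"bass": 8.76, "drums": 8.24, "other": 5.59, "vocals": 8.13},
}
reported_avg = {"musdb18": 7.72, "musdb18-hq": 7.68}

# Assumption: the average is a plain mean over the four sources.
computed_avg = {
    name: round(mean(scores.values()), 2)
    for name, scores in per_source_sdr.items()
}

for name, avg in computed_avg.items():
    matches = abs(avg - reported_avg[name]) < 0.005
    print(f"{name}: computed {avg:.2f} dB, reported {reported_avg[name]:.2f} dB, match={matches}")
```

Both computed means agree with the reported averages to two decimal places, which is consistent with a simple mean over bass, drums, other, and vocals.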
