3 months ago

Singing Voice Separation with Deep U-Net Convolutional Networks

Tillman Weyde, Aparna Kumar, Rachel Bittner, Nicola Montecchio, Eric Humphrey, Andreas Jansson

Abstract

The decomposition of a music audio signal into its vocal and backing track components is analogous to image-to-image translation, where a mixed spectrogram is transformed into its constituent sources. We propose a novel application of the U-Net architecture, initially developed for medical imaging, to the task of source separation, given its proven capacity for recreating the fine, low-level detail required for high-quality audio reproduction. Through both quantitative evaluation and subjective assessment, experiments demonstrate that the proposed algorithm achieves state-of-the-art performance.
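To make the U-shaped structure concrete, the following is a minimal shape-tracing sketch in plain Python. It computes no audio and learns nothing; it only follows a spectrogram's dimensions (frequency bins, time frames, channels) through a stack of stride-2 encoder stages and the mirrored decoder, and shows where skip connections concatenate encoder features onto the upsampled decoder features. The function name `unet_shape_trace`, the depth, the channel widths, and the input size are all illustrative assumptions and are not taken from the abstract.

```python
def unet_shape_trace(freq_bins, frames, depth=3, base_channels=16):
    """Trace (freq, time, channels) through a hypothetical U-shaped network.

    Every encoder stage halves both spatial dimensions; every decoder stage
    doubles them and concatenates the encoder output of matching resolution.
    """
    scale = 2 ** depth
    if freq_bins % scale or frames % scale:
        raise ValueError("input dimensions must be divisible by 2**depth")

    trace = [("input", (freq_bins, frames, 1))]
    f, t = freq_bins, frames
    encoder_shapes = []

    # Encoder: halve resolution, double the channel count at each level.
    for level in range(depth):
        f, t = f // 2, t // 2
        shape = (f, t, base_channels * 2 ** level)
        encoder_shapes.append(shape)
        trace.append((f"enc{level + 1}", shape))

    # Decoder: double resolution, then concatenate the matching encoder
    # output along the channel axis (the skip connection).
    for level in range(depth - 1, 0, -1):
        f, t = f * 2, t * 2
        up_channels = base_channels * 2 ** (level - 1)
        skip = encoder_shapes[level - 1]
        assert skip[:2] == (f, t), "skip must match decoder resolution"
        trace.append((f"dec{depth - level}+skip", (f, t, up_channels + skip[2])))

    # Final stage restores the input resolution with a single output channel.
    trace.append(("output", (f * 2, t * 2, 1)))
    return trace


if __name__ == "__main__":
    for stage, shape in unet_shape_trace(512, 128):
        print(f"{stage:>10}: {shape}")
```

The skip connections are what let the decoder see full-resolution detail that the downsampling path would otherwise discard, which is the property the abstract points to when it mentions fine, low-level detail.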

Benchmarks

Benchmark | Methodology | Metrics
speech-separation-on-ikala | U-Net | NSDR: 11.094 (Vocal); 14.435 (Instrumental)
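For readers unfamiliar with the metric, NSDR (normalized signal-to-distortion ratio) is commonly defined as the SDR of the separated estimate minus the SDR obtained by using the unprocessed mixture as the estimate, so it measures the improvement over doing nothing. The sketch below illustrates that subtraction in plain Python. It assumes a simplified energy-ratio SDR; standard toolkits such as BSS Eval compute SDR with an additional projection step, so this is not the procedure behind the figures listed above. The function names `sdr` and `nsdr` and the toy signals are hypothetical.

```python
import math


def sdr(reference, estimate):
    """Simplified SDR in dB: 10 * log10(||ref||^2 / ||ref - est||^2).

    Assumption: plain energy ratio, without the BSS Eval projection step.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0:
        raise ValueError("reference signal is silent")
    if error_energy == 0:
        return math.inf  # perfect reconstruction
    return 10 * math.log10(signal_energy / error_energy)


def nsdr(reference, estimate, mixture):
    """SDR gain of the estimate over using the raw mixture as the estimate."""
    return sdr(reference, estimate) - sdr(reference, mixture)


if __name__ == "__main__":
    # Toy signals: the estimate keeps 10% of the backing track as residue.
    vocal = [1.0, -1.0, 1.0, -1.0]
    backing = [0.5, 0.5, -0.5, -0.5]
    mixture = [v + b for v, b in zip(vocal, backing)]
    estimate = [v + 0.1 * b for v, b in zip(vocal, backing)]
    print(f"NSDR: {nsdr(vocal, estimate, mixture):.3f} dB")
```

In this toy case the residual interference is scaled by 0.1 relative to the mixture, so the error energy drops by a factor of 100 and the NSDR comes out at about 20 dB.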
