LaSAFT: Latent Source Attentive Frequency Transformation for Conditioned Source Separation

Woosung Choi, Minseok Kim, Jaehwa Chung, Soonyoung Jung

Abstract

Recent deep-learning approaches have shown that Frequency Transformation (FT) blocks can significantly improve spectrogram-based single-source separation models by capturing frequency patterns. The goal of this paper is to extend the FT block to fit the multi-source task. We propose the Latent Source Attentive Frequency Transformation (LaSAFT) block to capture source-dependent frequency patterns. We also propose the Gated Point-wise Convolutional Modulation (GPoCM), an extension of Feature-wise Linear Modulation (FiLM), to modulate internal features. By employing these two novel methods, we extend the Conditioned-U-Net (CUNet) for multi-source separation, and the experimental results indicate that our LaSAFT and GPoCM can improve the CUNet's performance, achieving state-of-the-art SDR performance on several MUSDB18 source separation tasks.
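To make the modulation idea in the abstract concrete, here is a minimal NumPy sketch. The `film` function follows the standard FiLM definition (a per-channel scale and shift). The `pocm` and `gated_pocm` functions are assumptions based on reading the name "Gated Point-wise Convolutional Modulation": a 1x1 convolution that mixes channels, followed by a sigmoid gate applied to the input. The abstract does not give the formula, so the function names, tensor layout `(channels, time, frequency)`, and the gating form are illustrative rather than the authors' implementation. In a conditioned model, `gamma`, `beta`, `weight`, and `bias` would be produced from the condition (the target source); here they are random placeholders.

```python
import numpy as np

def film(x, gamma, beta):
    """FiLM: scale and shift each channel independently.
    x: (C, T, F); gamma, beta: (C,)"""
    return gamma[:, None, None] * x + beta[:, None, None]

def pocm(x, weight, bias):
    """Point-wise (1x1) convolution: every output channel is a
    weighted sum of all input channels at the same (t, f) position.
    weight: (C_out, C_in); bias: (C_out,)"""
    return np.einsum("oc,ctf->otf", weight, x) + bias[:, None, None]

def gated_pocm(x, weight, bias):
    """Assumed gating form: sigmoid of the point-wise convolution
    output, multiplied element-wise with the input.
    Requires C_out == C_in so the gate matches the shape of x."""
    gate = 1.0 / (1.0 + np.exp(-pocm(x, weight, bias)))
    return gate * x

# Hypothetical shapes and random parameters, for illustration only.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 16))      # 4 channels, 8 frames, 16 bins
gamma = rng.standard_normal(4)
beta = rng.standard_normal(4)
weight = rng.standard_normal((4, 4))
bias = rng.standard_normal(4)

y_film = film(x, gamma, beta)
y_gated = gated_pocm(x, weight, bias)
print(y_film.shape, y_gated.shape)
```

With a diagonal `weight`, `pocm` reduces to `film`, which is one way to see the point-wise convolution as an extension of FiLM: the off-diagonal entries let the condition mix information across channels instead of only rescaling each one.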

Benchmarks

Benchmark: music-source-separation-on-musdb18
Methodology: LaSAFT+GPoCM
Metrics:
SDR (avg): 5.88
SDR (bass): 5.63
SDR (drums): 5.68
SDR (other): 4.87
SDR (vocals): 7.33
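As a quick sanity check on the figures above, the short sketch below compares the listed average with the arithmetic mean of the four per-source scores. That the page's "avg" is a plain mean over bass, drums, other, and vocals is an assumption; the listing does not say how it was computed. The variable names are illustrative.

```python
# Per-source SDR values copied from the benchmark listing.
sdr = {"bass": 5.63, "drums": 5.68, "other": 4.87, "vocals": 7.33}
reported_avg = 5.88

# Assumption: the reported average is the unweighted mean of the four sources.
mean_sdr = sum(sdr.values()) / len(sdr)
consistent = round(mean_sdr, 2) == reported_avg

print(f"mean of sources: {mean_sdr:.4f}, matches reported avg: {consistent}")
```

Under that assumption the numbers agree: the four scores sum to 23.51, and 23.51 / 4 = 5.8775, which rounds to the listed 5.88.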
