5 months ago

AERO: Audio Super Resolution in the Spectral Domain

Moshe Mandel, Or Tal, Yossi Adi


Abstract

We present AERO, an audio super-resolution model that processes speech and music signals in the spectral domain. AERO is based on an encoder-decoder architecture with U-Net-like skip connections. We optimize the model using both time and frequency domain loss functions. Specifically, we consider a set of reconstruction losses together with perceptual ones in the form of adversarial and feature discriminator loss functions. To better handle phase information, the proposed method operates over the complex-valued spectrogram using two separate channels. Unlike prior work, which mainly considers low and high frequency concatenation for audio super-resolution, the proposed method directly predicts the full frequency range. We demonstrate high performance across a wide range of sample rates considering both speech and music. AERO outperforms the evaluated baselines considering Log-Spectral Distance, ViSQOL, and the subjective MUSHRA test. Audio samples and code are available at https://pages.cs.huji.ac.il/adiyoss-lab/aero
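The abstract's phrase "complex-valued spectrogram using two separate channels" can be illustrated with a small sketch. This is not the authors' implementation: the naive DFT, the Hann window, the frame and hop sizes, and every function name below are assumptions chosen only to keep the example self-contained. It shows one plausible reading, in which the real and imaginary parts are stored as two real-valued channels so that phase is preserved.

```python
import cmath
import math


def dft_frame(frame):
    """Naive DFT of one real-valued frame; returns bins 0..N/2 as complex numbers."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def stft(signal, frame_len=8, hop=4):
    """Hann-windowed short-time Fourier transform (illustrative sizes, not the paper's)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len) for t in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + t] * window[t] for t in range(frame_len)]
        frames.append(dft_frame(chunk))
    return frames  # indexed as [frame][bin], complex values


def to_two_channels(spec):
    """Split a complex spectrogram into [real, imag] real-valued channels."""
    real = [[z.real for z in frame] for frame in spec]
    imag = [[z.imag for z in frame] for frame in spec]
    return [real, imag]


def from_two_channels(channels):
    """Recombine the two real-valued channels into a complex spectrogram."""
    real, imag = channels
    return [[complex(r, i) for r, i in zip(fr, fi)] for fr, fi in zip(real, imag)]


# Toy input: a short sine wave (hypothetical data, for illustration only).
signal = [math.sin(2 * math.pi * 2 * t / 16) for t in range(32)]
spec = stft(signal)
channels = to_two_channels(spec)
restored = from_two_channels(channels)
```

Because both channels are kept, the round trip through `from_two_channels` recovers the complex spectrogram exactly, which a magnitude-only representation would not allow.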

Code Repositories

slp-rl/aero (official, PyTorch)

Benchmarks

Benchmark                      Methodology   Metrics
bandwidth-extension-on-vctk    AERO          LSD: 0.77
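For readers unfamiliar with the LSD metric reported above, the following is a minimal sketch of one common definition of Log-Spectral Distance: the root-mean-square difference of base-10 log power spectra over frequency bins, averaged over frames. The exact variant used for the benchmark is not stated on this page, so the formula, the `eps` stabilizer, and the function name are assumptions. Lower values indicate a closer match to the reference.

```python
import math


def log_spectral_distance(ref_power, est_power, eps=1e-10):
    """Mean over frames of the RMS log10 power difference over frequency bins.

    ref_power, est_power: power spectrograms (|STFT|^2) indexed as [frame][bin].
    eps is an assumed small constant that avoids taking log10 of zero.
    """
    total = 0.0
    for ref_frame, est_frame in zip(ref_power, est_power):
        squared_diffs = [
            (math.log10(r + eps) - math.log10(e + eps)) ** 2
            for r, e in zip(ref_frame, est_frame)
        ]
        total += math.sqrt(sum(squared_diffs) / len(squared_diffs))
    return total / len(ref_power)


# Hypothetical toy spectrograms: two frames, three frequency bins each.
ref = [[1.0, 10.0, 100.0], [1.0, 1.0, 1.0]]
est = [[10.0, 100.0, 1000.0], [1.0, 1.0, 1.0]]

identical = log_spectral_distance(ref, ref)
shifted = log_spectral_distance(ref, est)
```

In the toy data, the first frame of `est` is ten times the reference power in every bin and the second frame matches exactly, so the per-frame distances are about 1 and 0 respectively.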
