Muqiao Yang, Martin Q. Ma, Dongyu Li, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov

Abstract
While deep learning has attracted a surge of interest across many fields in recent years, major deep learning models barely use complex numbers. However, speech, signal, and audio data are naturally complex-valued after a Fourier transform, and studies have shown that complex-valued networks offer a potentially richer representation. In this paper, we propose a Complex Transformer, which uses the transformer model as a backbone for sequence modeling; we also develop attention and encoder-decoder networks that operate on complex input. The model achieves state-of-the-art performance on the MusicNet dataset and an In-phase Quadrature (IQ) signal dataset.
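The abstract does not give the model's equations, so the following is only a minimal sketch of two general facts it relies on: a real signal becomes complex-valued after a Fourier transform, and a complex matrix product (the basic operation inside any complex-valued linear or attention layer) can be computed from four real matrix products via (A + iB)(C + iD) = (AC - BD) + i(AD + BC). The function names `dft` and `complex_matmul_via_real` are hypothetical and are not taken from the paper or its code.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def _matmul(x, y):
    """Plain real matrix product on nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def complex_matmul_via_real(a_re, a_im, b_re, b_im):
    """Compute (a_re + i*a_im) @ (b_re + i*b_im) using only real products.

    Illustrative assumption: this is the generic identity, not necessarily
    the exact formulation used by the Complex Transformer.
    """
    ac, bd = _matmul(a_re, b_re), _matmul(a_im, b_im)
    ad, bc = _matmul(a_re, b_im), _matmul(a_im, b_re)
    real = [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(ac, bd)]
    imag = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(ad, bc)]
    return real, imag


# A real-valued toy signal: its spectrum has non-zero imaginary parts.
spectrum = dft([0.0, 1.0, 0.0, -1.0])

# (1 + 2i) * (3 + 4i) = -5 + 10i, computed with real arithmetic only.
real_part, imag_part = complex_matmul_via_real([[1.0]], [[2.0]], [[3.0]], [[4.0]])
```

Keeping the real and imaginary parts as separate real arrays is one common way to run complex-valued layers on frameworks whose kernels are real-valued; whether the paper does exactly this is not stated in the abstract.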
Benchmarks
| Benchmark | Methodology | APS | Number of params |
|---|---|---|---|
| music-transcription-on-musicnet | Complex Transformer | 74.22 | 11.61M |
| music-transcription-on-musicnet | Concatenated Transformer | 71.3 | 9.79M |