Automatic Piano Transcription with Hierarchical Frequency-Time Transformer

Keisuke Toyama; Taketo Akama; Yukara Ikemiya; Yuhta Takida; Wei-Hsiang Liao; Yuki Mitsufuji

Abstract

Taking long-term spectral and temporal dependencies into account is essential for automatic piano transcription, especially when determining the precise onset and offset of each note in polyphonic piano content. The self-attention mechanism in Transformers can capture these long-term dependencies along both the frequency and time axes. In this work, we propose hFT-Transformer, an automatic music transcription method that uses a two-level hierarchical frequency-time Transformer architecture. The first hierarchy includes a convolutional block along the time axis, a Transformer encoder along the frequency axis, and a Transformer decoder that converts the dimension of the frequency axis. Its output is then fed into the second hierarchy, which consists of another Transformer encoder along the time axis. We evaluated our method on the widely used MAPS and MAESTRO v3.0.0 datasets, where it achieved state-of-the-art F1-scores on all four metrics: Frame, Note, Note with Offset, and Note with Offset and Velocity.
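
The two-level structure described in the abstract can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the authors' implementation (see sony/hft-transformer for that). The class name, the input layout, all layer sizes, the use of learned per-pitch queries as the decoder input, and the single output head are assumptions made only for this example.

```python
import torch
import torch.nn as nn


class HierarchicalFreqTimeSketch(nn.Module):
    """Illustrative two-level frequency-time model (hypothetical, simplified)."""

    def __init__(self, n_bins=64, n_pitches=88, margin=4, d_model=32, n_heads=4):
        super().__init__()
        window = 2 * margin + 1  # assumed: each frame carries `margin` neighbours per side
        # First hierarchy (a): convolution along the time axis. The kernel spans
        # the whole local window, so each frequency bin becomes one d_model vector.
        self.time_conv = nn.Conv1d(1, d_model, kernel_size=window)
        self.freq_pos = nn.Parameter(torch.zeros(n_bins, d_model))
        # First hierarchy (b): self-attention across frequency bins.
        enc = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model, batch_first=True)
        self.freq_encoder = nn.TransformerEncoder(enc, num_layers=1)
        # First hierarchy (c): decoder that maps n_bins positions to n_pitches
        # positions. Learned per-pitch queries are an assumption of this sketch.
        self.pitch_queries = nn.Parameter(torch.randn(n_pitches, d_model))
        dec = nn.TransformerDecoderLayer(d_model, n_heads, 4 * d_model, batch_first=True)
        self.freq_decoder = nn.TransformerDecoder(dec, num_layers=1)
        # Second hierarchy: self-attention across time frames, per pitch.
        # (A time positional encoding is omitted here for brevity.)
        tim = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model, batch_first=True)
        self.time_encoder = nn.TransformerEncoder(tim, num_layers=1)
        self.head = nn.Linear(d_model, 1)  # one logit per (frame, pitch)

    def forward(self, spec):
        # spec: (batch, frames, n_bins, window)
        b, t, f, w = spec.shape
        x = self.time_conv(spec.reshape(b * t * f, 1, w))      # (b*t*f, d_model, 1)
        x = x.squeeze(-1).reshape(b * t, f, -1) + self.freq_pos
        memory = self.freq_encoder(x)                          # (b*t, n_bins, d_model)
        queries = self.pitch_queries.unsqueeze(0).expand(b * t, -1, -1)
        y = self.freq_decoder(queries, memory)                 # (b*t, n_pitches, d_model)
        p = y.shape[1]
        # Regroup so that the sequence axis is time, one sequence per pitch.
        y = y.reshape(b, t, p, -1).permute(0, 2, 1, 3).reshape(b * p, t, -1)
        y = self.time_encoder(y)                               # (b*p, frames, d_model)
        logits = self.head(y).squeeze(-1).reshape(b, p, t)
        return logits.permute(0, 2, 1)                         # (batch, frames, n_pitches)


model = HierarchicalFreqTimeSketch()
logits = model(torch.randn(2, 5, 64, 9))  # 2 clips, 5 frames, 64 bins, window of 9
```

The sketch keeps the frequency axis and the time axis in separate attention stages, which is the point of the hierarchy: attention cost grows with the number of bins and the number of frames separately rather than with their product.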

Code Repositories

sony/hft-transformer (official, PyTorch)

Benchmarks

Benchmark                        Methodology      Metrics
music-transcription-on-maestro   hFT-Transformer  Onset F1: 97.44
music-transcription-on-maps      hFT-Transformer  Onset F1: 85.14
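
As a reminder of what an onset F1-score measures, here is a small pure-Python sketch. It assumes the common convention that an estimated note counts as correct when its pitch matches a reference note and its onset lies within 50 ms of it. The function name, the greedy one-to-one matching, and the toy notes are assumptions for illustration; the listing above does not specify the evaluation code, and the reported values are percentages (this function returns a fraction).

```python
def onset_f1(reference, estimated, tolerance=0.05):
    """F1 over notes given as (midi_pitch, onset_seconds) pairs.

    Greedy one-to-one matching: a simplification chosen for this example.
    """
    unmatched = list(reference)
    true_positives = 0
    for pitch, onset in sorted(estimated, key=lambda note: note[1]):
        # Reference notes with the same pitch whose onset is close enough.
        candidates = [
            ref for ref in unmatched
            if ref[0] == pitch and abs(ref[1] - onset) <= tolerance
        ]
        if candidates:
            closest = min(candidates, key=lambda ref: abs(ref[1] - onset))
            unmatched.remove(closest)  # each reference note is used at most once
            true_positives += 1
    precision = true_positives / len(estimated) if estimated else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data: the second estimate is 80 ms late and the fourth is a spurious note.
reference = [(60, 0.00), (64, 0.50), (67, 1.00)]
estimated = [(60, 0.02), (64, 0.58), (67, 1.01), (72, 1.50)]
score = onset_f1(reference, estimated)  # precision 2/4, recall 2/3, F1 = 4/7
```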
