5 months ago

Scoring Time Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription

Yujia Yan; Zhiyao Duan

Abstract

The neural semi-Markov Conditional Random Field (semi-CRF) framework has demonstrated promise for event-based piano transcription. In this framework, all events (notes or pedals) are represented as closed time intervals tied to specific event types. The neural semi-CRF approach requires an interval scoring matrix that assigns a score to every candidate interval. However, designing an efficient and expressive architecture for scoring intervals is not trivial. This paper introduces a simple method for scoring intervals using scaled inner product operations that resemble attention scoring in transformers. We show theoretically that, owing to the special structure that arises from encoding non-overlapping intervals, under a mild condition, the inner product operations are expressive enough to represent an ideal scoring matrix that can yield the correct transcription result. We then demonstrate that a non-hierarchical, encoder-only transformer backbone, operating only on a low-time-resolution feature map, is capable of transcribing piano notes and pedals with high accuracy and time precision. Experiments show that our approach achieves new state-of-the-art performance across all subtasks, measured by F1, on the Maestro dataset.
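The interval scoring idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `interval_scores` and `decode_intervals`, the hand-picked toy embeddings, and the decoding rule are all assumptions made for illustration. The score of a closed interval [i, j] is taken to be the scaled inner product of a "start" vector for frame i and an "end" vector for frame j, and decoding is simplified to a weighted-interval-scheduling dynamic program in which intervals may not share a frame and skipped frames score zero, rather than full semi-CRF inference.

```python
import math

def interval_scores(queries, keys):
    """Return S with S[i][j] = <queries[i], keys[j]> / sqrt(d) for i <= j.

    Entries with i > j are left as None because [i, j] is not a valid interval.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    n = len(queries)
    scores = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            scores[i][j] = scale * sum(q * k for q, k in zip(queries[i], keys[j]))
    return scores

def decode_intervals(scores):
    """Pick non-overlapping closed intervals with maximum total score.

    Simplifying assumption: a frame not covered by any interval contributes 0.
    """
    n = len(scores)
    best = [0.0] * (n + 1)   # best[t]: best total over frames 0..t-1
    back = [None] * (n + 1)  # back[t]: start of an interval ending at t-1, if any
    for t in range(1, n + 1):
        best[t] = best[t - 1]            # option 1: leave frame t-1 uncovered
        for i in range(t):               # option 2: end interval [i, t-1] here
            cand = best[i] + scores[i][t - 1]
            if cand > best[t]:
                best[t], back[t] = cand, i
    intervals, t = [], n
    while t > 0:                         # backtrack from the last frame
        if back[t] is None:
            t -= 1
        else:
            intervals.append((back[t], t - 1))
            t = back[t]
    intervals.reverse()
    return best[n], intervals

# Toy per-frame embeddings (hypothetical values, 4 frames, dimension 2).
queries = [[2, 0], [0, 0], [0, 2], [0, 0]]    # "interval starts here" vectors
keys = [[-1, 0], [2, 0], [0, -1], [0, 2]]     # "interval ends here" vectors

scores = interval_scores(queries, keys)
total, intervals = decode_intervals(scores)
print(intervals)  # [(0, 1), (2, 3)]
```

With these toy vectors, only the intervals [0, 1] and [2, 3] receive a positive score, so the decoder selects exactly those two. In a trained model the start and end vectors would come from the transformer's output at each frame, one scoring matrix per event type.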

Code Repositories

yujia-yan/skipping-the-frame-level (PyTorch; mentioned in GitHub)
yujia-yan/transkun (official; PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
music-transcription-on-maestro | Transkun V2 (SemiCRF) | Onset F1: 98.32
music-transcription-on-maps | Transkun V2 (SemiCRF) | Onset F1: 86.1
music-transcription-on-maps | Transkun V2 (SemiCRF) with Data Augmentation | Onset F1: 90.38
music-transcription-on-smd-piano | Transkun V2 (SemiCRF) with Data Augmentation | Onset F1: 98.71
