Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations

Aarne Talman; Antti Suni; Hande Celikkanat; Sofoklis Kakouros; Jörg Tiedemann; Martti Vainio

Abstract

In this paper, we introduce a new natural language processing dataset and benchmark for predicting prosodic prominence from written text. To our knowledge, this will be the largest publicly available dataset with prosodic labels. We describe the dataset construction and the resulting benchmark in detail, and we train a range of models, from feature-based classifiers to neural network systems, to predict discretized prosodic prominence. We show that pre-trained contextualized word representations from BERT outperform the other models even with less than 10% of the training data. Finally, we discuss the dataset in light of the results and outline future research and plans for improving both the dataset and the methods for predicting prosodic prominence from text. The dataset and the code for the models are publicly available.

Code Repositories

Helsinki-NLP/prosody (official, PyTorch)

Benchmarks

| Benchmark | Methodology | Accuracy |
|---|---|---|
| Prosody prediction on Helsinki Prosody Corpus | CRF (MarMoT) | 81.8 |
| Prosody prediction on Helsinki Prosody Corpus | BERT | 83.2 |
| Prosody prediction on Helsinki Prosody Corpus | SVM (Minitagger) | 80.8 |
| Prosody prediction on Helsinki Prosody Corpus | BiLSTM | 82.1 |
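
This page does not say how the accuracy figures are computed. As a minimal sketch, assuming accuracy means the share of tokens whose predicted discrete prominence label matches the gold label, the metric could be calculated as follows. The function name `token_accuracy`, the integer label values, and the toy data are illustrative assumptions and are not taken from the paper or its repository.

```python
from typing import Sequence


def token_accuracy(
    gold: Sequence[Sequence[int]],
    pred: Sequence[Sequence[int]],
) -> float:
    """Return the percentage of tokens whose predicted label equals the gold label.

    Each inner sequence holds one discrete prominence label per token of a
    sentence (hypothetical encoding: 0 = non-prominent, higher = more prominent).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")

    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("each sentence needs exactly one prediction per token")
        correct += sum(g == p for g, p in zip(gold_sent, pred_sent))
        total += len(gold_sent)

    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * correct / total


# Toy example (made-up labels): 8 tokens, 2 of them mislabeled.
gold = [[0, 1, 0, 2], [1, 0, 0, 0]]
pred = [[0, 1, 1, 2], [1, 0, 0, 2]]
score = token_accuracy(gold, pred)
print(f"{score:.1f}")  # prints 75.0
```

Pooling tokens across sentences, rather than averaging per-sentence scores, keeps long sentences from being underweighted; whether the benchmark does the same is not stated here.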
