3 months ago

A Data-Centric Framework for Addressing Phonetic and Prosodic Challenges in Russian Speech Generative Models

Kirill Borodin, Nikita Vasiliev, Vasiliy Kudryavtsev, Maxim Maslov, Mikhail Gorodnichev, Oleg Rogov, Grach Mkrtchian

Abstract

Russian speech synthesis presents distinctive challenges, including vowel reduction, consonant devoicing, variable stress patterns, homograph ambiguity, and unnatural intonation. This paper introduces Balalaika, a novel dataset comprising more than 2,000 hours of studio-quality Russian speech with comprehensive textual annotations, including punctuation and stress markings. Experimental results show that models trained on Balalaika significantly outperform those trained on existing datasets in both speech synthesis and enhancement tasks. We detail the dataset construction pipeline, annotation methodology, and results of comparative evaluations.
