6 months ago

Arun Babu Changhan Wang Andros Tjandra Kushal Lakhotia Qiantong Xu Naman Goyal Kritika Singh Patrick von Platen Yatharth Saraf Juan Pino

Abstract

This paper presents XLS-R, a large-scale model for cross-lingual speech representation learning based on wav2vec 2.0. We train models with up to 2B parameters on nearly half a million hours of publicly available speech audio in 128 languages, an order of magnitude more public data than the largest known prior work. Our evaluation covers a wide range of tasks, domains, data regimes and languages, both high and low-resource. On the CoVoST-2 speech translation benchmark, we improve the previous state of the art by an average of 7.4 BLEU over 21 translation directions into English. For speech recognition, XLS-R improves over the best known prior work on BABEL, MLS, CommonVoice as well as VoxPopuli, lowering error rates by 14-34% relative on average. XLS-R also sets a new state of the art on VoxLingua107 language identification. Moreover, we show that with sufficient model size, cross-lingual pretraining can outperform English-only pretraining when translating English speech into other languages, a setting which favors monolingual pretraining. We hope XLS-R can help to improve speech processing tasks for many more languages of the world.
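The abstract reports error-rate improvements as relative reductions ("14-34% relative"). As a minimal sketch of how such a figure is conventionally computed, the helper below takes a baseline error rate and a new error rate. The function name `relative_reduction` and the input numbers are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_reduction(baseline_error: float, new_error: float) -> float:
    """Return the error-rate reduction as a fraction of the baseline error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Hypothetical word error rates (in percent), for illustration only.
baseline_wer = 20.0
new_wer = 15.0

# An absolute drop of 5 points from a baseline of 20 is a 25% relative reduction.
print(f"{relative_reduction(baseline_wer, new_wer):.0%} relative")
```

A relative reduction depends on the baseline: the same absolute drop of 5 points would be a 50% relative reduction from a baseline of 10.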

Audio and Speech Processing

XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale
