wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, Michael Auli

Abstract

We show for the first time that learning powerful representations from speech audio alone followed by fine-tuning on transcribed speech can outperform the best semi-supervised methods while being conceptually simpler. wav2vec 2.0 masks the speech input in the latent space and solves a contrastive task defined over a quantization of the latent representations which are jointly learned. Experiments using all labeled data of Librispeech achieve 1.8/3.3 WER on the clean/other test sets. When lowering the amount of labeled data to one hour, wav2vec 2.0 outperforms the previous state of the art on the 100 hour subset while using 100 times less labeled data. Using just ten minutes of labeled data and pre-training on 53k hours of unlabeled data still achieves 4.8/8.2 WER. This demonstrates the feasibility of speech recognition with limited amounts of labeled data.
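
The contrastive task mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the loss, so the code below assumes an InfoNCE-style objective: for a masked time step, the context vector must pick out the true quantized latent from a set of distractors, scored by cosine similarity and scaled by a temperature. The function names, the toy vectors, and the temperature value of 0.1 are illustrative assumptions, not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(context, target, distractors, temperature=0.1):
    """Negative log-probability of the true quantized latent among candidates.

    context     -- context vector at a masked time step (hypothetical input)
    target      -- quantized latent for that same time step
    distractors -- quantized latents sampled from other time steps
    temperature -- assumed scaling factor; not stated in the abstract
    """
    candidates = [target] + list(distractors)
    logits = [cosine_similarity(context, q) / temperature for q in candidates]
    # Numerically stable log-sum-exp over all candidates.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    # Softmax cross-entropy with the true target at index 0.
    return log_sum_exp - logits[0]


# Toy example: the loss is small when the context aligns with the true target
# and large when it aligns with a distractor instead.
context = [1.0, 0.0, 0.5]
good = contrastive_loss(context, [0.9, 0.1, 0.4], [[-1.0, 0.2, 0.0], [0.0, 1.0, -0.5]])
bad = contrastive_loss(context, [-1.0, 0.2, 0.0], [[0.9, 0.1, 0.4], [0.0, 1.0, -0.5]])
print(good < bad)
```

In a real model the context vectors would come from the network operating on masked latents and the targets from a learned quantizer; here both are hard-coded to keep the sketch self-contained.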

Code Repositories

| Repository | Framework | Listing |
| --- | --- | --- |
| neonbjb/ocotillo | pytorch | Mentioned in GitHub |
| vasudevgupta7/gsoc-wav2vec2 | tf | Mentioned in GitHub |
| pytorch/fairseq | pytorch | Official |
| facebookresearch/brainmagick | pytorch | Mentioned in GitHub |
| liutianlin0121/seislm | pytorch | Mentioned in GitHub |
| gatech-eic/s3-router | pytorch | Mentioned in GitHub |
| eastonYi/wav2vec | pytorch | Mentioned in GitHub |
| HarunoriKawano/Wav2vec2.0 | pytorch | Mentioned in GitHub |
| nlp-en-es/wav2vec2-spanish | jax | Mentioned in GitHub |
| phanxuanphucnd/Arizona-asr | pytorch | Mentioned in GitHub |
| AIdeaLab/wav2vec2_docker | pytorch | Mentioned in GitHub |
| mailong25/vietnamese-speech-recognition | pytorch | Mentioned in GitHub |
| huggingface/transformers | pytorch | Mentioned in GitHub |
| Arizona-Voice/Arizona-spotting | pytorch | Mentioned in GitHub |
| huseinzol05/malaya-speech | tf | Mentioned in GitHub |
| shivangi-aneja/FaceTalk | pytorch | Mentioned in GitHub |
| phanxuanphucnd/wav2asr | (not listed) | Mentioned in GitHub |
| BirgerMoell/tmh | pytorch | Mentioned in GitHub |
| JoungheeKim/Non-Attentive-Tacotron | pytorch | Mentioned in GitHub |
| sh-lee-prml/hierspeechpp | pytorch | Mentioned in GitHub |

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| speech-recognition-on-libri-light-test-clean | wav2vec 2.0 Large-10h-LV-60k | Word Error Rate (WER): 2.5 |
| speech-recognition-on-libri-light-test-other | wav2vec 2.0 Large-10h-LV-60k | Word Error Rate (WER): 5.0 |
| speech-recognition-on-librispeech-test-clean | wav2vec 2.0 with Libri-Light | Word Error Rate (WER): 1.8 |
| speech-recognition-on-librispeech-test-other | wav2vec 2.0 with Libri-Light | Word Error Rate (WER): 3.0 |
| speech-recognition-on-librispeech-test-other | wav2vec 2.0 | Word Error Rate (WER): 4.1 |
| speech-recognition-on-timit | wav2vec 2.0 | Percentage error: 8.3 |
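
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a system hypothesis, divided by the number of reference words. The sketch below shows that standard definition as a percentage; the function name and the example sentences are illustrative, and it applies no text normalization, so it will not necessarily reproduce the scoring pipeline behind the figures listed here.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate in percent, using whitespace tokenization."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over six
# reference words gives 2 / 6, or about 33.3 percent.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(round(wer, 1))
```

Because insertions count as errors, WER can exceed 100 percent when the hypothesis is much longer than the reference.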
