WaveNet: A Generative Model for Raw Audio

Aaron van den Oord; Sander Dieleman; Heiga Zen; Karen Simonyan; Oriol Vinyals; Alex Graves; Nal Kalchbrenner; Andrew Senior; Koray Kavukcuoglu

Abstract

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms. The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones; nonetheless we show that it can be efficiently trained on data with tens of thousands of samples per second of audio. When applied to text-to-speech, it yields state-of-the-art performance, with human listeners rating it as significantly more natural sounding than the best parametric and concatenative systems for both English and Mandarin. A single WaveNet can capture the characteristics of many different speakers with equal fidelity, and can switch between them by conditioning on the speaker identity. When trained to model music, we find that it generates novel and often highly realistic musical fragments. We also show that it can be employed as a discriminative model, returning promising results for phoneme recognition.
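The autoregressive property described in the abstract means the joint probability of a waveform factorises into a product of per-sample conditionals, each depending on all earlier samples. The sketch below illustrates only that sampling and scoring loop. It is not the WaveNet network: `toy_predictive_distribution` is a hypothetical stand-in for the model, and the choice of 256 discrete amplitude levels is an assumption made for illustration.

```python
import math
import random

NUM_LEVELS = 256  # assumption: each audio sample is one of 256 discrete levels


def toy_predictive_distribution(history):
    """Hypothetical stand-in for the network: returns p(x_t | x_1, ..., x_{t-1}).

    A real model would compute this from the full history; this toy version
    only favours levels close to the most recent sample.
    """
    centre = history[-1] if history else NUM_LEVELS // 2
    weights = [math.exp(-abs(level - centre) / 4.0) for level in range(NUM_LEVELS)]
    total = sum(weights)
    return [w / total for w in weights]


def generate(num_samples, seed=0):
    """Draw samples one at a time, feeding each one back in as context."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        probs = toy_predictive_distribution(samples)
        next_sample = rng.choices(range(NUM_LEVELS), weights=probs, k=1)[0]
        samples.append(next_sample)
    return samples


def log_likelihood(samples):
    """Sum of log p(x_t | x_<t): the log of the factorised joint probability."""
    total = 0.0
    for t, x_t in enumerate(samples):
        probs = toy_predictive_distribution(samples[:t])
        total += math.log(probs[x_t])
    return total


if __name__ == "__main__":
    waveform = generate(16)
    print(waveform)
    print(log_likelihood(waveform))
```

Generation is inherently sequential because each new sample depends on the ones before it. Scoring an existing waveform, as in `log_likelihood`, needs no sampling, since every conditional can be evaluated from the known history.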

Code Repositories

outofculture/talk-like-me
pytorch
Mentioned in GitHub
ShuSQ/CCI_AP_PoseLoops
tf
Mentioned in GitHub
awslabs/gluon-ts
mxnet
Mentioned in GitHub
ZTianle/keras-tcn-solar
tf
Mentioned in GitHub
Talk2Levi/DJL
tf
Mentioned in GitHub
zll1996/TCN
tf
Mentioned in GitHub
zhong110020/keras-tcn
tf
Mentioned in GitHub
MSRDL/Deep4Cast
pytorch
Mentioned in GitHub
karpathy/makemore
pytorch
Mentioned in GitHub
peustr/wavenet
Mentioned in GitHub
r9y9/wavenet
Mentioned in GitHub
yebiny/DepthOfAnaesthesia_eeg
tf
Mentioned in GitHub
HaiFengZeng/clari_wavenet_vocoder
pytorch
Mentioned in GitHub
zhong110020/Tensorflow-TCN
tf
Mentioned in GitHub
PhilippeNguyen/keras_wavenet
tf
Mentioned in GitHub
ShotDownDiane/tcn-master
tf
Mentioned in GitHub
isadrtdinov/wavenet
pytorch
Mentioned in GitHub
AI-Huang/WaveNet
pytorch
Mentioned in GitHub
RamsteinWR/wavenet-master
tf
Mentioned in GitHub
Baichenjia/Tensorflow-TCN
tf
Mentioned in GitHub
albarji/neurowriter
tf
Mentioned in GitHub
TanUkkii007/wavenet
tf
Mentioned in GitHub
vicky-hnk/time-flex
pytorch
Mentioned in GitHub
otosense/slang
Mentioned in GitHub
thorwhalen/sla
Mentioned in GitHub
ashishpatel26/tcn-keras-Examples
pytorch
Mentioned in GitHub
imdatsolak/wavenet
tf
Mentioned in GitHub
Shivendra-psc/speechbot
tf
Mentioned in GitHub
benmoseley/simple-wavenet
tf
Mentioned in GitHub
Chasm4359/ProTS
pytorch
Mentioned in GitHub
pbrandl/aNN_Audio
pytorch
Mentioned in GitHub
rampage644/wavenet
tf
Mentioned in GitHub
ShichengChen/WaveNetSeparateAudio
pytorch
Mentioned in GitHub
Gal1eo/DT2119
pytorch
Mentioned in GitHub
PeihaoChen/regnet
pytorch
Mentioned in GitHub
liguigui/speech-to-text-wavenet
tf
Mentioned in GitHub
thorwhalen/slang
Mentioned in GitHub
Vikas-Sony/speech-to-text
tf
Mentioned in GitHub
vincentherrmann/pytorch-wavenet
pytorch
Mentioned in GitHub
swasun/VQ-VAE-Speech
pytorch
Mentioned in GitHub
basveeling/wavenet
tf
Mentioned in GitHub
coreyoconnor/tensorderp
tf
Mentioned in GitHub
glakshay/Generating-audio-DL
tf
Mentioned in GitHub
WLM1ke/poptimizer
pytorch
Mentioned in GitHub
anandharaju/Basic_TCN
tf
Mentioned in GitHub

Benchmarks

| Benchmark | Methodology | Mean Opinion Score |
| --- | --- | --- |
| speech-synthesis-on-mandarin-chinese | LSTM-RNN parametric | 3.79 |
| speech-synthesis-on-mandarin-chinese | HMM-driven concatenative | 3.47 |
| speech-synthesis-on-mandarin-chinese | WaveNet (L+F) | 4.08 |
| speech-synthesis-on-north-american-english | LSTM-RNN parametric | 3.67 |
| speech-synthesis-on-north-american-english | WaveNet (L+F) | 4.21 |
| speech-synthesis-on-north-american-english | HMM-driven concatenative | 3.86 |
