Sanjeev Khudanpur, Daniel Povey, Hossein Sameti, Hossein Hadian

Abstract
We present our work on end-to-end training of acoustic models using the lattice-free maximum mutual information (LF-MMI) objective function in the context of hidden Markov models. By end-to-end training, we mean flat-start training of a single DNN in one stage without using any previously trained models, forced alignments, or building state-tying decision trees. We use full biphones to enable context-dependent modeling without trees, and show that our end-to-end LF-MMI approach can achieve comparable results to regular LF-MMI on well-known large vocabulary tasks. We also compare with other end-to-end methods such as CTC in character-based and lexicon-free settings and show 5 to 25 percent relative reduction in word error rates on different large vocabulary tasks while using significantly smaller models.
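The abstract reports improvements as relative reductions in word error rate. As a minimal sketch of how such a figure is usually computed, the snippet below assumes the common definition (baseline WER minus new WER, divided by baseline WER). The function name `relative_wer_reduction` and the input values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both inputs are word error rates in percent (e.g. 10.0 means 10% WER).
    Assumes the usual definition: 100 * (baseline - new) / baseline.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers for illustration only (not taken from the paper):
# a baseline system at 10.0% WER and a new system at 8.0% WER.
reduction = relative_wer_reduction(10.0, 8.0)
print(f"Relative WER reduction: {reduction:.1f}%")
```

Note that a relative reduction differs from an absolute one: in the hypothetical case above the absolute gain is 2.0 WER points, while the relative gain is measured against the baseline.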
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| speech-recognition-on-switchboard-300hr | End-to-end LF-MMI | Word Error Rate (WER): 9.3 |
| speech-recognition-on-wsj-eval92 | End-to-end LF-MMI | Word Error Rate (WER): 3.0 |