Speech Recognition on WSJ eval92

Evaluation Metric

Word Error Rate (WER)
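
WER is conventionally defined as the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words: WER = (S + D + I) / N, where S, D, and I count substitutions, deletions, and insertions. The sketch below illustrates that definition only. The function name `word_error_rate` is hypothetical, and it assumes plain whitespace tokenization with no text normalization, which may differ from the scoring setup used for the results on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"{100 * wer:.1f}%")  # prints 16.7%
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.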

Results

Performance of each model on this benchmark

| Model | WER (%) | Paper Title | Repository |
|---|---|---|---|
| Jasper 10x3 | 6.9 | Jasper: An End-to-End Convolutional Neural Acoustic Model | |
| CNN over RAW speech (wav) | 5.6 | - | - |
| CTC-CRF 4gram-LM | 3.79 | CRF-based Single-stage Acoustic Modeling with CTC Topology | - |
| test-set on open vocabulary (i.e. harder), model = HMM-DNN + pNorm* | 3.6 | - | - |
| Deep Speech 2 | 3.60 | Deep Speech 2: End-to-End Speech Recognition in English and Mandarin | |
| TC-DNN-BLSTM-DNN | 3.5 | Deep Recurrent Neural Networks for Acoustic Modelling | - |
| Convolutional Speech Recognition | 3.5 | Fully Convolutional Speech Recognition | - |
| Espresso | 3.4 | Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | |
| CTC-CRF VGG-BLSTM | 3.2 | CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency | |
| Transformer with Relaxed Attention | 3.19 | Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition | |
| End-to-end LF-MMI | 3.0 | End-to-end speech recognition using lattice-free MMI | - |
| CTC-CRF ST-NAS | 2.77 | Efficient Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients | |
| tdnn + chain | 2.32 | Purely sequence-trained neural networks for ASR based on lattice-free MMI | - |
| RobustGER | 2.2 | It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition | |
| Task activating prompting generative correction | 2.11 | Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting | - |
| ConformerXXL-P | 1.3 | BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition | - |
| Speechstew 100M | 1.3 | SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network | - |