Speech Recognition on Switchboard Hub5'00
Evaluation metric
Percentage error (word error rate, in %)
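The figures below are word error rates (WER) in percent, i.e. the word-level edit distance between the system output and the reference transcript:

$$\mathrm{WER} = \frac{S + D + I}{N} \times 100$$

where $S$, $D$ and $I$ are the numbers of substituted, deleted and inserted words, and $N$ is the number of words in the reference.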
Evaluation results
Results of each model on this benchmark (lower is better).
| Model | Percentage error | Paper Title | Repository |
|---|---|---|---|
| Deep Speech | 20 | Deep Speech: Scaling up end-to-end speech recognition | - |
| DNN-HMM | 18.5 | - | - |
| CD-DNN | 16.1 | - | - |
| DNN | 16 | Building DNN Acoustic Models for Large Vocabulary Speech Recognition | - |
| DNN + Dropout | 15 | Building DNN Acoustic Models for Large Vocabulary Speech Recognition | - |
| DNN MMI | 12.9 | - | - |
| HMM-TDNN + pNorm + speed up/down speech | 12.9 | - | - |
| DNN MPE | 12.9 | - | - |
| DNN BMMI | 12.9 | - | - |
| Deep Speech + FSH | 12.6 | Deep Speech: Scaling up end-to-end speech recognition | - |
| HMM-DNN +sMBR | 12.6 | - | - |
| CNN + Bi-RNN + CTC (speech to letters), 25.9% WER if trained only on SWB | 12.6 | Deep Speech: Scaling up end-to-end speech recognition | - |
| DNN sMBR | 12.6 | - | - |
| Deep CNN (10 conv, 4 FC layers), multi-scale feature maps | 12.2 | Very Deep Multilingual Convolutional Neural Networks for LVCSR | - |
| CNN | 11.5 | - | - |
| HMM-TDNN + iVectors | 11 | - | - |
| CNN on MFSC/fbanks + 1 non-conv layer for FMLLR/I-Vectors concatenated in a DNN | 10.4 | - | - |
| HMM-TDNN trained with MMI + data augmentation (speed) + iVectors + 3 regularizations + Fisher (10% / 15.1% respectively trained on SWBD only) | 9.2 | - | - |
| HMM-BLSTM trained with MMI + data augmentation (speed) + iVectors + 3 regularizations + Fisher | 8.5 | - | - |
| IBM 2015 | 8.0 | The IBM 2015 English Conversational Telephone Speech Recognition System | - |
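For reference, here is a minimal sketch of how a word-error-rate score can be computed from a reference/hypothesis transcript pair. The `wer` helper below is a toy word-level edit-distance implementation for illustration only; the official Hub5'00 numbers are produced with the NIST scoring tools, not with this snippet.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length, in %."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = word-level edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return 100.0 * d[len(ref)][len(hyp)] / len(ref)

# Example: one substituted word out of four reference words -> 25.0% WER
print(wer("the cat sat down", "the cat sat town"))
```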