Language Modelling on Penn Treebank (Word Level)

Evaluation Metrics

- Params
- Test perplexity
- Validation perplexity
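Both perplexity columns report the standard word-level metric: the exponential of the mean per-token negative log-likelihood (in nats) on the held-out set. A minimal sketch of the computation follows; the `nlls` values are hypothetical illustrations, not outputs of any model in the table:

```python
import math

def perplexity(token_nlls):
    """Word-level perplexity: exp of the mean per-token negative
    log-likelihood (natural log), matching the Test/Validation
    perplexity columns of this leaderboard."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Hypothetical per-token NLLs a model might assign to held-out PTB
# tokens; in practice these come from the model's softmax outputs.
nlls = [4.12, 3.87, 4.55, 4.01, 3.96]
print(f"perplexity: {perplexity(nlls):.2f}")  # mean NLL ~= 4.10 -> ~60.5
```

Lower is better: a perplexity of 60 means the model is, on average, as uncertain as a uniform choice over 60 words at each position.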

Evaluation Results

Performance of each model on this benchmark:

| Model | Params | Test perplexity | Validation perplexity | Paper Title |
| --- | --- | --- | --- | --- |
| TCN | 14.7M | 108.47 | - | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling |
| Seq-U-Net | 14.9M | 107.95 | - | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling |
| GRU (Bai et al., 2018) | - | 92.48 | - | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| R-Transformer | - | 84.38 | - | R-Transformer: Recurrent Neural Network Enhanced Transformer |
| Zaremba et al. (2014) - LSTM (medium) | - | 82.7 | 86.2 | Recurrent Neural Network Regularization |
| Gal & Ghahramani (2016) - Variational LSTM (medium) | - | 79.7 | 81.9 | A Theoretically Grounded Application of Dropout in Recurrent Neural Networks |
| LSTM (Bai et al., 2018) | - | 78.93 | - | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| Zaremba et al. (2014) - LSTM (large) | - | 78.4 | 82.2 | Recurrent Neural Network Regularization |
| Gal & Ghahramani (2016) - Variational LSTM (large) | - | 75.2 | 77.9 | A Theoretically Grounded Application of Dropout in Recurrent Neural Networks |
| Inan et al. (2016) - Variational RHN | - | 66.0 | 68.1 | Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling |
| Recurrent highway networks | 23M | 65.4 | 67.9 | Recurrent Highway Networks |
| NAS-RL | 25M | 64.0 | - | Neural Architecture Search with Reinforcement Learning |
| Efficient NAS | 24M | 58.6 | 60.8 | Efficient Neural Architecture Search via Parameter Sharing |
| AWD-LSTM | 24M | 57.3 | 60.0 | Regularizing and Optimizing LSTM Language Models |
| DEQ-TrellisNet | 24M | 57.1 | - | Deep Equilibrium Models |
| AWD-LSTM 3-layer with Fraternal dropout | 24M | 56.8 | 58.9 | Fraternal Dropout |
| Dense IndRNN | - | 56.37 | - | Deep Independently Recurrent Neural Network (IndRNN) |
| Differentiable NAS | 23M | 56.1 | 58.3 | DARTS: Differentiable Architecture Search |
| AWD-LSTM-DRILL | 24M | 55.7 | 58.2 | Deep Residual Output Layers for Neural Language Generation |
| 2-layer skip-LSTM + dropout tuning | 24M | 55.3 | 57.1 | Pushing the bounds of dropout |