Language Modelling On Wikitext 2

Evaluation Metrics

Number of params
Test perplexity
Validation perplexity
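Perplexity, the metric used throughout the table below, is the exponential of the average per-token negative log-likelihood: lower is better, and a perplexity of k roughly means the model is as uncertain as a uniform choice over k tokens. A minimal sketch (the function name and example values are illustrative, not from the benchmark):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the mean negative log-likelihood per token.

    token_log_probs: natural-log probabilities the model assigned
    to each token in the evaluation corpus.
    """
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Sanity check: a model that assigns uniform probability 1/4 to every
# token over a 4-word vocabulary has perplexity exactly 4.
uniform_lp = [math.log(0.25)] * 10
print(round(perplexity(uniform_lp), 2))  # → 4.0
```

Note that Wikitext-2 results are conventionally reported as word-level perplexity, so scores are not directly comparable to perplexities computed over subword tokenizations.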

Evaluation Results

Performance of each model on this benchmark:

| Model | Params | Test perplexity | Validation perplexity | Paper Title |
| --- | --- | --- | --- | --- |
| OPT-175B (50% Sparsity) | - | 234.77 | - | SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot |
| Grave et al. (2016) - LSTM | - | 99.3 | - | Improving Neural Language Models with a Continuous Cache |
| Inan et al. (2016) - Variational LSTM (tied) (h=650) | - | 87.7 | 92.3 | Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling |
| Inan et al. (2016) - Variational LSTM (tied) (h=650) + augmented loss | - | 87.0 | 91.5 | Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling |
| EGRU | - | 68.9 | - | Efficient recurrent architectures through activity sparsity and sparse back-propagation through time |
| Grave et al. (2016) - LSTM + continuous cache pointer | - | 68.9 | - | Improving Neural Language Models with a Continuous Cache |
| Melis et al. (2017) - 1-layer LSTM (tied) | 24M | 65.9 | 69.3 | On the State of the Art of Evaluation in Neural Language Models |
| AWD-LSTM | 33M | 65.8 | 68.6 | Regularizing and Optimizing LSTM Language Models |
| AWD-LSTM + ATOI | 33M | 64.73 | 67.47 | Alleviating Sequence Information Loss with Data Overlapping and Prime Batch Sizes |
| AWD-LSTM 3-layer with Fraternal dropout | 34M | 64.1 | 66.8 | Fraternal Dropout |
| AWD-LSTM-DRILL | 34M | 61.9 | 64.9 | Deep Residual Output Layers for Neural Language Generation |
| AWD-FWM Schlag et al. (2020) | 37M | 61.65 | 54.48 | Learning Associative Inference Using Fast Weight Memory |
| AWD-LSTM-MoS | 35M | 61.45 | 63.88 | Breaking the Softmax Bottleneck: A High-Rank RNN Language Model |
| AWD-LSTM-MoS + Partial Shuffle | 35M | 59.98 | 62.38 | Partially Shuffling the Training Data to Improve Language Models |
| AWD-LSTM-DOC | 37M | 58.03 | 60.29 | Direct Output Connection for a High-Rank Language Model |
| AWD-LSTM-DOC + Partial Shuffle | 37M | 57.85 | 60.16 | Partially Shuffling the Training Data to Improve Language Models |
| Mogrifier LSTM | 35M | 55.1 | 57.3 | Mogrifier LSTM |
| Ensemble of All | - | 53.73 | 55.4 | Advancing State of the Art in Language Modeling |
| AWD-LSTM-DOC x5 | 185M | 53.09 | 54.19 | Direct Output Connection for a High-Rank Language Model |
| AWD-LSTM + continuous cache pointer | 33M | 52.0 | 53.8 | Regularizing and Optimizing LSTM Language Models |