Language Modelling On Hutter Prize

Evaluation Metrics

Bits per Character (BPC)
Number of params
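BPC is the average number of bits a model needs to encode one character of the test text, i.e. its per-character negative log-likelihood expressed in base 2. As a minimal sketch (the function name and example values are illustrative, not from the benchmark), converting a summed negative log-likelihood in nats into BPC looks like:

```python
import math

def bits_per_character(total_nll_nats: float, num_chars: int) -> float:
    """Convert a summed negative log-likelihood (in nats) over a
    character sequence into bits per character (BPC)."""
    return total_nll_nats / (num_chars * math.log(2))

# A model that assigns every character probability 0.5 incurs
# ln(2) nats per character, which is exactly 1 BPC.
nll = 1000 * math.log(2)  # summed NLL over 1000 characters
print(bits_per_character(nll, 1000))  # → 1.0
```

Lower BPC is better; it is related to character-level perplexity by perplexity = 2^BPC.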

Evaluation Results

Performance of each model on this benchmark:

| Model | BPC | Number of params | Paper Title |
| --- | --- | --- | --- |
| RHN - depth 5 [zilly2016recurrent] | 1.31 | - | Recurrent Highway Networks |
| FS-LSTM-4 | 1.277 | 27M | Fast-Slow Recurrent Neural Networks |
| Large RHN | 1.27 | 46M | Recurrent Highway Networks |
| Large FS-LSTM-4 | 1.245 | 47M | Fast-Slow Recurrent Neural Networks |
| Large mLSTM +emb +WN +VD | 1.24 | 46M | Multiplicative LSTM for sequence modelling |
| 3-layer AWD-LSTM | 1.232 | 47M | An Analysis of Neural Language Modeling at Multiple Scales |
| Mogrifier LSTM | 1.122 | 96M | Mogrifier LSTM |
| 12-layer Character Transformer Model | 1.11 | 44M | Character-Level Language Modeling with Deeper Self-Attention |
| mLSTM + dynamic eval | 1.08 | 46M | Dynamic Evaluation of Neural Sequence Models |
| 12-layer Transformer-XL | 1.06 | 41M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| 64-layer Character Transformer Model | 1.06 | 235M | Character-Level Language Modeling with Deeper Self-Attention |
| 18-layer Transformer-XL | 1.03 | 88M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| Longformer Small | 1.00 | 41M | Longformer: The Long-Document Transformer |
| Longformer Large | 0.99 | 102M | Longformer: The Long-Document Transformer |
| 24-layer Transformer-XL | 0.99 | 277M | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context |
| Mogrifier LSTM + dynamic eval | 0.988 | 96M | Mogrifier LSTM |
| Compressive Transformer | 0.97 | - | Compressive Transformers for Long-Range Sequence Modelling |
| Transformer-XL + RMS dynamic eval | 0.94 | 277M | Dynamic Evaluation of Transformer Language Models |