| LSTM-8192-1024 + CNN Input | 1.04B | 30.0 | Exploring the Limits of Language Modeling | |
| 10 LSTM+CNN inputs + SNM10-SKIP (ensemble) | 43B | 23.7 | Exploring the Limits of Language Modeling | |
| Adaptive Input Very Large | 1.0B | 23.02 | Adaptive Input Representations for Neural Language Modeling | |