Edouard Grave; Armand Joulin; Nicolas Usunier

Abstract
We propose an extension to neural network language models to adapt their predictions to the recent history. Our model is a simplified version of memory-augmented networks, which stores past hidden activations as memory and accesses them through a dot product with the current hidden activation. This mechanism is very efficient and scales to very large memory sizes. We also draw a link between the use of external memory in neural networks and cache models used with count-based language models. We demonstrate on several language modelling datasets that our approach performs significantly better than recent memory-augmented networks.
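As a rough illustration of the mechanism the abstract describes, below is a minimal NumPy sketch of a cache distribution built from stored hidden activations, scored by a dot product with the current hidden state and then combined with the base model's distribution by linear interpolation. The function and parameter names (`cache_distribution`, `theta`, `lam`) are illustrative choices, not taken from the paper's code, and the interpolation shown is just one simple way to combine the two distributions.

```python
import numpy as np

def cache_distribution(h_t, hist_h, hist_next_words, vocab_size, theta=0.3):
    """Cache probability over the vocabulary, built from past hidden states.

    h_t             : current hidden state, shape (d,)
    hist_h          : stacked past hidden states, shape (t, d)
    hist_next_words : word id that followed each stored hidden state, shape (t,)
    theta           : scaling of the dot-product scores (illustrative name)
    """
    scores = hist_h @ h_t                         # dot product with each stored activation
    weights = np.exp(theta * scores)              # unnormalised cache weights
    p_cache = np.zeros(vocab_size)
    np.add.at(p_cache, hist_next_words, weights)  # scatter weights onto the words they preceded
    return p_cache / p_cache.sum()

def interpolate(p_model, p_cache, lam=0.1):
    """Linearly interpolate the model and cache distributions (lam is illustrative)."""
    return (1.0 - lam) * p_model + lam * p_cache
```

Because the cache only stores hidden states and the words that followed them, it can be added to a pre-trained language model at test time without any retraining; the memory size (100 or 2,000 in the benchmarks below) is simply the number of past activations kept.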
Benchmarks
| Benchmark | Model | Test perplexity |
|---|---|---|
| Language Modelling on WikiText-103 | Neural cache model (size = 100) | 44.8 |
| Language Modelling on WikiText-103 | Neural cache model (size = 2,000) | 40.8 |
| Language Modelling on WikiText-103 | LSTM | 48.7 |
| Language Modelling on WikiText-2 | Grave et al. (2016), LSTM | 99.3 |
| Language Modelling on WikiText-2 | Grave et al. (2016), LSTM + continuous cache pointer | 68.9 |