Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap

Abstract
We present the Compressive Transformer, an attentive sequence model which compresses past memories for long-range sequence learning. We find the Compressive Transformer obtains state-of-the-art language modelling results in the WikiText-103 and Enwik8 benchmarks, achieving 17.1 ppl and 0.97 bpc respectively. We also find it can model high-frequency speech effectively and can be used as a memory mechanism for RL, demonstrated on an object matching task. To promote the domain of long-range sequence learning, we propose a new open-vocabulary language modelling benchmark derived from books, PG-19.
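The core idea is a two-level memory: a FIFO memory of recent activations, whose evicted entries are compressed (rather than discarded) into a longer-range compressed memory. The sketch below is a minimal, hypothetical illustration of that update step, not the paper's implementation: the function name `update_memories`, the tensor shapes, and the use of mean pooling as the compression function (one of several options the paper discusses) are all assumptions for illustration.

```python
import torch
import torch.nn.functional as F


def update_memories(memory, comp_memory, new_hidden, compression_rate=3):
    """Hypothetical sketch of a Compressive Transformer memory update.

    memory:      [mem_len, batch, d_model]  FIFO of recent hidden states
    comp_memory: [cmem_len, batch, d_model] compressed older memories
    new_hidden:  [seq_len, batch, d_model]  activations from the current segment
    """
    seq_len = new_hidden.size(0)

    # The oldest `seq_len` entries are evicted from the FIFO memory...
    evicted, kept = memory[:seq_len], memory[seq_len:]

    # ...and compressed by a factor `compression_rate`. Mean pooling is used
    # here as the simplest stand-in for the compression function.
    # [seq, batch, d] -> [batch, d, seq] so pooling runs over the time axis.
    compressed = F.avg_pool1d(
        evicted.permute(1, 2, 0),
        kernel_size=compression_rate,
        stride=compression_rate,
    ).permute(2, 0, 1)

    # New activations enter the FIFO memory; compressed chunks are appended
    # to the compressed memory (truncation to fixed lengths omitted).
    memory = torch.cat([kept, new_hidden], dim=0)
    comp_memory = torch.cat([comp_memory, compressed], dim=0)
    return memory, comp_memory
```

Attention at each layer then attends over the concatenation of the compressed memory, the FIFO memory, and the current segment, which is what extends the effective context beyond a plain Transformer-XL-style memory of the same size.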
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| language-modelling-on-enwiki8 | Compressive Transformer (24 layers) | Bits per character (BPC): 0.97; Parameters: 277M |
| language-modelling-on-hutter-prize | Compressive Transformer | Bits per character (BPC): 0.97 |
| language-modelling-on-wikitext-103 | Compressive Transformer (18L, M=1024) | Test perplexity: 17.1; Validation perplexity: 16.0 |
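Both metrics in the table are monotone transforms of the model's per-token cross-entropy loss: bits per character divides a per-character loss in nats by ln 2, and perplexity exponentiates a per-word loss. The snippet below shows the conversions; the specific loss values used are back-calculated from the reported 0.97 BPC and 17.1 perplexity and are illustrative only.

```python
import math


def bits_per_character(loss_nats: float) -> float:
    """Convert a per-character cross-entropy loss in nats to BPC."""
    return loss_nats / math.log(2)


def perplexity(loss_nats: float) -> float:
    """Convert a per-word cross-entropy loss in nats to perplexity."""
    return math.exp(loss_nats)


# ~0.672 nats/char corresponds to ~0.97 BPC (Enwik8);
# ~2.839 nats/word corresponds to ~17.1 perplexity (WikiText-103).
print(bits_per_character(0.672))  # ~0.97
print(perplexity(2.839))          # ~17.1
```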