
Recurrent Batch Normalization

Tim Cooijmans; Nicolas Ballas; César Laurent; Çağlar Gülçehre; Aaron Courville

Abstract

We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks. Whereas previous works only apply batch normalization to the input-to-hidden transformation of RNNs, we demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition, thereby reducing internal covariate shift between time steps. We evaluate our proposal on various sequential problems such as sequence classification, language modeling and question answering. Our empirical results show that our batch-normalized LSTM consistently leads to faster convergence and improved generalization.
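The reparameterization described above can be illustrated with a single recurrent step. The following is a minimal sketch in plain Python, not the authors' implementation: it assumes the input-to-hidden and hidden-to-hidden projections are normalized separately before being summed, and that the cell state is normalized before it feeds the output. The helper names (`batch_norm`, `matmul`, `bn_lstm_step`), the shared gain `gamma=0.1`, and the zero shift are illustrative assumptions. Only training-mode batch statistics are used, and the handling of statistics at inference time is left out.

```python
import math
import random


def batch_norm(rows, gamma, beta=0.0, eps=1e-5):
    """Normalize each feature (column) over the batch (rows), then scale and shift."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(col) / n for col in cols]
    variances = [sum((v - m) ** 2 for v in col) / n for col, m in zip(cols, means)]
    return [
        [gamma * (v - m) / math.sqrt(var + eps) + beta
         for v, m, var in zip(row, means, variances)]
        for row in rows
    ]


def matmul(a, b):
    """Multiply a (batch x n_in) by b (n_in x n_out), both as lists of rows."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*b)] for row in a]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bn_lstm_step(x, h_prev, c_prev, W_x, W_h, bias, gamma=0.1):
    """One LSTM step with batch-normalized input and recurrent projections.

    Hypothetical layout: W_x is (n_in x 4*hidden), W_h is (hidden x 4*hidden),
    and the four gate blocks are ordered forget, input, output, candidate.
    """
    hidden = len(h_prev[0])
    # Normalize the two projections separately (assumption of this sketch).
    x_proj = batch_norm(matmul(x, W_x), gamma)
    h_proj = batch_norm(matmul(h_prev, W_h), gamma)

    c_new, out_gates = [], []
    for xp, hp, c_row in zip(x_proj, h_proj, c_prev):
        pre = [a + b + bj for a, b, bj in zip(xp, hp, bias)]
        f, i, o, g = (pre[q * hidden:(q + 1) * hidden] for q in range(4))
        c_new.append([sigmoid(fj) * cj + sigmoid(ij) * math.tanh(gj)
                      for fj, ij, gj, cj in zip(f, i, g, c_row)])
        out_gates.append([sigmoid(oj) for oj in o])

    # Normalize the new cell state across the batch before the output nonlinearity.
    c_norm = batch_norm(c_new, gamma)
    h_new = [[oj * math.tanh(cj) for oj, cj in zip(o_row, c_row)]
             for o_row, c_row in zip(out_gates, c_norm)]
    return h_new, c_new


# Toy usage: random weights, a batch of 4 sequences, 5 time steps.
random.seed(0)
batch, n_in, hidden = 4, 3, 2
W_x = [[random.gauss(0, 1) for _ in range(4 * hidden)] for _ in range(n_in)]
W_h = [[random.gauss(0, 1) for _ in range(4 * hidden)] for _ in range(hidden)]
bias = [0.0] * (4 * hidden)
h = [[0.0] * hidden for _ in range(batch)]
c = [[0.0] * hidden for _ in range(batch)]
for _ in range(5):
    x = [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(batch)]
    h, c = bn_lstm_step(x, h, c, W_x, W_h, bias)
```

Normalizing the two projections separately gives each its own gain, so the relative contribution of the input and the recurrent state can be adjusted independently. The unnormalized cell state `c_new` is what gets carried to the next step in this sketch, so the normalization does not interfere with the additive cell update.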

Code Repositories

cooijmanstim/recurrent-batch-normalization (pytorch, mentioned in GitHub)

codedecde/Recognizing-Textual-Entailment (pytorch, mentioned in GitHub)

Tetsuya-Nishikawa/ConvLSTM_DEMO (mentioned in GitHub)

Benchmarks

Benchmark	Methodology	Metrics
language-modelling-on-text8	BN LSTM	Bit per Character (BPC): 1.36; Number of params: 16M
sequential-image-classification-on-sequential	BN LSTM	Permuted Accuracy: 95.4%; Unpermuted Accuracy: 99%
