5 months ago

Lipschitz Recurrent Neural Networks

N. Benjamin Erichson; Omri Azencot; Alejandro Queiruga; Liam Hodgkinson; Michael W. Mahoney

Abstract

Viewing recurrent neural networks (RNNs) as continuous-time dynamical systems, we propose a recurrent unit that describes the hidden state's evolution with two parts: a well-understood linear component plus a Lipschitz nonlinearity. This particular functional form facilitates stability analysis of the long-term behavior of the recurrent unit using tools from nonlinear systems theory. In turn, this enables architectural design decisions before experimentation. Sufficient conditions for global stability of the recurrent unit are obtained, motivating a novel scheme for constructing hidden-to-hidden matrices. Our experiments demonstrate that the Lipschitz RNN can outperform existing recurrent units on a range of benchmark tasks, including computer vision, language modeling and speech prediction tasks. Finally, through Hessian-based analysis we demonstrate that our Lipschitz recurrent unit is more robust with respect to input and parameter perturbations as compared to other continuous-time RNNs.
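The "linear component plus a Lipschitz nonlinearity" structure described in the abstract can be illustrated with a small sketch. The abstract does not spell out the update rule, so everything specific here is an assumption made for illustration: the continuous-time form dh/dt = A h + tanh(W h + U x + b), the choice of tanh as the 1-Lipschitz nonlinearity, the forward-Euler discretization with step size `eps`, and the names `A`, `W`, `U`, `b`, `lipschitz_rnn_step`, and `run`. It is not the authors' implementation; see the official repository for that.

```python
import math


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lipschitz_rnn_step(h, x, A, W, U, b, eps=0.1):
    """One forward-Euler step of the assumed dynamics
    dh/dt = A h + tanh(W h + U x + b).

    A h is the linear component; tanh(...) is the Lipschitz nonlinearity.
    """
    Ah = matvec(A, h)
    Wh = matvec(W, h)
    Ux = matvec(U, x)
    return [
        h_i + eps * (ah_i + math.tanh(wh_i + ux_i + b_i))
        for h_i, ah_i, wh_i, ux_i, b_i in zip(h, Ah, Wh, Ux, b)
    ]


def run(h0, inputs, A, W, U, b, eps=0.1):
    """Unroll the recurrent unit over a sequence of input vectors."""
    h = list(h0)
    for x in inputs:
        h = lipschitz_rnn_step(h, x, A, W, U, b, eps)
    return h


def distance(u, v):
    """Euclidean distance between two hidden states."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(u, v)))


# Hypothetical 2-dimensional example. The symmetric part of A is -I, so the
# linear component alone is stable; W is small and skew-symmetric.
A = [[-1.0, 0.5], [-0.5, -1.0]]
W = [[0.0, 0.3], [-0.3, 0.0]]
U = [[1.0], [0.0]]
b = [0.0, 0.0]

inputs = [[math.sin(0.2 * t)] for t in range(50)]
h_a = run([1.0, -1.0], inputs, A, W, U, b)
h_b = run([-2.0, 0.5], inputs, A, W, U, b)
print("initial gap:", distance([1.0, -1.0], [-2.0, 0.5]))
print("final gap:  ", distance(h_a, h_b))
```

With these illustrative matrices, two trajectories driven by the same input but started from different hidden states move closer together over time, which is the kind of long-term behavior that a stability analysis of the unit is meant to characterize.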

Code Repositories

erichson/LipschitzRNN (official; PyTorch; mentioned on GitHub)

Benchmarks

Benchmark | Methodology | Metrics
sequential-image-classification-on-noise | Lipschitz RNN | Test Accuracy: 59.0%
sequential-image-classification-on-sequential | Lipschitz RNN | Permuted Accuracy: 96.3%; Unpermuted Accuracy: 99.4%
sequential-image-classification-on-sequential-1 | Lipschitz RNN | Unpermuted Accuracy: 64.2%
