UnICORNN: A recurrent model for learning very long time dependencies

T. Konstantin Rusch; Siddhartha Mishra

Abstract

Designing recurrent neural networks (RNNs) that accurately process sequential inputs with long-time dependencies is very challenging on account of the exploding and vanishing gradient problem. To overcome this, we propose a novel RNN architecture based on a structure-preserving discretization of a Hamiltonian system of second-order ordinary differential equations that models networks of oscillators. The resulting RNN is fast, invertible (in time), and memory efficient, and we derive rigorous bounds on the hidden state gradients to prove that it mitigates the exploding and vanishing gradient problem. A suite of experiments is presented to demonstrate that the proposed RNN provides state-of-the-art performance on a variety of learning tasks with (very) long-time dependencies.
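To make the "structure-preserving discretization" and "invertible (in time)" claims concrete, here is a minimal sketch in plain Python. It is not the UnICORNN layer from the paper or the official repository: the single scalar oscillator, the `tanh` force, and the parameter names `w`, `alpha`, and `dt` are all assumptions chosen for illustration. It applies a symplectic-Euler step to a driven second-order system and shows that each step can be undone exactly. This is the kind of property that allows hidden states to be recomputed backwards rather than stored.

```python
import math

def force(y, u, w, alpha):
    """Restoring force of one illustrative oscillator driven by input u (assumed form)."""
    return -(math.tanh(w * y + u) + alpha * y)

def step(y, z, u, dt, w=1.0, alpha=0.9):
    """One symplectic-Euler step: update velocity z first, then position y."""
    z_new = z + dt * force(y, u, w, alpha)
    y_new = y + dt * z_new
    return y_new, z_new

def inverse_step(y_new, z_new, u, dt, w=1.0, alpha=0.9):
    """Undo `step` algebraically: recover the old position first, then the old velocity."""
    y = y_new - dt * z_new
    z = z_new - dt * force(y, u, w, alpha)
    return y, z

def run(inputs, dt=0.1):
    """Drive the oscillator forward from rest over the whole input sequence."""
    y, z = 0.0, 0.0
    for u in inputs:
        y, z = step(y, z, u, dt)
    return y, z

def rewind(y, z, inputs, dt=0.1):
    """Reconstruct the initial state from the final one, using no stored intermediates."""
    for u in reversed(inputs):
        y, z = inverse_step(y, z, u, dt)
    return y, z

inputs = [math.sin(0.3 * k) for k in range(200)]  # hypothetical input sequence
y_T, z_T = run(inputs)
y_0, z_0 = rewind(y_T, z_T, inputs)
print(abs(y_0) < 1e-9 and abs(z_0) < 1e-9)
```

Because `inverse_step` only needs the current state and the input at that step, the forward trajectory never has to be kept in memory. In floating-point arithmetic the reconstruction is accurate up to rounding error rather than bit-exact.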

Code Repositories

tk-rusch/unicornn (official, PyTorch)

Benchmarks

Benchmark                                      Methodology  Metrics
sentiment-analysis-on-imdb                     UnICORNN     Accuracy: 88.4
sequential-image-classification-on-noise       UnICORNN     % Test Accuracy: 62.4
sequential-image-classification-on-sequential  UnICORNN     Permuted Accuracy: 98.4
time-series-classification-on-eigenworms       IndRNN       % Test Accuracy: 49.7
time-series-classification-on-eigenworms       coRNN        % Test Accuracy: 86.7
time-series-classification-on-eigenworms       UnICORNN     % Test Accuracy: 90.3
time-series-classification-on-eigenworms       expRNN       % Test Accuracy: 40.0
