T. Konstantin Rusch; Siddhartha Mishra; N. Benjamin Erichson; Michael W. Mahoney

Abstract
We propose a novel method called Long Expressive Memory (LEM) for learning long-term sequential dependencies. LEM is gradient-based, it can efficiently process sequential tasks with very long-term dependencies, and it is sufficiently expressive to be able to learn complicated input-output maps. To derive LEM, we consider a system of multiscale ordinary differential equations, as well as a suitable time-discretization of this system. For LEM, we derive rigorous bounds to show the mitigation of the exploding and vanishing gradients problem, a well-known challenge for gradient-based recurrent sequential learning methods. We also prove that LEM can approximate a large class of dynamical systems to high accuracy. Our empirical results, ranging from image and time-series classification through dynamical systems prediction to speech recognition and language modeling, demonstrate that LEM outperforms state-of-the-art recurrent neural networks, gated recurrent units, and long short-term memory models.
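The abstract describes LEM as the time-discretization of a system of multiscale ordinary differential equations, with two coupled hidden states evolving on learned, input-dependent time scales. The exact update equations are given in the paper; as a rough illustration, here is a minimal PyTorch sketch of a two-state, gated multiscale recurrence of that kind. The class name `LEMCell`, the layer names, the default step size `dt`, and the precise gating form are all assumptions for the sketch, not the paper's notation.

```python
import torch
import torch.nn as nn

class LEMCell(nn.Module):
    """Sketch of a two-state, gated multiscale recurrence (assumed form,
    not the paper's exact equations). The sigmoid gates dt1/dt2 act as
    learned, input-dependent time steps that rescale an explicit
    Euler-style discretization of two coupled ODE states y and z."""

    def __init__(self, input_size: int, hidden_size: int, dt: float = 1.0):
        super().__init__()
        self.dt = dt
        # Each linear map consumes the previous state and the current input.
        self.gate1 = nn.Linear(input_size + hidden_size, hidden_size)
        self.gate2 = nn.Linear(input_size + hidden_size, hidden_size)
        self.z_net = nn.Linear(input_size + hidden_size, hidden_size)
        self.y_net = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, u, y, z):
        yu = torch.cat([y, u], dim=-1)
        # Learned time steps in (0, dt): one per state, per unit, per step.
        dt1 = self.dt * torch.sigmoid(self.gate1(yu))
        dt2 = self.dt * torch.sigmoid(self.gate2(yu))
        # Explicit discretization of the two coupled states.
        z = (1.0 - dt1) * z + dt1 * torch.tanh(self.z_net(yu))
        y = (1.0 - dt2) * y + dt2 * torch.tanh(self.y_net(torch.cat([z, u], dim=-1)))
        return y, z

# Usage on a toy sequence: (time, batch, features).
cell = LEMCell(input_size=1, hidden_size=64)
u_seq = torch.randn(784, 32, 1)
y = z = torch.zeros(32, 64)
for u_t in u_seq:
    y, z = cell(u_t, y, z)
```

Because every unit carries its own state-dependent step size, different units can integrate on different time scales, which is one plausible reading of how a multiscale system can track both fast and very slow dependencies.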
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| Sequential image classification (noise-padded CIFAR-10) | LEM | Test accuracy: 60.5% |
| Sequential image classification (sequential MNIST) | LEM | Permuted accuracy: 96.6%; unpermuted accuracy: 99.5% |
| Time-series classification (EigenWorms) | LEM | Test accuracy: 92.3% |
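In the sequential-image benchmarks above, an image is read one pixel at a time, and the "permuted" variant shuffles that pixel order with a fixed random permutation, which destroys local structure and lengthens the effective dependencies. As an illustration of the protocol (reusing the `LEMCell` sketch above; the readout layer and permutation handling are assumptions, not the paper's training code):

```python
import torch
import torch.nn as nn

# Unroll each 28x28 image into a length-784 sequence of single pixels.
images = torch.randn(32, 28, 28)          # stand-in batch of images
pixels = images.reshape(32, 784, 1)       # (batch, time, features)
perm = torch.randperm(784)                # fixed permutation for the "permuted" variant
permuted = pixels[:, perm, :]

cell = LEMCell(input_size=1, hidden_size=128)   # LEMCell sketch from above
readout = nn.Linear(128, 10)                    # linear map to class logits

y = z = torch.zeros(32, 128)
for u_t in permuted.transpose(0, 1):            # iterate over the 784 time steps
    y, z = cell(u_t, y, z)
logits = readout(y)                             # (batch, num_classes)
```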