Wei Luo, Feng Yu
Abstract
Recurrent neural networks (RNNs) are challenging to train, let alone those with deep spatial structures. Architectures built upon highway connections, such as the Recurrent Highway Network (RHN), were developed to allow greater step-to-step transition depth, leading to more expressive models. However, problems that require capturing long-term dependencies still cannot be addressed well by these models. Moreover, the ability to keep long-term memories tends to diminish as the spatial depth increases, since a deeper structure may accelerate gradient vanishing. In this paper, we address these issues by proposing a novel RNN architecture based on the RHN, namely the Recurrent Highway Network with Grouped Auxiliary Memory (GAM-RHN). The proposed architecture interconnects the RHN with a set of auxiliary memory units dedicated to storing long-term information via read and write operations, analogous to Memory Augmented Neural Networks (MANNs). Experimental results on artificial long-time-lag tasks show that GAM-RHNs can be trained efficiently while being deep in both time and space. We also evaluate the proposed architecture on a variety of tasks, including language modeling, sequential image classification, and financial market forecasting. The potential of our approach is demonstrated by achieving state-of-the-art results on these tasks.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| language-modelling-on-penn-treebank-character | GAM-RHN-5 | Bit per Character (BPC): 1.147; Number of params: 16.0M |
| language-modelling-on-text8 | GAM-RHN-10 | Bit per Character (BPC): 1.157; Number of params: 44.7M |
| sequential-image-classification-on-sequential | GAM-RHN-1 | Permuted Accuracy: 96.8% |
| stock-trend-prediction-on-fi-2010 | BL-GAM-RHN-7 | Accuracy (H50): 0.8202; F1 (H50): 0.8088 |