PredRNN++: Towards A Resolution of the Deep-in-Time Dilemma in Spatiotemporal Predictive Learning
Yunbo Wang; Zhifeng Gao; Mingsheng Long; Jianmin Wang; Philip S. Yu

Abstract
We present PredRNN++, an improved recurrent network for video predictive learning. In pursuit of greater spatiotemporal modeling capability, our approach increases the transition depth between adjacent states by leveraging a novel recurrent unit, named Causal LSTM, which reorganizes the spatial and temporal memories in a cascaded mechanism. However, a dilemma remains in video predictive learning: increasingly deep-in-time models have been designed to capture complex variations, but they make gradient back-propagation more difficult. To alleviate this undesirable effect, we propose a Gradient Highway architecture, which provides alternative, shorter routes for gradient flow from outputs back to long-range inputs. This architecture works seamlessly with Causal LSTMs, enabling PredRNN++ to capture short-term and long-term dependencies adaptively. We assess our model on both synthetic and real video datasets, showing its ability to ease the vanishing gradient problem and yield state-of-the-art prediction results even in a difficult object occlusion scenario.
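To make the idea of a "shorter route" for gradients concrete, the following is a minimal sketch of a highway-style recurrent update. The abstract does not give the equations, so everything here is an illustrative assumption: the gated interpolation form, the use of dense matrices in place of convolutions, and the names `highway_step`, `W_px`, `W_pz`, `W_sx`, and `W_sz`. The point it illustrates is that when the switch gate `s` is close to 0, the previous state passes through almost unchanged, which gives gradients a near-identity path across time steps.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def highway_step(x_t, z_prev, params):
    """One highway-style update (illustrative, not the authors' code).

    z_t = s * p + (1 - s) * z_prev, where p is a candidate state and
    s is a switch gate in (0, 1).
    """
    # Candidate state computed from the current input and the previous state.
    p = np.tanh(x_t @ params["W_px"] + z_prev @ params["W_pz"])
    # Switch gate: how much of the candidate replaces the previous state.
    s = sigmoid(x_t @ params["W_sx"] + z_prev @ params["W_sz"])
    # Gated interpolation; with s near 0 the old state is carried forward.
    return s * p + (1.0 - s) * z_prev


# Hypothetical toy setup: 10 time steps of 4-dimensional inputs.
rng = np.random.default_rng(0)
dim = 4
params = {
    name: rng.normal(scale=0.1, size=(dim, dim))
    for name in ("W_px", "W_pz", "W_sx", "W_sz")
}

z = np.zeros(dim)
for x_t in rng.normal(size=(10, dim)):
    z = highway_step(x_t, z, params)
```

Because each step is a convex combination of a `tanh` output and the previous state, the state stays bounded in (-1, 1) when it starts at zero.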
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| video-prediction-on-kth | PredRNN++ | Cond: 10, Pred: 20, PSNR: 28.47, SSIM: 0.865 |
| video-prediction-on-moving-mnist | Causal LSTM | MAE: 106.8, MSE: 46.5, SSIM: 0.898 |
| video-prediction-on-synpickvp | PredRNN++ | LPIPS: 0.053, MSE: 51.73, PSNR: 27.50, SSIM: 0.894 |
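For readers unfamiliar with the pixel-level metrics in the table, here is a small sketch of how MSE, MAE, and PSNR are commonly computed between a predicted frame and a ground-truth frame. It is not the evaluation code behind the numbers above: the function names, the `max_val` parameter, and the choice to average over pixels are assumptions. Benchmarks differ in whether errors are averaged or summed per frame, so values from this sketch are not directly comparable to the table.

```python
import numpy as np


def mse(pred, target):
    """Mean squared error, averaged over all pixels."""
    return float(np.mean((pred - target) ** 2))


def mae(pred, target):
    """Mean absolute error, averaged over all pixels."""
    return float(np.mean(np.abs(pred - target)))


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB for pixel values in [0, max_val]."""
    err = mse(pred, target)
    if err == 0.0:
        return float("inf")  # identical frames
    return float(10.0 * np.log10(max_val ** 2 / err))


# Toy frames: an all-zero target and a prediction that is off by 0.5 everywhere.
target = np.zeros((2, 2))
pred = np.full((2, 2), 0.5)

frame_mse = mse(pred, target)
frame_mae = mae(pred, target)
frame_psnr = psnr(pred, target)
```

SSIM and LPIPS are omitted because they depend on windowing choices and a pretrained network, respectively, and are usually taken from an existing library.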