SimVP: Simpler yet Better Video Prediction

Zhangyang Gao, Cheng Tan, Lirong Wu, Stan Z. Li

Abstract

From CNNs and RNNs to ViTs, video prediction has advanced remarkably through auxiliary inputs, elaborate neural architectures, and sophisticated training strategies. We admire this progress but question whether it is all necessary: is there a simple method that can perform comparably well? This paper proposes SimVP, a simple video prediction model built entirely on CNNs and trained end-to-end with MSE loss. Without any additional tricks or complicated strategies, SimVP achieves state-of-the-art performance on five benchmark datasets. Extended experiments demonstrate that SimVP generalizes and extends well to real-world datasets. Its significantly reduced training cost makes it easier to scale to complex scenarios. We believe SimVP can serve as a solid baseline to stimulate further development of video prediction. The code is available on GitHub: https://github.com/gaozhangyang/SimVP-Simpler-yet-Better-Video-Prediction
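To make "built entirely on CNNs and trained end-to-end with MSE loss" concrete, here is a minimal PyTorch sketch. It is not the authors' implementation: the class name `TinyCNNPredictor`, the layer sizes, the encoder/translator/decoder split, and the random tensors standing in for video clips are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class TinyCNNPredictor(nn.Module):
    """Hypothetical CNN-only predictor (illustrative, not the official SimVP)."""

    def __init__(self, frames=10, channels=1, hidden=16):
        super().__init__()
        self.hidden = hidden
        # Encoder: per-frame spatial features, downsampling by 2.
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=3, stride=2, padding=1),
            nn.GroupNorm(2, hidden),
            nn.LeakyReLU(0.2),
        )
        # Translator: frames are stacked on the channel axis so that plain
        # 2D convolutions can mix information across time.
        self.translator = nn.Sequential(
            nn.Conv2d(frames * hidden, frames * hidden, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(frames * hidden, frames * hidden, kernel_size=3, padding=1),
        )
        # Decoder: per-frame upsampling back to the input resolution.
        self.decoder = nn.ConvTranspose2d(
            hidden, channels, kernel_size=4, stride=2, padding=1
        )

    def forward(self, x):
        b, t, c, h, w = x.shape
        z = self.encoder(x.reshape(b * t, c, h, w))
        hh, ww = z.shape[-2:]
        z = self.translator(z.reshape(b, t * self.hidden, hh, ww))
        y = self.decoder(z.reshape(b * t, self.hidden, hh, ww))
        return y.reshape(b, t, c, h, w)


torch.manual_seed(0)
model = TinyCNNPredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random stand-ins for a batch of 10 past frames and 10 future frames.
past = torch.rand(2, 10, 1, 32, 32)
future = torch.rand(2, 10, 1, 32, 32)

# One end-to-end training step with plain MSE loss.
pred = model(past)
loss = nn.functional.mse_loss(pred, future)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The sketch predicts as many frames as it receives and expects even spatial dimensions; the only training signal is the MSE between predicted and target frames.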

Code Repositories

chengtan9907/simvpv2 (PyTorch, mentioned in GitHub)
chengtan9907/OpenSTL (PyTorch, mentioned in GitHub)

Benchmarks

Benchmark: video-prediction-on-human36m
Methodology: SimVP
Metrics: MAE 1510, MSE 316, SSIM 0.904

Benchmark: video-prediction-on-moving-mnist
Methodology: SimVP
Metrics: MSE 23.8, SSIM 0.948
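The page does not say how the MSE and MAE figures are aggregated. One common convention in video prediction benchmarks, assumed here and not confirmed by this page, is to sum the error over all pixels of a frame and then average over frames and samples. A small NumPy sketch of that convention follows; `frame_metrics` is a hypothetical helper name.

```python
import numpy as np


def frame_metrics(pred, true):
    """Per-frame summed MSE and MAE for arrays shaped (N, T, C, H, W).

    Assumed convention: sum over C, H, W within each frame, then average
    over samples N and frames T.
    """
    err = pred.astype(np.float64) - true.astype(np.float64)
    mse = np.mean(np.sum(err ** 2, axis=(2, 3, 4)))
    mae = np.mean(np.sum(np.abs(err), axis=(2, 3, 4)))
    return mse, mae


# Toy check: every pixel is off by 0.5 on 4x4 single-channel frames.
true = np.zeros((1, 2, 1, 4, 4))
pred = np.full((1, 2, 1, 4, 4), 0.5)
mse, mae = frame_metrics(pred, true)
print(mse, mae)  # 4.0 8.0
```

Under this convention the values depend on frame resolution and pixel scaling, so numbers are only comparable across methods that share both.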
