Improved Conditional VRNNs for Video Prediction

Lluis Castrejon; Nicolas Ballas; Aaron Courville

Abstract

Predicting future frames for a video sequence is a challenging generative modeling task. Promising approaches include probabilistic latent variable models such as the Variational Auto-Encoder (VAE). While VAEs can handle uncertainty and model multiple possible future outcomes, they tend to produce blurry predictions. In this work we argue that this is a sign of underfitting. To address this issue, we propose to increase the expressiveness of the latent distributions and to use higher-capacity likelihood models. Our approach relies on a hierarchy of latent variables, which defines a family of flexible prior and posterior distributions in order to better model the probability of future sequences. We validate our proposal through a series of ablation experiments and compare our approach to current state-of-the-art latent variable models. Our method performs favorably under several metrics on three different datasets.
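To make the idea of a hierarchy of latent variables more concrete, the following is a minimal sketch, not the authors' implementation. It assumes diagonal Gaussian priors and posteriors and shows how the KL term of a variational objective could be accumulated level by level, with each level conditioned on the latent sampled at the level above. The names `gaussian_kl`, `sample`, `hierarchical_kl`, `prior_fn`, and `posterior_fn` are hypothetical, and the recurrence over time and the conditioning on video frames are deliberately left out.

```python
import math
import random


def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2 * sp ** 2) - 0.5
    return total


def sample(mu, sigma, rng):
    """Reparameterized sample: mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def hierarchical_kl(levels, rng):
    """Sum the KL terms over a hierarchy of latent levels.

    `levels` is a list of (prior_fn, posterior_fn) pairs ordered from the top
    level down. Each function (hypothetical, for illustration) takes the latent
    sampled at the level above (None at the top) and returns (mu, sigma).
    """
    parent = None
    kl = 0.0
    for prior_fn, posterior_fn in levels:
        mu_p, sigma_p = prior_fn(parent)
        mu_q, sigma_q = posterior_fn(parent)
        kl += gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)
        # The next level is conditioned on a sample from this level's posterior.
        parent = sample(mu_q, sigma_q, rng)
    return kl


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy two-level hierarchy: the lower level's mean depends on the upper latent.
    toy_levels = [
        (lambda z: ([0.0], [1.0]), lambda z: ([1.0], [1.0])),
        (lambda z: ([z[0]], [1.0]), lambda z: ([z[0] + 0.5], [0.5])),
    ]
    print("total KL:", hierarchical_kl(toy_levels, rng))
```

In this sketch, adding levels lets the joint prior and posterior over all latents become non-Gaussian even though each conditional is Gaussian, which is one way such a hierarchy can yield more flexible distributions.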

Code Repositories

facebookresearch/improved_vrnn (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
video-generation-on-bair-robot-pushing | VRNN 1L | Cond: 2; FVD score: 149.22; LPIPS: 0.058±0.03; Pred: 28; SSIM: 0.829±0.06; Train: 10
video-generation-on-bair-robot-pushing | Hier-VRNN | Cond: 2; FVD score: 143.4; LPIPS: 0.055±0.03; Pred: 28; SSIM: 0.822±0.06; Train: 10
video-prediction-on-cityscapes-128x128 | Hier-VRNN | Cond: 2; FVD: 567.51; LPIPS: 0.264±0.07; Pred: 28; SSIM: 0.628±0.1; Train: 10
