Video Frame Interpolation with Transformer

Liying Lu, Ruizheng Wu, Huaijia Lin, Jiangbo Lu, Jiaya Jia

Abstract

Video frame interpolation (VFI), which aims to synthesize intermediate frames of a video, has made remarkable progress with the development of deep convolutional networks in recent years. Existing methods built on convolutional networks generally struggle with large motion because convolution operations are local. To overcome this limitation, we introduce a novel framework that uses a Transformer to model long-range pixel correlations among video frames. Further, our network is equipped with a novel cross-scale window-based attention mechanism, in which windows at different scales interact with each other. This design effectively enlarges the receptive field and aggregates multi-scale information. Extensive quantitative and qualitative experiments demonstrate that our method achieves new state-of-the-art results on various benchmarks.
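The abstract does not spell out how cross-scale windows interact, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes that each fine-scale window attends both to its own tokens and to a same-sized window taken from a 2x average-pooled copy of the feature map, so the coarse window covers twice the spatial extent. The function names, the identity query/key/value projections, the edge padding, and the pooling factor are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def window_partition(feat, win):
    # (H, W, C) -> (num_windows, win*win, C), non-overlapping windows.
    H, W, C = feat.shape
    x = feat.reshape(H // win, win, W // win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win * win, C)

def window_reverse(windows, H, W, win):
    # Inverse of window_partition: (num_windows, win*win, C) -> (H, W, C).
    C = windows.shape[-1]
    x = windows.reshape(H // win, W // win, win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def cross_scale_window_attention(feat, win=4):
    # Assumes H and W are divisible by win, and win is divisible by 4.
    feat = np.asarray(feat, dtype=np.float64)
    H, W, C = feat.shape
    fine = window_partition(feat, win)

    # Coarse scale (assumption): 2x average pooling of the same features.
    coarse = feat.reshape(H // 2, 2, W // 2, 2, C).mean(axis=(1, 3))
    # Pad so a win x win coarse window can be centred on every fine window.
    p = win // 4
    coarse = np.pad(coarse, ((p, p), (p, p), (0, 0)), mode="edge")

    out = np.empty_like(fine)
    windows_per_row = W // win
    for idx in range(fine.shape[0]):
        i, j = divmod(idx, windows_per_row)
        r, c = i * (win // 2), j * (win // 2)
        coarse_win = coarse[r:r + win, c:c + win].reshape(-1, C)
        # Keys/values: the window's own tokens plus the wider coarse context.
        kv = np.concatenate([fine[idx], coarse_win], axis=0)
        attn = softmax(fine[idx] @ kv.T / np.sqrt(C))
        out[idx] = attn @ kv
    return window_reverse(out, H, W, win)

# Usage: an 8x8 map with 3 channels keeps its shape after attention.
features = np.random.default_rng(0).standard_normal((8, 8, 3))
print(cross_scale_window_attention(features, win=4).shape)
```

In this sketch a 4x4 fine window sees an 8x8 region of the original map through its coarse window, which illustrates how mixing scales can enlarge the receptive field without attending over the whole frame. A real model would add learned projections, multiple heads, and attention across the input frames.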

Code Repositories

dvlab-research/vfiformer (official implementation, PyTorch; mentioned in GitHub)

Benchmarks

Benchmark: video-frame-interpolation-on-msu-video-frame
Methodology: VFIformer
Metrics:
LPIPS: 0.044
MS-SSIM: 0.942
PSNR: 28.34
SSIM: 0.917
VMAF: 68.87
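For readers unfamiliar with the metrics, PSNR is the simplest to reproduce: it is 10 * log10(MAX^2 / MSE) between a ground-truth frame and the interpolated frame, reported in dB. Higher is better for PSNR, SSIM, MS-SSIM, and VMAF, while lower is better for LPIPS. The sketch below assumes 8-bit frames (MAX = 255); the benchmark's exact evaluation protocol is not stated here, and the function name is illustrative.

```python
import numpy as np

def psnr(reference, estimate, max_value=255.0):
    # Peak signal-to-noise ratio in dB; max_value=255 assumes 8-bit frames.
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return float(10.0 * np.log10(max_value ** 2 / mse))

# Usage: every pixel off by one intensity level gives MSE = 1,
# i.e. 20 * log10(255), roughly 48.13 dB.
truth = np.zeros((4, 4, 3))
print(round(psnr(truth, truth + 1), 2))
```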
