Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
Guozhen Zhang, Yuhan Zhu, Haonan Wang, Youxin Chen, Gangshan Wu, Limin Wang

Abstract
Effectively extracting inter-frame motion and appearance information is important for video frame interpolation (VFI). Previous works either extract both types of information in a mixed way or devise separate modules for each type of information, which leads to representation ambiguity and low efficiency. In this paper, we propose a novel module that explicitly extracts motion and appearance information via a unified operation. Specifically, we rethink the information processing in inter-frame attention and reuse its attention map for both appearance feature enhancement and motion information extraction. Furthermore, for efficient VFI, our proposed module can be seamlessly integrated into a hybrid CNN and Transformer architecture. This hybrid pipeline alleviates the computational complexity of inter-frame attention while preserving detailed low-level structure information. Experimental results demonstrate that, for both fixed- and arbitrary-timestep interpolation, our method achieves state-of-the-art performance on various datasets. Meanwhile, our approach incurs lower computational overhead than models with comparable performance. The source code and models are available at https://github.com/MCG-NJU/EMA-VFI.
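To make the idea of reusing one attention map concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `inter_frame_attention`, the single-head dense attention over all tokens, the residual connection, and the way motion is read out (attention-weighted target coordinates minus source coordinates) are all assumptions made for illustration. Only the frame-0 to frame-1 direction is shown.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def inter_frame_attention(feat0, feat1, coords, w_q, w_k, w_v):
    """Hypothetical single-head inter-frame attention (illustrative only).

    feat0, feat1: (N, C) token features of the two input frames.
    coords:       (N, 2) spatial position of each token.
    w_q, w_k, w_v: (C, C) projection matrices (assumed, not from the paper).
    """
    q = feat0 @ w_q                      # queries come from frame 0
    k = feat1 @ w_k                      # keys come from frame 1
    v = feat1 @ w_v                      # values come from frame 1
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (N, N), rows sum to 1

    # Use 1: appearance enhancement, aggregating frame-1 features into frame 0.
    appearance = feat0 + attn @ v

    # Use 2: motion cue from the SAME map, as the expected position of the
    # match in frame 1 minus the token's own position in frame 0.
    motion = attn @ coords - coords
    return appearance, motion, attn


if __name__ == "__main__":
    # Toy 2x2 grid: frame 1 is frame 0 with its tokens permuted.
    coords = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    feat0 = 10.0 * np.eye(4)             # four easily distinguishable tokens
    perm = np.array([1, 2, 3, 0])        # token i moves to position perm[i]
    feat1 = np.zeros_like(feat0)
    feat1[perm] = feat0
    eye = np.eye(4)                      # identity projections for clarity
    appearance, motion, attn = inter_frame_attention(
        feat0, feat1, coords, eye, eye, eye
    )
    print(np.round(motion, 3))           # per-token displacement estimate
```

In this toy setup the attention map is nearly one-hot, so the motion read-out approximates the displacement of each token, while the same map also drives the feature aggregation. A dense N×N map like this one is costly at full resolution, which is the complexity the abstract says the hybrid CNN and Transformer pipeline alleviates.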
Benchmarks
| Benchmark | Method | PSNR (dB) | SSIM | MS-SSIM | LPIPS | VMAF |
|---|---|---|---|---|---|---|
| video-frame-interpolation-on-msu-video-frame | EMA-VFI | 29.89 | 0.953 | 0.965 | 0.022 | 71.71 |
| video-frame-interpolation-on-snu-film-easy | EMA-VFI | 39.98 | 0.9910 | - | - | - |
| video-frame-interpolation-on-snu-film-extreme | EMA-VFI | 25.69 | 0.8661 | - | - | - |
| video-frame-interpolation-on-snu-film-hard | EMA-VFI | 30.94 | 0.9392 | - | - | - |
| video-frame-interpolation-on-snu-film-medium | EMA-VFI | 36.09 | 0.9801 | - | - | - |
| video-frame-interpolation-on-ucf101-1 | EMA-VFI | 35.48 | 0.9701 | - | - | - |
| video-frame-interpolation-on-vimeo90k | EMA-VFI | 36.64 | 0.9819 | - | - | - |
| video-frame-interpolation-on-x4k1000fps | EMA-VFI | 31.46 | - | - | - | - |
| video-frame-interpolation-on-x4k1000fps-2k | EMA-VFI | 32.85 | - | - | - | - |
| video-frame-interpolation-on-xiph-2k | EMA-VFI | 36.90 | 0.945 | - | - | - |
| video-frame-interpolation-on-xiph-4k-1 | EMA-VFI | 34.67 | 0.907 | - | - | - |
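
For readers unfamiliar with the PSNR figures in the table, the following is a short sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE), applied to an interpolated frame and its ground truth. The function name `psnr` and the 8-bit data range are assumptions. The exact evaluation protocol behind each benchmark (color space, cropping, data range) is not stated here, so this will not necessarily reproduce the reported numbers.

```python
import numpy as np


def psnr(pred, target, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_val is the assumed peak pixel value (255 for 8-bit images).
    """
    diff = pred.astype(np.float64) - target.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.integers(0, 256, size=(64, 64, 3)).astype(np.uint8)
    # Simulate an imperfect interpolation by adding small noise.
    noise = rng.normal(0.0, 2.0, size=target.shape)
    pred = np.clip(target + noise, 0, 255).astype(np.uint8)
    print(f"PSNR: {psnr(pred, target):.2f} dB")
```

Higher PSNR means lower mean squared error against the ground-truth frame. Because the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the MSE.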