Temporal RoI Align for Video Object Recognition

Tao Gong, Kai Chen, Xinjiang Wang, Qi Chu, Feng Zhu, Dahua Lin, Nenghai Yu, Huamin Feng

Abstract

Video object detection is challenging when object appearance deteriorates in certain video frames. It is therefore natural to aggregate temporal information from other frames of the same video into the current frame. However, RoI Align, one of the core operations in video detectors, still extracts proposal features from a single-frame feature map, so the extracted RoI features lack temporal information from the video. In this work, observing that the features of the same object instance are highly similar across frames in a video, we propose a novel Temporal RoI Align operator that uses feature similarity to extract features from the feature maps of other frames for current-frame proposals. The proposed operator can extract temporal information from the entire video for proposals. We integrate it into single-frame video detectors and other state-of-the-art video detectors, and conduct quantitative experiments showing that it consistently and significantly boosts performance. In addition, Temporal RoI Align can also be applied to video instance segmentation. Code is available at https://github.com/open-mmlab/mmtracking
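
The similarity-based extraction described in the abstract can be illustrated with a small sketch. This is not the authors' operator or the mmtracking API: the function names (`cosine`, `softmax`, `aggregate_from_support`), the `top_k` parameter, the softmax weighting, and the final averaging across frames are all assumptions chosen for illustration. RoI features and support-frame feature maps are simplified to flat lists of per-location feature vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_from_support(roi_vectors, support_frames, top_k=2):
    """Hypothetical helper: enrich current-frame RoI features with
    features gathered from other frames by similarity.

    roi_vectors:    list of feature vectors, one per RoI location.
    support_frames: list of frames; each frame is a list of feature
                    vectors, one per feature-map location.
    """
    fused = []
    for q in roi_vectors:
        per_frame = [list(q)]  # start with the current-frame feature
        for frame in support_frames:
            # Pick the top_k most similar locations in this support frame.
            scored = sorted(
                ((cosine(q, k), k) for k in frame),
                key=lambda pair: pair[0],
                reverse=True,
            )[:top_k]
            # Weight the selected features by their similarity (assumption).
            weights = softmax([s for s, _ in scored])
            mixed = [
                sum(w * k[c] for w, (_, k) in zip(weights, scored))
                for c in range(len(q))
            ]
            per_frame.append(mixed)
        # Plain average over frames (assumption; a learned attention
        # could be used here instead).
        fused.append([sum(col) / len(per_frame) for col in zip(*per_frame)])
    return fused


roi = [[1.0, 0.0], [0.0, 1.0]]
support = [[[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]]
print(aggregate_from_support(roi, support, top_k=1))
# prints [[1.5, 0.0], [0.0, 2.0]]
```

With `top_k=1`, each RoI location is matched to its single most similar support location and averaged with it. Larger `top_k` values blend several candidate locations, which may help when the object has moved or changed appearance between frames.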

Code Repositories

open-mmlab/mmtracking (official, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
video-instance-segmentation-on-youtube-vis | Temporal ROI Align | mask AP: 38
video-object-detection-on-epic-kitchens-1 | Temporal ROI Align | mAP: 39.6
video-object-detection-on-epic-kitchens-seen | Temporal ROI Align | mAP: 42.2
video-object-detection-on-imagenet-vid | Temporal ROI Align (ResNeXt101) | mAP: 84.3
