NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition

Boyang Xia, Wenhao Wu, Haoran Wang, Rui Su, Dongliang He, Haosen Yang, Xiaoran Fan, Wanli Ouyang

Abstract

Achieving accurate video recognition at low computational cost remains challenging for artificial intelligence systems. Efficient video recognition methods based on adaptive inference typically preview a video and focus on its salient parts to reduce computation. Most existing works focus on learning complex networks with video-classification objectives. Because they treat all frames as positive samples, few of them pay attention in their supervision to the distinction between positive samples (salient frames) and negative samples (non-salient frames). To fill this gap, we propose a novel Non-saliency Suppression Network (NSNet), which effectively suppresses the responses of non-salient frames. Specifically, at the frame level, effective pseudo labels that distinguish salient from non-salient frames are generated to guide frame saliency learning. At the video level, a temporal attention module is learned under dual video-level supervision on both the salient and the non-salient representations. Saliency measurements from the two levels are combined to exploit complementary multi-granularity information. Extensive experiments on four well-known benchmarks verify that NSNet not only achieves a state-of-the-art accuracy-efficiency trade-off but also delivers significantly faster (2.4x to 4.3x) practical inference than state-of-the-art methods. Our project page is at https://lawrencexia2008.github.io/projects/nsnet .
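The abstract does not say how the frame-level and video-level saliency measurements are combined, so the following is only an illustrative sketch of the general idea: blend two per-frame saliency estimates and keep the highest-scoring frames. The function names, the convex-combination fusion rule, the `weight` parameter, and the top-k selection are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Sequence


def combine_saliency(frame_scores: Sequence[float],
                     video_scores: Sequence[float],
                     weight: float = 0.5) -> List[float]:
    """Blend two per-frame saliency estimates.

    Assumption: a simple convex combination; the paper's actual
    fusion rule is not given in the abstract.
    """
    if len(frame_scores) != len(video_scores):
        raise ValueError("score sequences must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * f + (1.0 - weight) * v
            for f, v in zip(frame_scores, video_scores)]


def select_salient_frames(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames in temporal order."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Rank frame indices by descending score, keep the top k,
    # then restore temporal order for the downstream recognizer.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical scores for a five-frame clip (made-up values).
frame_level = [0.9, 0.1, 0.4, 0.8, 0.2]
video_level = [0.7, 0.2, 0.6, 0.3, 0.1]

combined = combine_saliency(frame_level, video_level)
selected = select_salient_frames(combined, k=2)
print(selected)
```

In this toy setup only the selected frames would be passed to the expensive recognition backbone, which is where the computational saving of a sampler-style approach comes from.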

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| action-recognition-in-videos-on-activitynet | NSNet (w/ Swin-L) | mAP: 94.3 |
