DenseImage Network: Video Spatial-Temporal Evolution Encoding and Understanding

Xiaokai Chen; Ke Gao

Abstract

Many of the leading approaches to video understanding are data-hungry and time-consuming, and fail to capture the gist of spatial-temporal evolution efficiently. Recent research shows that CNNs can reason about static relations between entities in images. To further exploit this capacity for reasoning about dynamic evolution, we introduce a novel network module called DenseImage Network (DIN), with two main contributions. 1) A novel compact video representation that distills a video's significant spatial-temporal evolution into a matrix called DenseImage, primed for efficient video encoding. 2) A simple yet powerful learning strategy for video understanding, based on DenseImage and a temporal-order-preserving CNN, which contains a local temporal correlation constraint that captures temporal evolution at multiple time scales using different filter widths. Extensive experiments on two recent challenging benchmarks demonstrate that DenseImage Network can accurately capture the common spatial-temporal evolution between similar actions, even under enormous visual variation or different time scales. Moreover, we obtain state-of-the-art results in action and gesture recognition at much lower time and memory cost, indicating its immense potential for video representation and understanding.
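
The abstract does not spell out how a DenseImage is built or how the CNN is laid out, so the following is only a minimal NumPy sketch of the general idea under stated assumptions: per-frame feature vectors (assumed to be given, here random) are stacked in temporal order into a T x D matrix, and filters of several widths slide along the time axis only. The function names, the ReLU, and the max-over-time pooling are illustrative choices, not details taken from the paper.

```python
import numpy as np


def build_dense_image(frame_features):
    """Stack per-frame feature vectors into a T x D matrix, keeping temporal order.

    Assumption: each frame has already been reduced to a D-dimensional vector.
    """
    return np.stack(frame_features, axis=0)


def temporal_conv(dense_image, filters):
    """Slide filters along the time axis only (valid cross-correlation).

    dense_image: (T, D); filters: (K, w, D). Returns responses of shape (T - w + 1, K).
    """
    _, width, _ = filters.shape
    num_steps = dense_image.shape[0] - width + 1
    # Each window covers `width` consecutive frames and the full feature dimension.
    windows = np.stack([dense_image[i:i + width] for i in range(num_steps)])
    return np.einsum("twd,kwd->tk", windows, filters)


def multi_scale_descriptor(dense_image, filter_banks):
    """Apply one filter bank per width, then ReLU and max-pool over time."""
    pooled = []
    for filters in filter_banks:
        response = np.maximum(temporal_conv(dense_image, filters), 0.0)
        pooled.append(response.max(axis=0))
    return np.concatenate(pooled)


rng = np.random.default_rng(0)
frames = [rng.standard_normal(16) for _ in range(10)]  # 10 frames, 16-dim features
dense_image = build_dense_image(frames)
# Hypothetical filter widths 1, 2, 3 with 4 filters each, for three time scales.
banks = [rng.standard_normal((4, w, 16)) for w in (1, 2, 3)]
descriptor = multi_scale_descriptor(dense_image, banks)
print(dense_image.shape, descriptor.shape)  # (10, 16) (12,)
```

Because every filter spans the full feature dimension and moves only along time, each response summarizes a short window of consecutive frames; wider filters cover longer windows, which is one simple way to read "multiple time scales with different filter widths".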

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| action-recognition-in-videos-on-jester-1 | DIN | Val: 95.31 |
| action-recognition-in-videos-on-something-3 | DIN | Top-1 Accuracy: 34.11 |
