8 months ago

Convolutional Neural Network

Multi-Task Learning

Action Recognition

Method/Architecture

Computer Vision

Parmar Paritosh; Morris Brendan

Abstract

Spatiotemporal representations learned using 3D convolutional neural networks (CNN) are currently used in state-of-the-art approaches for action-related tasks. However, 3D-CNNs are notorious for being memory and compute resource intensive as compared with simpler 2D-CNN architectures. We propose to hallucinate spatiotemporal representations from a 3D-CNN teacher with a 2D-CNN student. By requiring the 2D-CNN to predict the future and intuit upcoming activity, it is encouraged to gain a deeper understanding of actions and how they evolve. The hallucination task is treated as an auxiliary task, which can be used with any other action-related task in a multitask learning setting. Thorough experimental evaluation shows that the hallucination task indeed helps improve performance on action recognition, action quality assessment, and dynamic scene recognition tasks. From a practical standpoint, being able to hallucinate spatiotemporal representations without an actual 3D-CNN can enable deployment in resource-constrained scenarios, such as with limited computing power and/or lower bandwidth. Codebase is available here: https://github.com/ParitoshParmar/HalluciNet.
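The multitask setup described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state which loss is used, so the mean squared error between student and teacher features, the weight `lam`, and all function names below are assumptions chosen only to show how an auxiliary hallucination term might be combined with a main task loss. Plain Python lists stand in for feature vectors.

```python
import math


def mse(student_feat, teacher_feat):
    """Mean squared error between two equal-length feature vectors.

    Assumption: MSE is used here as the hallucination loss; the abstract
    does not specify the distance between student and teacher features.
    """
    if len(student_feat) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared) / len(squared)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, using log-sum-exp for stability."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def multitask_loss(logits, label, student_feat, teacher_feat, lam=1.0):
    """Main task loss plus a weighted auxiliary hallucination loss.

    `lam` is a hypothetical weighting hyperparameter, not a value from the paper.
    """
    main = cross_entropy(logits, label)            # e.g. action recognition
    hallucination = mse(student_feat, teacher_feat)  # 2D student vs. 3D teacher
    return main + lam * hallucination


if __name__ == "__main__":
    # Toy values: 3 action classes and a 4-dimensional feature vector.
    logits = [2.0, 0.5, -1.0]
    student = [0.1, 0.4, 0.3, 0.9]   # features produced by the 2D-CNN student
    teacher = [0.0, 0.5, 0.2, 1.0]   # features produced by the 3D-CNN teacher
    print(multitask_loss(logits, 0, student, teacher, lam=0.5))
```

In a sketch like this, the teacher features would be needed only while training; at inference only the 2D student runs, which matches the resource-constrained deployment scenario the abstract mentions.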

HalluciNet-ing Spatiotemporal Representations Using a 2D-CNN