8 months ago

Abstract

Video transformers have become the de facto standard for human action recognition, yet their exclusive reliance on the RGB modality still limits their adoption in certain domains. One such domain is Activities of Daily Living (ADL), where RGB alone is not sufficient to distinguish between visually similar actions, or actions observed from multiple viewpoints. To facilitate the adoption of video transformers for ADL, we hypothesize that the augmentation of RGB with human pose information, known for its sensitivity to fine-grained motion and multiple viewpoints, is essential. Consequently, we introduce the first Pose Induced Video Transformer: PI-ViT (or $\pi$-ViT), a novel approach that augments the RGB representations learned by video transformers with 2D and 3D pose information. The key elements of $\pi$-ViT are two plug-in modules, 2D Skeleton Induction Module and 3D Skeleton Induction Module, that are responsible for inducing 2D and 3D pose information into the RGB representations. These modules operate by performing pose-aware auxiliary tasks, a design choice that allows $\pi$-ViT to discard the modules during inference. Notably, $\pi$-ViT achieves the state-of-the-art performance on three prominent ADL datasets, encompassing both real-world and large-scale RGB-D datasets, without requiring poses or additional computational overhead at inference.
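The idea of plug-in modules that are trained on auxiliary tasks and then discarded can be illustrated with a toy sketch. This is not the authors' implementation: the class `ToyPoseInducedModel`, its dimensions, the `aux_weight` parameter, and the choice of 2D joint regression as the auxiliary task are all hypothetical stand-ins, since the abstract does not specify the tasks. The sketch only shows the general pattern, in which a pose-aware loss shapes the shared RGB features during training while prediction never touches the auxiliary head or the poses.

```python
import math
import random


def linear(weights, x):
    """Apply a dense layer given as a list of weight rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def init(rows, cols, rng):
    """Small random weight matrix (rows x cols)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyPoseInducedModel:
    """Hypothetical stand-in: shared backbone, action head, training-only pose head."""

    def __init__(self, in_dim=8, hidden=4, num_classes=3, num_joints=2, seed=0):
        rng = random.Random(seed)
        self.backbone = init(hidden, in_dim, rng)            # stands in for the video transformer
        self.cls_head = init(num_classes, hidden, rng)       # action classifier
        self.pose_head = init(2 * num_joints, hidden, rng)   # auxiliary module: (x, y) per joint

    def features(self, rgb):
        return [math.tanh(h) for h in linear(self.backbone, rgb)]

    def training_loss(self, rgb, label, pose_2d, aux_weight=0.5):
        """Classification loss plus a weighted pose-aware auxiliary loss (assumed form)."""
        feats = self.features(rgb)
        logits = linear(self.cls_head, feats)
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        cls_loss = log_z - logits[label]                      # cross-entropy
        pred = linear(self.pose_head, feats)
        aux_loss = sum((p - t) ** 2 for p, t in zip(pred, pose_2d)) / len(pose_2d)
        return cls_loss + aux_weight * aux_loss

    def discard_auxiliary(self):
        """Drop the training-only module; prediction does not depend on it."""
        self.pose_head = None

    def predict(self, rgb):
        logits = linear(self.cls_head, self.features(rgb))
        return max(range(len(logits)), key=logits.__getitem__)


model = ToyPoseInducedModel()
rgb = [0.1 * i for i in range(8)]                             # made-up RGB feature vector
loss = model.training_loss(rgb, label=1, pose_2d=[0.2, 0.4, 0.6, 0.8])
before = model.predict(rgb)
model.discard_auxiliary()
after = model.predict(rgb)                                    # uses RGB only, no poses
```

No optimization step is shown. The point of the sketch is that `predict` reads only the backbone and the classification head, so removing the auxiliary head changes neither the prediction nor the inference cost.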

Dominick Reilly Srijan Das

Just Add $\pi$! Pose Induced Video Transformers for Understanding Activities of Daily Living
