HyperAIHyperAI

Command Palette

Search for a command to run...

5 months ago

OadTR: Online Action Detection with Transformers

Xiang Wang; Shiwei Zhang; Zhiwu Qing; Yuanjie Shao; Zhengrong Zuo; Changxin Gao; Nong Sang

OadTR: Online Action Detection with Transformers

Abstract

Most recent approaches for online action detection tend to apply Recurrent Neural Network (RNN) to capture long-range temporal structure. However, RNN suffers from non-parallelism and gradient vanishing, hence it is hard to be optimized. In this paper, we propose a new encoder-decoder framework based on Transformers, named OadTR, to tackle these problems. The encoder attached with a task token aims to capture the relationships and global interactions between historical observations. The decoder extracts auxiliary information by aggregating anticipated future clip representations. Therefore, OadTR can recognize current actions by encoding historical information and predicting future context simultaneously. We extensively evaluate the proposed OadTR on three challenging datasets: HDD, TVSeries, and THUMOS14. The experimental results show that OadTR achieves higher training and inference speeds than current RNN based approaches, and significantly outperforms the state-of-the-art methods in terms of both mAP and mcAP. Code is available at https://github.com/wangxiang1230/OadTR.

Code Repositories

wangxiang1230/OadTR
Official
pytorch
Mentioned in GitHub

Benchmarks

BenchmarkMethodologyMetrics
online-action-detection-on-thumos-14OadTR
mAP: 65.2
online-action-detection-on-tvseriesOadTR
mCAP: 87.2

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp
OadTR: Online Action Detection with Transformers | Papers | HyperAI