Long Short-Term Transformer for Online Action Detection

Mingze Xu, Yuanjun Xiong, Hao Chen, Xinyu Li, Wei Xia, Zhuowen Tu, Stefano Soatto

Abstract

We present Long Short-term TRansformer (LSTR), a temporal modeling algorithm for online action detection, which employs a long- and short-term memory mechanism to model prolonged sequence data. It consists of an LSTR encoder that dynamically leverages coarse-scale historical information from an extended temporal window (e.g., 2048 frames spanning up to 8 minutes), together with an LSTR decoder that focuses on a short time window (e.g., 32 frames spanning 8 seconds) to model the fine-scale characteristics of the data. Compared to prior work, LSTR provides an effective and efficient method to model long videos with fewer heuristics, which is validated by extensive empirical analysis. LSTR achieves state-of-the-art performance on three standard online action detection benchmarks: THUMOS'14, TVSeries, and HACS Segment. Code has been made available at: https://xumingze0308.github.io/projects/lstr
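The long/short-term split described in the abstract — an extended history compressed into a compact memory, queried by a short window of recent frames — can be illustrated with a minimal PyTorch sketch. Module names, dimensionalities, the number of latent tokens, and the single-stage memory compression below are assumptions made for illustration, not the released amazon-research/long-short-term-transformer implementation.

```python
import torch
import torch.nn as nn

class LongShortTermSketch(nn.Module):
    """Hypothetical sketch of a long- and short-term memory mechanism."""

    def __init__(self, dim=512, num_heads=8, num_latents=16, num_classes=22):
        super().__init__()
        # Learnable latent queries that compress the long-term memory
        # (e.g., up to 2048 frame features) into a small set of tokens.
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        self.long_compress = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Short-term branch: self-attention over the recent window (e.g., 32 frames),
        # then cross-attention into the compressed long-term memory.
        self.short_self = nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, long_mem, short_mem):
        # long_mem:  (B, L_long, dim)  coarse-scale history
        # short_mem: (B, L_short, dim) fine-scale recent window
        B = long_mem.size(0)
        queries = self.latents.unsqueeze(0).expand(B, -1, -1)
        compressed, _ = self.long_compress(queries, long_mem, long_mem)
        short = self.short_self(short_mem)
        fused, _ = self.cross_attn(short, compressed, compressed)
        # Per-frame action scores for the short-term window; online detection
        # typically reads out the score of the most recent frame.
        return self.classifier(fused)

# Example usage with the window sizes quoted in the abstract.
model = LongShortTermSketch()
long_mem = torch.randn(2, 2048, 512)   # ~8 minutes of frame features
short_mem = torch.randn(2, 32, 512)    # ~8 seconds of frame features
scores = model(long_mem, short_mem)
print(scores.shape)  # torch.Size([2, 32, 22])
```

A full model would stack several such attention layers and, in the online setting, maintain the long- and short-term memories as sliding queues that are updated as new frames arrive.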

Code Repositories

amazon-research/long-short-term-transformer (Official, PyTorch)

Benchmarks

Benchmark                               Methodology   Metric
Online Action Detection on THUMOS'14    LSTR          mAP: 69.5
Online Action Detection on TVSeries     LSTR          mcAP: 89.1
