6 months ago

Object Tracking

Computer Vision

Video Processing

Seokju Cho Jiahui Huang Jisu Nam Honggyu An Seungryong Kim Joon-Young Lee

Abstract

We introduce LocoTrack, a highly accurate and efficient model designed for the task of tracking any point (TAP) across video sequences. Previous approaches in this task often rely on local 2D correlation maps to establish correspondences from a point in the query image to a local region in the target image, which often struggle with homogeneous regions or repetitive features, leading to matching ambiguities. LocoTrack overcomes this challenge with a novel approach that utilizes all-pair correspondences across regions, i.e., local 4D correlation, to establish precise correspondences, with bidirectional correspondence and matching smoothness significantly enhancing robustness against ambiguities. We also incorporate a lightweight correlation encoder to enhance computational efficiency, and a compact Transformer architecture to integrate long-term temporal information. LocoTrack achieves unmatched accuracy on all TAP-Vid benchmarks and operates at a speed almost 6 times faster than the current state-of-the-art.
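
To make the distinction between local 2D and local 4D correlation concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes dense feature maps of shape (H, W, C), integer point positions, zero padding at the borders, and plain dot-product similarity; the function names `local_patch`, `local_2d_correlation`, and `local_4d_correlation` are hypothetical and chosen only for illustration.

```python
import numpy as np


def local_patch(feat, center, radius):
    """Crop a (2r+1, 2r+1, C) patch around an integer (row, col) center.

    Assumption: positions outside the feature map are zero-padded.
    """
    padded = np.pad(feat, ((radius, radius), (radius, radius), (0, 0)))
    r, c = center
    size = 2 * radius + 1
    return padded[r:r + size, c:c + size]


def local_2d_correlation(query_feat, target_feat, q, t, radius):
    """Point-to-region: one query feature vector against a target neighborhood."""
    q_vec = query_feat[q[0], q[1]]                 # (C,)
    t_patch = local_patch(target_feat, t, radius)  # (k, k, C)
    return t_patch @ q_vec                         # (k, k)


def local_4d_correlation(query_feat, target_feat, q, t, radius):
    """Region-to-region: every query-neighborhood position against every
    target-neighborhood position (all pairs)."""
    q_patch = local_patch(query_feat, q, radius)   # (k, k, C)
    t_patch = local_patch(target_feat, t, radius)  # (k, k, C)
    return np.einsum("ijc,klc->ijkl", q_patch, t_patch)  # (k, k, k, k)


# Illustrative inputs only: random features stand in for a real backbone.
rng = np.random.default_rng(0)
query_feat = rng.standard_normal((32, 32, 16))
target_feat = rng.standard_normal((32, 32, 16))
q, t, radius = (10, 12), (11, 14), 3

corr2d = local_2d_correlation(query_feat, target_feat, q, t, radius)
corr4d = local_4d_correlation(query_feat, target_feat, q, t, radius)
```

In this sketch the 2D map is exactly the central slice of the 4D volume (`corr4d[radius, radius]`), so the 4D form adds the correlations of the query point's neighbors on top of what the 2D form already provides. That extra context is what the abstract credits with reducing ambiguity in homogeneous or repetitive regions; how the model actually samples, encodes, and refines these volumes is not specified here.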

Local All-Pair Correspondence for Point Tracking