SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory

Cheng-Yen Yang, Hsiang-Wei Huang, Wenhao Chai, Zhongyu Jiang, Jenq-Neng Hwang

Abstract

The Segment Anything Model 2 (SAM 2) has demonstrated strong performance in object segmentation tasks but faces challenges in visual object tracking, particularly when managing crowded scenes with fast-moving or self-occluding objects. Furthermore, the fixed-window memory approach in the original model does not consider the quality of memories selected to condition the image features for the next frame, leading to error propagation in videos. This paper introduces SAMURAI, an enhanced adaptation of SAM 2 specifically designed for visual object tracking. By incorporating temporal motion cues with the proposed motion-aware memory selection mechanism, SAMURAI effectively predicts object motion and refines mask selection, achieving robust, accurate tracking without the need for retraining or fine-tuning. SAMURAI operates in real time and demonstrates strong zero-shot performance across diverse benchmark datasets, showcasing its ability to generalize without fine-tuning. In evaluations, SAMURAI achieves significant improvements in success rate and precision over existing trackers, with a 7.1% AUC gain on LaSOT_ext and a 3.5% AO gain on GOT-10k. Moreover, it achieves competitive results compared to fully supervised methods on LaSOT, underscoring its robustness in complex tracking scenarios and its potential for real-world applications in dynamic environments. Code and results are available at https://github.com/yangchris11/samurai.
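To illustrate the idea of refining mask selection with temporal motion cues, the sketch below combines each candidate mask's segmentation confidence with a motion score measuring agreement between the mask's bounding box and a box predicted by a constant-velocity model. All names, the IoU-based motion score, and the weighting `alpha` are illustrative assumptions, not the paper's exact formulation.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def predict_box(prev_box, velocity):
    """Constant-velocity prediction: shift the previous box by (vx, vy)."""
    vx, vy = velocity
    return (prev_box[0] + vx, prev_box[1] + vy,
            prev_box[2] + vx, prev_box[3] + vy)

def select_mask(candidates, prev_box, velocity, alpha=0.5):
    """Pick the candidate (box, confidence) maximizing
    alpha * motion_score + (1 - alpha) * mask_confidence."""
    pred = predict_box(prev_box, velocity)
    def score(c):
        box, conf = c
        return alpha * iou(pred, box) + (1 - alpha) * conf
    return max(candidates, key=score)

# Two candidate masks: one with slightly higher confidence but far from the
# predicted location (a distractor), one consistent with the object's motion.
candidates = [
    ((50, 50, 80, 80), 0.92),   # drifted distractor
    ((12, 10, 42, 40), 0.88),   # follows the motion model
]
best = select_mask(candidates, prev_box=(10, 8, 40, 38), velocity=(2, 2))
print(best)  # the motion-consistent candidate wins despite lower confidence
```

Picking the mask by confidence alone would select the distractor here; weighting in the motion term recovers the correct target, which is the intuition behind motion-aware selection.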

