TP-GMOT: Tracking Generic Multiple Object by Textual Prompt with Motion-Appearance Cost (MAC) SORT
Anh Duy Le Dinh ; Tran Kim Hoang ; Le Ngan Hoang

Abstract
While Multi-Object Tracking (MOT) has made substantial advancements, it relies heavily on prior knowledge and is limited to predefined categories. In contrast, Generic Multiple Object Tracking (GMOT), which tracks multiple objects with similar appearance, requires less prior information about the targets but faces challenges from variations in viewpoint, lighting, occlusion, and resolution. Our contributions commence with the introduction of the Refer-GMOT dataset, a collection of videos, each accompanied by fine-grained textual descriptions of their attributes. Subsequently, we introduce a novel text prompt-based open-vocabulary GMOT framework, called TP-GMOT, which can track never-seen object categories with zero training examples. Within the TP-GMOT framework, we introduce two novel components: (i) TP-OD, object detection by textual prompt, for accurately detecting unseen objects with specific characteristics; and (ii) Motion-Appearance Cost SORT (MAC-SORT), a novel object association approach that integrates motion-based and appearance-based matching strategies to tackle the complex task of tracking multiple generic objects with high similarity. Our contributions are benchmarked on the Refer-GMOT dataset for the GMOT task. Additionally, to assess the generalizability of the proposed TP-GMOT framework and the effectiveness of the MAC-SORT tracker, we conduct ablation studies on the DanceTrack and MOT20 datasets for the MOT task. Our dataset, code, and models will be publicly available at: https://fsoft-aic.github.io/TP-GMOT
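The abstract does not spell out how MAC-SORT combines its two cues, so the following is only a minimal illustrative sketch of the general idea of blending a motion cost with an appearance cost before matching tracks to detections. Everything in it is an assumption made for illustration, not the authors' method: the function names (`iou`, `cosine_similarity`, `combined_cost`, `associate`), the fixed weight `w_app`, the use of IoU as the motion cue, and cosine similarity as the appearance cue are all hypothetical. The matching is done by brute force over permutations to keep the example dependency-free; a real tracker would typically use the Hungarian algorithm instead.

```python
from itertools import permutations
from math import inf, sqrt


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = sqrt(sum(x * x for x in u))
    nv = sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def combined_cost(track, det, w_app=0.5):
    """Hypothetical blend: (1 - w_app) * motion cost + w_app * appearance cost."""
    motion_cost = 1.0 - iou(track["box"], det["box"])
    appearance_cost = 1.0 - cosine_similarity(track["feat"], det["feat"])
    return (1.0 - w_app) * motion_cost + w_app * appearance_cost


def associate(tracks, dets, w_app=0.5):
    """Return (track_index, detection_index) pairs with minimum total cost.

    Brute force over permutations, so only suitable for tiny examples.
    Assumes there are at least as many detections as tracks.
    """
    if len(tracks) > len(dets):
        raise ValueError("this sketch assumes len(tracks) <= len(dets)")
    best, best_cost = (), inf
    for perm in permutations(range(len(dets)), len(tracks)):
        cost = sum(combined_cost(tracks[i], dets[j], w_app)
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return list(enumerate(best))


# Two tracks and two detections listed in swapped order.
tracks = [
    {"box": (0, 0, 10, 10), "feat": (1.0, 0.0)},
    {"box": (20, 0, 30, 10), "feat": (0.0, 1.0)},
]
dets = [
    {"box": (21, 0, 31, 10), "feat": (0.0, 1.0)},
    {"box": (1, 0, 11, 10), "feat": (1.0, 0.0)},
]
matches = associate(tracks, dets)
print(matches)
```

In this toy setup each track is matched to the detection that both overlaps it and shares its feature direction. When objects look nearly identical, as in GMOT, the appearance term becomes less discriminative and the motion term carries more of the decision, which is one reason a fixed `w_app` is only a placeholder here.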
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| multiple-object-tracking-on-gmot-40 | MAC-SORT | HOTA: 58.58 IDF1: 71.7 MOTA: 67.77 |
| object-detection-on-gmot-40 | iGDINO MAC-SORT | mAP@0.5: 72.7 |