5 months ago

Middle Fusion and Multi-Stage, Multi-Form Prompts for Robust RGB-T Tracking

Qiming Wang; Yongqiang Bai; Hongxing Song

Abstract

RGB-T tracking, a vital downstream task of object tracking, has made remarkable progress in recent years. Yet it remains hindered by two major challenges: 1) the trade-off between performance and efficiency; 2) the scarcity of training data. To address the latter, some recent methods use prompts to fine-tune pre-trained RGB tracking models and leverage upstream knowledge in a parameter-efficient manner. However, these methods inadequately explore modality-independent patterns and disregard the dynamic reliability of different modalities in open scenarios. We propose M3PT, a novel RGB-T prompt tracking method that leverages middle fusion and multi-modal, multi-stage visual prompts to overcome these challenges. We pioneer the use of an adjustable middle fusion meta-framework for RGB-T tracking, which helps the tracker balance performance against efficiency to meet varying application demands. Furthermore, building on this meta-framework, we use multiple flexible prompt strategies to adapt the pre-trained model, enabling comprehensive exploration of uni-modal patterns and improved modeling of fusion-modal features in diverse modality-priority scenarios, and harnessing the potential of prompt learning in RGB-T tracking. Evaluated on six challenging benchmarks, our method surpasses previous state-of-the-art prompt fine-tuning methods while remaining highly competitive with strong full-parameter fine-tuning methods, with only 0.34M fine-tuned parameters.
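
The idea of an adjustable fusion point can be illustrated with a toy sketch. This is not the M3PT implementation: the function names, the element-wise averaging used as the fusion operator, and the use of layer evaluations as a cost proxy are all assumptions made for illustration. The sketch only shows the general pattern in which each modality passes through the early layers separately, the streams are merged at a chosen depth, and the remaining layers run once on the fused features.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


def fuse_mean(rgb: Vector, tir: Vector) -> Vector:
    """Hypothetical fusion operator: element-wise average of both modalities."""
    return [(a + b) / 2.0 for a, b in zip(rgb, tir)]


def middle_fusion_forward(
    rgb: Vector,
    tir: Vector,
    layers: Sequence[Layer],
    fusion_depth: int,
) -> Tuple[Vector, int]:
    """Apply `layers`, fusing the two streams after `fusion_depth` layers.

    Returns the output features and the number of layer evaluations,
    which serves here as a crude stand-in for compute cost.
    """
    if not 0 <= fusion_depth <= len(layers):
        raise ValueError("fusion_depth must lie in [0, len(layers)]")

    layer_calls = 0
    # Stage 1: each modality goes through the early layers on its own.
    for layer in layers[:fusion_depth]:
        rgb, tir = layer(rgb), layer(tir)
        layer_calls += 2

    # Fusion point: merge the two uni-modal streams into one.
    fused = fuse_mean(rgb, tir)

    # Stage 2: the remaining layers run once, on the fused features.
    for layer in layers[fusion_depth:]:
        fused = layer(fused)
        layer_calls += 1

    return fused, layer_calls


def scale_layer(x: Vector) -> Vector:
    """Toy layer (assumption): doubles every feature value."""
    return [2.0 * v for v in x]


toy_layers = [scale_layer] * 4
for depth in range(len(toy_layers) + 1):
    _, calls = middle_fusion_forward([1.0, 2.0], [3.0, 4.0], toy_layers, depth)
    print(f"fusion_depth={depth} layer_calls={calls}")
```

In this toy setting, fusing earlier means fewer layer evaluations, while fusing later leaves more layers for uni-modal processing. That is one way to read the performance/efficiency balance the abstract attributes to the adjustable meta-framework.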

Benchmarks

Benchmark                    Methodology    Metrics
rgb-t-tracking-on-lasher     M3PT           Precision: 67.3; Success: 54.2
rgb-t-tracking-on-rgbt210    M3PT           Precision: 83.9; Success: 60.8
rgb-t-tracking-on-rgbt234    M3PT           Precision: 85.9; Success: 63.4
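
This page does not define the two metrics. The sketch below shows how precision and success scores of this kind are commonly computed in tracking evaluation. The 20-pixel center-error threshold, the 21 evenly spaced IoU thresholds, and all function names are assumptions based on common convention; each benchmark's official toolkit may differ.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def center_error(a: Box, b: Box) -> float:
    """Euclidean distance between the centers of two boxes."""
    ax, ay = a[0] + a[2] / 2.0, a[1] + a[3] / 2.0
    bx, by = b[0] + b[2] / 2.0, b[1] + b[3] / 2.0
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def precision_rate(
    pred: Sequence[Box], gt: Sequence[Box], pixel_threshold: float = 20.0
) -> float:
    """Percentage of frames whose center error is within the threshold.

    The 20-pixel default is an assumed convention, not taken from the source.
    """
    hits = sum(center_error(p, g) <= pixel_threshold for p, g in zip(pred, gt))
    return 100.0 * hits / len(gt)


def success_rate(pred: Sequence[Box], gt: Sequence[Box], steps: int = 21) -> float:
    """Mean of the success curve over evenly spaced IoU thresholds in [0, 1].

    This approximates the area under the success plot; `steps` is assumed.
    """
    overlaps = [iou(p, g) for p, g in zip(pred, gt)]
    thresholds = [i / (steps - 1) for i in range(steps)]
    curve = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return 100.0 * sum(curve) / len(curve)


# Toy two-frame sequence: one perfect prediction, one complete miss.
ground_truth = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 10.0)]
predictions = [(0.0, 0.0, 10.0, 10.0), (100.0, 100.0, 10.0, 10.0)]
print(precision_rate(predictions, ground_truth))
print(success_rate(predictions, ground_truth))
```

Both functions return percentages, matching the scale of the scores listed above.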
