5 months ago

Cross Fusion RGB-T Tracking with Bi-directional Adapter

Zhirong Zeng; Xiaotao Liu; Meng Sun; Hongyu Wang; Jing Liu

Abstract

Many state-of-the-art RGB-T trackers have achieved remarkable results through modality fusion. However, these trackers often either overlook temporal information or fail to fully utilize it, resulting in a poor balance between multi-modal and temporal information. To address this issue, we propose a novel Cross Fusion RGB-T Tracking architecture (CFBT) that ensures the full participation of multiple modalities in tracking while dynamically fusing temporal information. The effectiveness of CFBT relies on three newly designed cross spatio-temporal information fusion modules: Cross Spatio-Temporal Augmentation Fusion (CSTAF), Cross Spatio-Temporal Complementarity Fusion (CSTCF), and Dual-Stream Spatio-Temporal Adapter (DSTA). CSTAF employs a cross-attention mechanism to comprehensively enhance the feature representation of the template. CSTCF utilizes complementary information between different branches to enhance target features and suppress background features. DSTA adopts the adapter concept to adaptively fuse complementary information from multiple branches within the transformer layer, using the RGB modality as a medium. Together, these fusion modules add less than 0.3% to the total model parameters, yet they enable an efficient balance between multi-modal and temporal information. Extensive experiments on three popular RGB-T tracking benchmarks demonstrate that our method achieves new state-of-the-art performance.
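The abstract names two generic building blocks: cross-attention (used by CSTAF to enhance template features) and the adapter concept (used by DSTA to pass information between branches). The sketch below illustrates those two ideas in plain Python and is not the authors' implementation. The function names, the residual connection, the ReLU bottleneck, and the absence of learned query/key/value projections are all assumptions made for illustration; the abstract does not specify these details.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention without learned projections.

    queries: list of n_q vectors of length d
    keys:    list of n_k vectors of length d
    values:  list of n_k vectors of length d_v
    Returns a list of n_q vectors of length d_v.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    d_v = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(d_v)])
    return out


def enhance_template(template_tokens, other_tokens):
    """Hypothetical helper: template tokens attend to tokens from another
    branch, and the result is added back as a residual (an assumption)."""
    attended = cross_attention(template_tokens, other_tokens, other_tokens)
    return [[t + a for t, a in zip(tok, att)]
            for tok, att in zip(template_tokens, attended)]


def bottleneck_adapter(token, w_down, w_up):
    """Hypothetical adapter: down-project (r x d), ReLU, up-project (d x r).

    The output would be added to the features of the receiving branch.
    With r much smaller than d, the adapter adds few parameters.
    """
    hidden = [max(0.0, sum(w * x for w, x in zip(row, token)))
              for row in w_down]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_up]


# Usage: one 2-d template token attends to two identical tokens.
fused = enhance_template([[1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]])

# Usage: a 2-d token passes through a rank-1 bottleneck.
delta = bottleneck_adapter([1.0, -2.0], [[1.0, 0.0]], [[2.0], [3.0]])
```

A real tracker would use learned projection matrices, multiple heads, and batched tensor operations; the list-based version here only makes the data flow explicit.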

Benchmarks

Benchmark | Methodology | Metrics
rgb-t-tracking-on-lasher | CFBT | Precision: 73.2, Success: 58.4
rgb-t-tracking-on-rgbt210 | CFBT | Precision: 87.7, Success: 63.0
rgb-t-tracking-on-rgbt234 | CFBT | Precision: 89.9, Success: 65.9
