Christoph Mayer; Martin Danelljan; Goutam Bhat; Matthieu Paul; Danda Pani Paudel; Fisher Yu; Luc Van Gool

Abstract
Optimization-based tracking methods have been widely successful by integrating a target model prediction module, which provides effective global reasoning by minimizing an objective function. While this inductive bias integrates valuable domain knowledge, it limits the expressivity of the tracking network. In this work, we therefore propose a tracker architecture employing a Transformer-based model prediction module. Transformers capture global relations with little inductive bias, allowing the module to learn to predict more powerful target models. We further extend the model predictor to estimate a second set of weights that are applied for accurate bounding box regression. The resulting tracker relies on both training and test frame information to predict all weights transductively. We train the proposed tracker end-to-end and validate its performance by conducting comprehensive experiments on multiple tracking datasets. Our tracker sets a new state of the art on three benchmarks, achieving an AUC of 68.5% on the challenging LaSOT dataset.
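To make the idea of predicting target model weights from both training and test frame features more concrete, the following is a toy sketch, not the architecture from the paper. It assumes a single attention step in place of a full Transformer, a hypothetical learned query token, and hand-picked two-dimensional features. The second set of weights for bounding box regression is omitted. All function names are illustrative.

```python
import math

def softmax(values):
    # Numerically stable softmax over a list of floats.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def predict_weights(query, tokens):
    # One scaled dot-product attention step: the query token pools
    # information from every token (training and test frame alike).
    scale = math.sqrt(len(query))
    attn = softmax([dot(query, t) / scale for t in tokens])
    dim = len(query)
    return [sum(a * t[d] for a, t in zip(attn, tokens)) for d in range(dim)]

def score_map(weights, test_feats):
    # Apply the predicted weights as a 1x1 filter on the test features.
    return [dot(weights, f) for f in test_feats]

# Hypothetical features: [1, 0] looks like the target, [0, 1] like background.
train_feats = [[1.0, 0.0], [0.0, 1.0]]
test_feats = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
query = [4.0, 0.0]  # assumed learned query that attends to target-like tokens

# Transductive prediction: the weights depend on training AND test tokens.
weights = predict_weights(query, train_feats + test_feats)
scores = score_map(weights, test_feats)
best_location = scores.index(max(scores))
```

In this toy setup the highest score falls on the test location whose feature resembles the target. A real predictor would learn the query and the feature encoder end-to-end rather than fixing them by hand.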
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| video-object-tracking-on-nv-vot211 | ToMP-50 | AUC: 39.25 Precision: 53.01 |
| visual-object-tracking-on-avist | ToMP | Success Rate: 52.5 |
| visual-object-tracking-on-lasot | ToMP | Precision: 67.1 |
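For readers unfamiliar with the overlap-based metrics in the table, the sketch below shows one common way a success rate and a success-plot AUC are computed from predicted and ground-truth boxes. It assumes boxes in `(x, y, width, height)` format, a strict `>` comparison, and 21 evenly spaced overlap thresholds in [0, 1]. Individual benchmark toolkits may differ in these details, so treat it as an illustration rather than the official evaluation code of any benchmark listed above.

```python
def iou(box_a, box_b):
    # Intersection over union of two (x, y, w, h) boxes.
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def success_rate(pred_boxes, gt_boxes, threshold):
    # Fraction of frames whose overlap exceeds the threshold.
    overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    return sum(o > threshold for o in overlaps) / len(overlaps)

def success_auc(pred_boxes, gt_boxes, steps=21):
    # Area under the success plot, approximated as the mean success
    # rate over evenly spaced thresholds (assumed convention).
    thresholds = [i / (steps - 1) for i in range(steps)]
    rates = [success_rate(pred_boxes, gt_boxes, t) for t in thresholds]
    return sum(rates) / steps
```

Precision is typically defined separately, from the distance between predicted and ground-truth box centers, and is not covered by this sketch.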