
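Abstract
We present Siam R-CNN, a Siamese re-detection architecture that exploits the full potential of two-stage object detection approaches for visual object tracking. We combine it with a novel tracklet-based dynamic programming algorithm, which uses re-detections of both the first-frame template and the previous-frame predictions to model the full history of the tracked object and of potential distractor objects. This enables our approach to make better tracking decisions and to re-detect the target after long occlusions. In addition, we propose a novel hard example mining strategy that significantly improves Siam R-CNN's robustness to similar-looking objects. Siam R-CNN achieves the current best performance on ten tracking benchmarks, with especially strong results for long-term tracking. Code and models are publicly available at www.vision.rwth-aachen.de/page/siamrcnn.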
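
The idea of choosing a chain of tracklets by dynamic programming can be illustrated with a minimal sketch. It is not the authors' implementation: the `Tracklet` fields, the `location_penalty` term and its `weight`, and the function `best_tracklet_chain` are hypothetical names and scoring choices assumed here for illustration. Each tracklet carries a summed template-similarity score, and a chain of temporally non-overlapping tracklets is scored by those sums plus a penalty for spatial jumps between consecutive tracklets.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Tracklet:
    start: int      # first frame covered
    end: int        # last frame covered (inclusive)
    score: float    # assumed: summed similarity to the first-frame template
    first_box: Box  # box in frame `start`
    last_box: Box   # box in frame `end`


def location_penalty(a: Tracklet, b: Tracklet, weight: float = 0.01) -> float:
    """Assumed consistency term: negative L1 distance between the end of a and the start of b."""
    return -weight * sum(abs(p - q) for p, q in zip(a.last_box, b.first_box))


def best_tracklet_chain(tracklets: List[Tracklet], root: int = 0) -> List[int]:
    """Indices of the best-scoring chain of non-overlapping tracklets that starts at `root`."""
    order = sorted(range(len(tracklets)), key=lambda i: tracklets[i].start)
    best: Dict[int, float] = {root: tracklets[root].score}
    prev: Dict[int, Optional[int]] = {root: None}
    for j in order:
        if j == root:
            continue
        tj = tracklets[j]
        # Only tracklets that end before tj starts may precede it in a chain.
        candidates = [
            (best[i] + location_penalty(tracklets[i], tj) + tj.score, i)
            for i in best
            if tracklets[i].end < tj.start
        ]
        if candidates:
            best[j], prev[j] = max(candidates)
    # Backtrack from the chain end with the highest accumulated score.
    node: Optional[int] = max(best, key=best.get)
    chain: List[int] = []
    while node is not None:
        chain.append(node)
        node = prev[node]
    return chain[::-1]


tracklets = [
    Tracklet(0, 10, 5.0, (0, 0, 10, 10), (10, 10, 20, 20)),            # contains the first-frame box
    Tracklet(11, 20, 3.0, (500, 500, 510, 510), (500, 500, 510, 510)),  # far-away distractor
    Tracklet(15, 30, 2.5, (12, 12, 22, 22), (13, 13, 23, 23)),          # target reappears nearby
    Tracklet(31, 40, 1.0, (14, 14, 24, 24), (15, 15, 25, 25)),
]
chain = best_tracklet_chain(tracklets)
print(chain)  # [0, 2, 3]
```

In this toy input the distractor tracklet has a higher similarity score than the tracklet that follows the target, but the spatial-jump penalty keeps it out of the selected chain, and the gap between frames 10 and 15 stands in for an occlusion.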
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| object-tracking-on-coesot | Siam R-CNN | Precision Rate: 67.5; Success Rate: 60.9 |
| semi-supervised-video-object-segmentation-on-1 | Siam R-CNN | F-measure (Decay): 20.2; F-measure (Mean): 58.6; F-measure (Recall): 62.3; J&F: 53.3; Jaccard (Decay): 21.8; Jaccard (Mean): 48.0; Jaccard (Recall): 53.9 |
| visual-object-tracking-on-davis-2016 | Siam R-CNN | F-measure (Decay): 4.0; F-measure (Mean): 80.4; F-measure (Recall): 87.6; J&F: 78.6; Jaccard (Decay): 2.2; Jaccard (Mean): 76.8; Jaccard (Recall): 86.4 |
| visual-object-tracking-on-davis-2017 | Siam R-CNN | F-measure (Decay): 16.2; F-measure (Mean): 75.0; F-measure (Recall): 82.8; J&F: 70.55; Jaccard (Decay): 15.8; Jaccard (Mean): 66.1; Jaccard (Recall): 74.8 |
| visual-object-tracking-on-got-10k | Siam R-CNN | Average Overlap: 64.9; Success Rate 0.5: 72.8 |
| visual-object-tracking-on-lasot | Siam R-CNN | AUC: 64.8; Normalized Precision: 72.2 |
| visual-object-tracking-on-trackingnet | Siam R-CNN | Accuracy: 81.2; Normalized Precision: 85.4; Precision: 80.0 |
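
For readers unfamiliar with overlap-based metrics such as Average Overlap and Success Rate 0.5 in the table, the following sketch shows how they can be computed for a single sequence from per-frame bounding boxes. The function names and the toy boxes are assumptions for illustration; official benchmark toolkits may aggregate differently (for example, across sequences or object classes).

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_overlap_and_success(
    preds: List[Box], gts: List[Box], threshold: float = 0.5
) -> Tuple[float, float]:
    """Mean IoU over frames, and the fraction of frames whose IoU exceeds `threshold`."""
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    ao = sum(overlaps) / len(overlaps)
    sr = sum(o > threshold for o in overlaps) / len(overlaps)
    return ao, sr


# Toy sequence: a perfect frame, a half-shifted frame, and a complete miss.
gts = [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 10, 10)]
preds = [(0, 0, 10, 10), (5, 0, 15, 10), (20, 20, 30, 30)]
ao, sr = average_overlap_and_success(preds, gts)
print(f"AO={ao:.3f}, SR0.5={sr:.3f}")
```

The table reports these quantities as percentages, so a mean IoU of 0.649 would appear as 64.9.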