
LMOT: Efficient Light-Weight Detection and Tracking in Crowds

AbdElMoniem Bayoumi, Hoda Baraka, Rana Mostafa

Abstract

Multi-object tracking is a vital component in various robotics and computer vision applications. However, existing multi-object tracking techniques sacrifice runtime for tracking accuracy, which makes such pipelines difficult to deploy in real-time applications. This paper introduces a novel real-time model, LMOT (Light-weight Multi-Object Tracker), that performs joint pedestrian detection and tracking. LMOT uses a simplified DLA-34 encoder network to extract detection features from the current image at low computational cost. Furthermore, we generate tracking features efficiently by applying a linear transformer to the prior image frame and its corresponding detection heatmap. LMOT then fuses the detection and tracking feature maps in a multi-layer scheme and performs a two-stage online data association relying on the Kalman filter to generate tracklets. We evaluated our model on the challenging real-world MOT16/17/20 datasets, showing that LMOT significantly outperforms state-of-the-art trackers in runtime while maintaining high robustness. LMOT is approximately ten times faster than state-of-the-art trackers while trailing them by only 3.8% in accuracy on average, resulting in a computationally much lighter model.
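The abstract does not spell out how the two-stage data association works, so the following is only a minimal sketch of one common way such a scheme can be arranged, not LMOT's actual procedure. It assumes that stage one matches predicted track boxes against high-confidence detections and that stage two gives the leftover tracks a second chance against low-confidence detections. The names `Track`, `greedy_match` and `associate`, the thresholds `high` and `min_iou`, the constant-velocity prediction standing in for a full Kalman filter, and the greedy IoU matching are all illustrative assumptions.

```python
from dataclasses import dataclass

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Track:
    """Hypothetical track record; not taken from the paper."""
    track_id: int
    box: Box
    velocity: tuple[float, float] = (0.0, 0.0)  # (dx, dy) in pixels per frame

    def predict(self) -> Box:
        # Constant-velocity shift: a simplified stand-in for a Kalman predict step.
        dx, dy = self.velocity
        x1, y1, x2, y2 = self.box
        return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, boxes, min_iou):
    """Pair tracks with boxes in descending IoU order (assumed matcher)."""
    pairs = sorted(
        ((iou(t.predict(), b), ti, bi)
         for ti, t in enumerate(tracks)
         for bi, b in enumerate(boxes)),
        reverse=True,
    )
    used_tracks, used_boxes, matches = set(), set(), []
    for score, ti, bi in pairs:
        if score < min_iou:
            break  # every remaining pair overlaps even less
        if ti in used_tracks or bi in used_boxes:
            continue
        used_tracks.add(ti)
        used_boxes.add(bi)
        matches.append((ti, bi))
    return matches


def associate(tracks, detections, high=0.5, min_iou=0.3):
    """detections is a list of (box, score) pairs.

    Returns ({track_id: matched_box}, [unmatched_track_ids]).
    """
    strong = [box for box, score in detections if score >= high]
    weak = [box for box, score in detections if score < high]
    assigned = {}

    # Stage 1: match all tracks against high-confidence detections.
    stage1 = greedy_match(tracks, strong, min_iou)
    for ti, bi in stage1:
        assigned[tracks[ti].track_id] = strong[bi]

    # Stage 2: retry the leftover tracks against low-confidence detections.
    matched = {ti for ti, _ in stage1}
    leftover = [t for i, t in enumerate(tracks) if i not in matched]
    for ti, bi in greedy_match(leftover, weak, min_iou):
        assigned[leftover[ti].track_id] = weak[bi]

    unmatched = [t.track_id for t in tracks if t.track_id not in assigned]
    return assigned, unmatched
```

In this sketch a track that fails both stages is simply reported as unmatched; a real tracker would typically keep it alive for a few frames before dropping it, and would update the motion model with each matched box.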

Benchmarks

Benchmark	Methodology	Metrics
multi-object-tracking-on-mot16	LMOT	IDF1: 72.3 IDs: 669 MOTA: 73.2
multi-object-tracking-on-mot17	LMOT	IDF1: 70.3 MOTA: 72.0
multi-object-tracking-on-mot20-1	LMOT	IDF1: 61.1 MOTA: 59.1
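
For reference, the metrics in the table are presumably the standard MOTChallenge ones, reported as percentages, with "IDs" being the raw count of identity switches. This is an assumption, since the page does not define them. Under the usual definitions, with per-frame false negatives $\mathrm{FN}_t$, false positives $\mathrm{FP}_t$, identity switches $\mathrm{IDSW}_t$ and ground-truth objects $\mathrm{GT}_t$, and with identity-level true positives, false positives and false negatives $\mathrm{IDTP}$, $\mathrm{IDFP}$, $\mathrm{IDFN}$:

```latex
\mathrm{MOTA} = 1 - \frac{\sum_t \left( \mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t \right)}{\sum_t \mathrm{GT}_t},
\qquad
\mathrm{IDF1} = \frac{2\,\mathrm{IDTP}}{2\,\mathrm{IDTP} + \mathrm{IDFP} + \mathrm{IDFN}}
```

MOTA mainly reflects detection quality, while IDF1 reflects how consistently identities are preserved along each trajectory.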
