MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer

Huang Kuan-Chih; Wu Tsung-Han; Su Hung-Ting; Hsu Winston H.

Abstract

Monocular 3D object detection is an important yet challenging task in autonomous driving. Some existing methods leverage depth information from an off-the-shelf depth estimator to assist 3D detection, but suffer from the additional computational burden and achieve limited performance caused by inaccurate depth priors. To alleviate this, we propose MonoDTR, a novel end-to-end depth-aware transformer network for monocular 3D object detection. It mainly consists of two components: (1) the Depth-Aware Feature Enhancement (DFE) module that implicitly learns depth-aware features with auxiliary supervision without requiring extra computation, and (2) the Depth-Aware Transformer (DTR) module that globally integrates context- and depth-aware features. Moreover, different from conventional pixel-wise positional encodings, we introduce a novel depth positional encoding (DPE) to inject depth positional hints into transformers. Our proposed depth-aware modules can be easily plugged into existing image-only monocular 3D object detectors to improve the performance. Extensive experiments on the KITTI dataset demonstrate that our approach outperforms previous state-of-the-art monocular-based methods and achieves real-time detection. Code is available at https://github.com/kuanchihhuang/MonoDTR
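
The abstract does not say how the depth positional encoding (DPE) is computed, so the following is only a rough illustration of the general idea of encoding position by depth rather than by pixel location. Everything specific here is an assumption made for the sketch and is not taken from the paper or its repository: the uniform binning, the depth range of 2 to 60 metres, the eight bins, the fixed sinusoidal code (used as a stand-in for what could equally be a learned embedding table), and all function names.

```python
import math


def depth_to_bin(depth, d_min=2.0, d_max=60.0, num_bins=8):
    """Map a metric depth to a bin index using uniform bins (assumed scheme)."""
    clamped = min(max(depth, d_min), d_max)
    width = (d_max - d_min) / num_bins
    # The far edge d_max would land on index num_bins, so cap it.
    return min(int((clamped - d_min) / width), num_bins - 1)


def sinusoidal_embedding(index, dim):
    """Fixed sinusoidal code for an integer index (stand-in for a learned table)."""
    return [
        math.sin(index / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(index / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]


def add_depth_positional_encoding(features, depths, **bin_kwargs):
    """Add a depth-bin code to each per-pixel feature vector.

    features: list of feature vectors, one per pixel (hypothetical layout).
    depths:   list of depth values in metres, one per pixel.
    """
    encoded = []
    for feat, depth in zip(features, depths):
        code = sinusoidal_embedding(depth_to_bin(depth, **bin_kwargs), len(feat))
        encoded.append([f + c for f, c in zip(feat, code)])
    return encoded


if __name__ == "__main__":
    feats = [[0.0] * 4, [0.0] * 4, [0.0] * 4]
    # The first two pixels fall in the same depth bin and receive the same code.
    print(add_depth_positional_encoding(feats, [10.0, 12.0, 45.0]))
```

In this sketch, two pixels at similar depths receive the same positional code no matter where they sit in the image, which is the property that distinguishes a depth-based encoding from a pixel-wise one.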

Code Repositories

kuanchihhuang/monodtr (official; PyTorch; mentioned in GitHub)

Benchmarks

Benchmark: 3d-object-detection-from-monocular-images-on-7
Methodology: MonoDTR
Metrics: AP25: 39.76, AP50: 3.02
