LATR: 3D Lane Detection from Monocular Images with Transformer
Luo Yueru, Zheng Chaoda, Yan Xu, Kun Tang, Zheng Chao, Cui Shuguang, Li Zhen

Abstract
3D lane detection from monocular images is a fundamental yet challenging task in autonomous driving. Recent advances primarily rely on structural 3D surrogates (e.g., bird's eye view) built from front-view image features and camera parameters. However, the depth ambiguity in monocular images inevitably causes misalignment between the constructed surrogate feature map and the original image, posing a great challenge for accurate lane detection. To address this issue, we present LATR, a novel end-to-end 3D lane detector that uses 3D-aware front-view features without a transformed view representation. Specifically, LATR detects 3D lanes via cross-attention based on query and key-value pairs, constructed using our lane-aware query generator and dynamic 3D ground positional embedding. On the one hand, each query is generated from 2D lane-aware features and adopts a hybrid embedding to enhance lane information. On the other hand, 3D spatial information is injected as a positional embedding derived from an iteratively updated 3D ground plane. LATR outperforms previous state-of-the-art methods on the synthetic Apollo benchmark and the real-world OpenLane and ONCE-3DLanes benchmarks by large margins (e.g., an 11.4 gain in F1 score on OpenLane). Code will be released at https://github.com/JMoonr/LATR.
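The abstract's core mechanism — lane-aware queries cross-attending to front-view features whose keys carry a 3D ground positional embedding — can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the module names, dimensions, and the way ground-plane points are supplied are all hypothetical, and the actual model lives in the linked repository.

```python
import torch
import torch.nn as nn


class Ground3DPositionalEmbedding(nn.Module):
    """Hypothetical sketch: embed image pixels lifted onto an estimated
    3D ground plane, so attention keys become 3D-aware."""

    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, ground_points):
        # ground_points: (B, H*W, 3) 3D coordinates of front-view pixels
        # projected onto the current ground-plane estimate (assumed given).
        return self.mlp(ground_points)


class LaneCrossAttention(nn.Module):
    """Hypothetical sketch of one decoder step: lane queries attend to
    front-view features, with 3D position injected into the keys."""

    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.pos_embed = Ground3DPositionalEmbedding(dim)

    def forward(self, lane_queries, img_feats, ground_points):
        # lane_queries:  (B, N_lanes, dim) built from 2D lane-aware features
        # img_feats:     (B, H*W, dim) front-view features (keys/values)
        # ground_points: (B, H*W, 3) pixels lifted onto the ground plane
        pos = self.pos_embed(ground_points)
        out, _ = self.attn(lane_queries, img_feats + pos, img_feats)
        return out  # updated queries; the plane estimate would be refined next


# Usage with dummy shapes:
layer = LaneCrossAttention(dim=256)
q = torch.randn(2, 40, 256)        # 40 lane queries
f = torch.randn(2, 64 * 32, 256)   # flattened front-view feature map
g = torch.randn(2, 64 * 32, 3)     # pixels lifted to the ground plane
updated = layer(q, f, g)           # (2, 40, 256)
```

Adding the positional term only to the keys while leaving the values untouched mirrors common DETR-style decoder practice; whether LATR does exactly this is not specified in the abstract.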
Code Repositories
https://github.com/JMoonr/LATR
Benchmarks
| Benchmark | Methodology | F1 (all) | Up & Down | Curve | Extreme Weather | Night | Intersection | Merge & Split |
|---|---|---|---|---|---|---|---|---|
| 3D Lane Detection on OpenLane | LATR | 61.9 | 55.2 | 68.2 | 57.1 | 55.4 | 52.3 | 61.5 |