Chen Yilun ; Liu Shu ; Shen Xiaoyong ; Jia Jiaya

Abstract
Most state-of-the-art 3D object detectors rely heavily on LiDAR sensors because there is a large performance gap between image-based and LiDAR-based methods. This gap stems from how the representation used for prediction in 3D scenarios is formed. Our method, called Deep Stereo Geometry Network (DSGN), significantly reduces this gap by detecting 3D objects on a differentiable volumetric representation, the 3D geometric volume, which effectively encodes 3D geometric structure for 3D regular space. With this representation, we learn depth information and semantic cues simultaneously. For the first time, we provide a simple and effective one-stage stereo-based 3D detection pipeline that jointly estimates depth and detects 3D objects in an end-to-end learning manner. Our approach outperforms previous stereo-based 3D detectors (about 10 points higher in terms of AP) and even achieves performance comparable to several LiDAR-based methods on the KITTI 3D object detection leaderboard. Our code is publicly available at https://github.com/chenyilun95/DSGN.
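As background for the stereo depth estimation mentioned in the abstract, the sketch below shows the standard pinhole relation between disparity and depth for a rectified stereo pair, depth = focal length × baseline / disparity. It is not taken from the DSGN code: the function names and the numeric focal length and baseline are hypothetical, chosen only for illustration.

```python
def disparity_to_depth(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Convert a disparity in pixels to a depth in meters (rectified stereo pair)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_to_disparity(depth_m: float, focal_px: float, baseline_m: float) -> float:
    """Inverse mapping: depth in meters to disparity in pixels."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


# Illustrative (assumed) camera parameters, not values from the paper.
FOCAL_PX = 700.0
BASELINE_M = 0.5

depth = disparity_to_depth(50.0, FOCAL_PX, BASELINE_M)
disparity = depth_to_disparity(depth, FOCAL_PX, BASELINE_M)
```

Because depth is inversely proportional to disparity, a fixed disparity error translates into a larger depth error for distant objects, which is one common explanation for why image-based depth is harder to use for 3D detection than LiDAR ranges.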
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-object-detection-from-stereo-images-on-1 | DSGN | AP75: 52.18 |
| 3d-object-detection-from-stereo-images-on-2 | DSGN | AP50: 15.55 |
| 3d-object-detection-from-stereo-images-on-3 | DSGN | AP50: 18.17 |
| vehicle-pose-estimation-on-kitti-cars-hard | DSGN (Stereo) | Average Orientation Similarity: 78.27 |