Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Zhichao Li, Le Zhang, Chunhua Shen, Ming-Ming Cheng, Ian Reid

Abstract
We propose a monocular depth estimator, SC-Depth, which requires only unlabelled videos for training and enables scale-consistent prediction at inference time. Our contributions are threefold: (i) we propose a geometry consistency loss that penalizes inconsistency between the depths predicted for adjacent views; (ii) we propose a self-discovered mask that automatically localizes moving objects, which violate the underlying static-scene assumption and produce noisy training signals; (iii) we demonstrate the efficacy of each component with a detailed ablation study and show high-quality depth estimation results on both the KITTI and NYUv2 datasets. Moreover, thanks to the scale-consistent prediction, our monocular-trained networks are readily integrated into the ORB-SLAM2 system for more robust and accurate tracking. The proposed hybrid Pseudo-RGBD SLAM shows compelling results on KITTI and generalizes well to the KAIST dataset without additional training. Finally, we provide several demos for qualitative evaluation.
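To make the two key ingredients of the abstract concrete, below is a minimal sketch of how a geometry consistency loss and a self-discovered mask could be computed from the depths of two adjacent views. The function name, the PyTorch framing, and the exact normalisation are assumptions for illustration, not the authors' released implementation.

```python
import torch

def geometry_consistency(d_a_warped, d_b_interp, valid_mask):
    """Sketch of a geometry consistency term between adjacent views.

    d_a_warped:  depth of view A warped (and rescaled) into view B's frame
    d_b_interp:  depth predicted for view B, sampled at the warped pixel locations
    valid_mask:  1 where the warp lands inside the image, 0 elsewhere

    All tensors are assumed to share shape [B, 1, H, W].
    """
    # Normalised depth inconsistency in [0, 1): large where the two views
    # disagree, e.g. on moving objects or occluded regions.
    diff = (d_a_warped - d_b_interp).abs() / (d_a_warped + d_b_interp)

    # Geometry consistency loss: mean inconsistency over valid pixels.
    loss_gc = (diff * valid_mask).sum() / valid_mask.sum().clamp(min=1)

    # Self-discovered mask: low weight where depths are inconsistent,
    # so likely-dynamic or occluded pixels contribute less to training.
    weight_mask = (1.0 - diff) * valid_mask
    return loss_gc, weight_mask
```

In such a setup, `weight_mask` would typically multiply the per-pixel photometric loss between the warped source image and the target image, so that inconsistent regions are down-weighted rather than hard-thresholded.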
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| monocular-depth-estimation-on-kitti-eigen | SC-Depth (ResNet18) | Delta < 1.25: 0.863, Delta < 1.25^2: 0.957, Delta < 1.25^3: 0.981, RMSE: 4.950, RMSE log: 0.197, AbsRel: 0.119 |
| monocular-depth-estimation-on-kitti-eigen | SC-Depth (ResNet50) | Delta < 1.25: 0.873, Delta < 1.25^2: 0.960, Delta < 1.25^3: 0.982, RMSE: 4.706, RMSE log: 0.191, AbsRel: 0.114 |
| monocular-depth-estimation-on-nyu-depth-v2-4 | Bian et al. | AbsRel: 0.157, RMSE: 0.593, delta_1: 78.0, delta_2: 94.0, delta_3: 98.4 |