Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Lihe Yang Bingyi Kang Zilong Huang Xiaogang Xu Jiashi Feng Hengshuang Zhao

Abstract

This work presents Depth Anything, a highly practical solution for robust monocular depth estimation. Rather than pursuing novel technical modules, we aim to build a simple yet powerful foundation model that handles any image under any circumstances. To this end, we scale up the dataset by designing a data engine that collects and automatically annotates large-scale unlabeled data (~62M images), which significantly enlarges data coverage and thereby reduces generalization error. We investigate two simple yet effective strategies that make this data scaling-up promising. First, a more challenging optimization target is created by leveraging data augmentation tools; it compels the model to actively seek extra visual knowledge and acquire robust representations. Second, an auxiliary supervision is developed that compels the model to inherit rich semantic priors from pre-trained encoders. We evaluate its zero-shot capabilities extensively on six public datasets and randomly captured photos, where it demonstrates impressive generalization ability. Further, by fine-tuning it with metric depth information from NYUv2 and KITTI, we set new states of the art. Our better depth model also yields a better depth-conditioned ControlNet. Our models are released at https://github.com/LiheYoung/Depth-Anything.
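The auxiliary semantic supervision mentioned in the abstract can be illustrated with a short sketch: the student's per-pixel features are pulled toward those of a frozen pre-trained encoder via cosine similarity, and pixels that are already well aligned are excluded from the loss so the depth branch stays free to diverge where semantics and depth disagree. This is a minimal sketch only — the array shapes, the `margin` value, and the function name are illustrative assumptions, not the paper's exact training configuration.

```python
import numpy as np

def feature_alignment_loss(student_feat, frozen_feat, margin=0.85):
    """Auxiliary semantic loss: align per-pixel student features with a
    frozen pre-trained encoder's features via cosine similarity.

    Pixels whose similarity already exceeds `margin` are ignored, so the
    model is not forced into perfect agreement everywhere (e.g. parts of
    one object can share semantics while having different depths).

    student_feat, frozen_feat: (H, W, C) feature maps.
    """
    # Normalize each per-pixel feature vector to unit length.
    s = student_feat / (np.linalg.norm(student_feat, axis=-1, keepdims=True) + 1e-8)
    f = frozen_feat / (np.linalg.norm(frozen_feat, axis=-1, keepdims=True) + 1e-8)
    cos = np.sum(s * f, axis=-1)        # (H, W) cosine similarity per pixel
    mask = cos < margin                 # penalize only poorly aligned pixels
    if not mask.any():
        return 0.0
    return float(np.mean(1.0 - cos[mask]))  # in [0, 2]

# Toy usage: identical features are fully masked out; opposite features
# give the maximum penalty.
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 4, 8))
print(feature_alignment_loss(feat, feat))   # 0.0
print(feature_alignment_loss(feat, -feat))  # ~2.0
```

The tolerance margin is the key design choice: without it, forcing exact feature agreement would drag depth predictions toward purely semantic groupings.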

Code Repositories

- duan-song/SATNet (PyTorch)
- fabio-sim/Depth-Anything-ONNX (PyTorch)
- JTRNEO/SynRS3D (PyTorch)
- greatenanoymous/monodpt_grasp (PyTorch)
- LiheYoung/Depth-Anything (official; PyTorch)

Benchmarks

Monocular Depth Estimation on ETH3D (Depth Anything)
- δ < 1.25: 0.882
- Absolute relative error: 0.0127

Monocular Depth Estimation on KITTI (Eigen split) (Depth Anything)
- δ < 1.25: 0.982
- δ < 1.25²: 0.998
- δ < 1.25³: 1.000
- RMSE: 1.896
- RMSE log: 0.069
- Sq Rel: 0.121
- Absolute relative error: 0.046

Monocular Depth Estimation on NYU Depth V2 (Depth Anything)
- δ < 1.25: 0.984
- δ < 1.25²: 0.998
- δ < 1.25³: 1.000
- RMSE: 0.206
- Absolute relative error: 0.056
- log10: 0.024

Semantic Segmentation on Cityscapes (Depth Anything)
- Mean IoU (class): 84.8%

Semantic Segmentation on Cityscapes val (Depth Anything)
- mIoU: 86.2
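The depth metrics reported above follow the standard monocular depth evaluation protocol: the threshold accuracies δ < 1.25ⁿ measure the fraction of pixels whose prediction/ground-truth ratio stays within a tolerance, while AbsRel and RMSE measure average error. A minimal reference implementation might look like the sketch below; the function name and the toy values are illustrative, not drawn from the benchmarks.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard monocular depth evaluation metrics.

    pred, gt: positive depth arrays of the same shape (e.g. metres).
    Returns the threshold accuracies, AbsRel, and RMSE as in the
    benchmark tables above.
    """
    ratio = np.maximum(pred / gt, gt / pred)  # per-pixel worst-case ratio
    return {
        "delta_1": float(np.mean(ratio < 1.25)),       # δ < 1.25
        "delta_2": float(np.mean(ratio < 1.25 ** 2)),  # δ < 1.25²
        "delta_3": float(np.mean(ratio < 1.25 ** 3)),  # δ < 1.25³
        "abs_rel": float(np.mean(np.abs(pred - gt) / gt)),
        "rmse": float(np.sqrt(np.mean((pred - gt) ** 2))),
    }

# Toy example: a prediction that is uniformly 10% too deep.
gt = np.full((4, 4), 2.0)
pred = gt * 1.1
m = depth_metrics(pred, gt)
print(m)  # delta_1 = 1.0 (ratio 1.1 < 1.25), abs_rel ≈ 0.1, rmse ≈ 0.2
```

In practice these metrics are computed only over valid ground-truth pixels, after any scale alignment the benchmark protocol requires.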
