3 months ago

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Lihe Yang Bingyi Kang Zilong Huang Xiaogang Xu Jiashi Feng Hengshuang Zhao

Abstract

This work presents Depth Anything, a highly practical solution for robust monocular depth estimation. Without pursuing novel technical modules, we aim to build a simple yet powerful foundation model dealing with any images under any circumstances. To this end, we scale up the dataset by designing a data engine to collect and automatically annotate large-scale unlabeled data (~62M), which significantly enlarges the data coverage and thus is able to reduce the generalization error. We investigate two simple yet effective strategies that make data scaling-up promising. First, a more challenging optimization target is created by leveraging data augmentation tools. It compels the model to actively seek extra visual knowledge and acquire robust representations. Second, an auxiliary supervision is developed to enforce the model to inherit rich semantic priors from pre-trained encoders. We evaluate its zero-shot capabilities extensively, including six public datasets and randomly captured photos. It demonstrates impressive generalization ability. Further, through fine-tuning it with metric depth information from NYUv2 and KITTI, new SOTAs are set. Our better depth model also results in a better depth-conditioned ControlNet. Our models are released at https://github.com/LiheYoung/Depth-Anything.
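The abstract does not state what form the auxiliary supervision takes. As an illustration only, one common way to make a model inherit priors from a frozen pre-trained encoder is to penalize the cosine distance between the two sets of per-pixel features. The sketch below assumes that form; the function name `feature_alignment_loss` and the `(N, C)` feature layout are hypothetical and are not taken from the released code.

```python
import numpy as np

def feature_alignment_loss(student_feats, frozen_feats, eps=1e-8):
    """Assumed form: 1 - mean cosine similarity between matching feature vectors.

    Both inputs have shape (N, C): N pixel/patch positions, C channels.
    `frozen_feats` stands in for features from a frozen pre-trained encoder.
    """
    s = np.asarray(student_feats, dtype=np.float64)
    t = np.asarray(frozen_feats, dtype=np.float64)
    dot = np.sum(s * t, axis=1)
    norms = np.linalg.norm(s, axis=1) * np.linalg.norm(t, axis=1)
    cos = dot / np.maximum(norms, eps)  # guard against zero-length vectors
    return float(1.0 - cos.mean())
```

Under this assumed definition the loss is 0 when the features point in the same direction and grows toward 2 as they point in opposite directions.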

Code Repositories

Repository	Framework	Notes
duan-song/SATNet	pytorch	Mentioned in GitHub
spacewalk01/depth-anything-tensorrt	pytorch	
MindCode-4/code-3/tree/main/depth_anything	mindspore	
fabio-sim/Depth-Anything-ONNX	pytorch	Mentioned in GitHub
JTRNEO/SynRS3D	pytorch	Mentioned in GitHub
greatenanoymous/monodpt_grasp	pytorch	Mentioned in GitHub
LiheYoung/Depth-Anything	pytorch	Official; Mentioned in GitHub

Benchmarks

Benchmark	Methodology	Metrics
monocular-depth-estimation-on-eth3d	Depth Anything	Delta < 1.25: 0.882; absolute relative error: 0.0127
monocular-depth-estimation-on-kitti-eigen	Depth Anything	Delta < 1.25: 0.982; Delta < 1.25^2: 0.998; Delta < 1.25^3: 1.000; RMSE: 1.896; RMSE log: 0.069; Sq Rel: 0.121; absolute relative error: 0.046
monocular-depth-estimation-on-nyu-depth-v2	Depth Anything	Delta < 1.25: 0.984; Delta < 1.25^2: 0.998; Delta < 1.25^3: 1.000; RMSE: 0.206; absolute relative error: 0.056; log 10: 0.024
semantic-segmentation-on-cityscapes	Depth Anything	Mean IoU (class): 84.8%
semantic-segmentation-on-cityscapes-val	Depth Anything	mIoU: 86.2
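For readers unfamiliar with the depth metrics in the table, the following is a minimal sketch of how they are conventionally computed from a predicted and a ground-truth depth map. It uses the standard definitions from the monocular depth literature, not the evaluation script of this paper; the function name `depth_metrics` and the rule of masking out non-positive depths are assumptions.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard depth-estimation metrics over pixels with positive depth."""
    pred = np.asarray(pred, dtype=np.float64).ravel()
    gt = np.asarray(gt, dtype=np.float64).ravel()
    valid = (gt > 0) & (pred > 0)  # assumed validity mask
    pred, gt = pred[valid], gt[valid]

    # Delta < t: fraction of pixels whose ratio max(pred/gt, gt/pred) is below t.
    ratio = np.maximum(pred / gt, gt / pred)
    diff = pred - gt
    return {
        "delta1": float(np.mean(ratio < 1.25)),
        "delta2": float(np.mean(ratio < 1.25 ** 2)),
        "delta3": float(np.mean(ratio < 1.25 ** 3)),
        "abs_rel": float(np.mean(np.abs(diff) / gt)),
        "sq_rel": float(np.mean(diff ** 2 / gt)),
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "rmse_log": float(np.sqrt(np.mean((np.log(pred) - np.log(gt)) ** 2))),
        "log10": float(np.mean(np.abs(np.log10(pred) - np.log10(gt)))),
    }
```

Higher is better for the Delta thresholds; lower is better for the error metrics.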
