Ziyang Song, Zerong Wang, Bo Li, Hao Zhang, Ruijie Zhu, Li Liu, Peng-Tao Jiang, Tianzhu Zhang

Abstract
Monocular depth estimation within the diffusion-denoising paradigm demonstrates impressive generalization ability but suffers from low inference speed. Recent methods adopt a single-step deterministic paradigm to improve inference efficiency while maintaining comparable performance. However, they overlook the gap between generative and discriminative features, leading to suboptimal results. In this work, we propose DepthMaster, a single-step diffusion model designed to adapt generative features for the discriminative depth estimation task. First, to mitigate overfitting to texture details introduced by generative features, we propose a Feature Alignment module, which incorporates high-quality semantic features to enhance the denoising network's representation capability. Second, to address the lack of fine-grained details in the single-step deterministic framework, we propose a Fourier Enhancement module to adaptively balance low-frequency structure and high-frequency details. We adopt a two-stage training strategy to fully leverage the potential of the two modules. In the first stage, we focus on learning the global scene structure with the Feature Alignment module, while in the second stage, we exploit the Fourier Enhancement module to improve the visual quality. Through these efforts, our model achieves state-of-the-art performance in terms of generalization and detail preservation, outperforming other diffusion-based methods across various datasets. Our project page can be found at https://indu1ge.github.io/DepthMaster_page.
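To make the idea of balancing low-frequency structure against high-frequency detail concrete, here is a minimal one-dimensional sketch. It is not the Fourier Enhancement module from the paper, whose design is not described in this abstract. The function names (`split_frequencies`, `blend`), the hard frequency cutoff, and the fixed `high_weight` are all assumptions made for illustration; the abstract says the balance is adaptive, which a fixed weight does not capture.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_frequencies(signal, cutoff):
    """Split a real signal into low- and high-frequency parts.

    Hypothetical helper: bins whose frequency index (counted symmetrically
    from both ends of the spectrum) is <= cutoff count as "low".
    """
    n = len(signal)
    spectrum = dft(signal)
    low_spec = [c if min(k, n - k) <= cutoff else 0 for k, c in enumerate(spectrum)]
    high_spec = [c - l for c, l in zip(spectrum, low_spec)]
    low = [v.real for v in idft(low_spec)]
    high = [v.real for v in idft(high_spec)]
    return low, high


def blend(signal, cutoff, high_weight):
    """Recombine the two bands with a fixed (assumed) weight on the detail band."""
    low, high = split_frequencies(signal, cutoff)
    return [l + high_weight * h for l, h in zip(low, high)]
```

With `high_weight=1.0` the input is recovered up to floating-point error, and with `high_weight=0.0` only the smooth structure remains. A learned module would presumably predict the weighting from the features rather than fix it by hand.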
Benchmarks
| Benchmark | Methodology | Delta < 1.25 | Absolute relative error |
|---|---|---|---|
| monocular-depth-estimation-on-eth3d | DepthMaster | 0.974 | 0.053 |
| monocular-depth-estimation-on-kitti-eigen | DepthMaster | 0.937 | 0.082 |
| monocular-depth-estimation-on-nyu-depth-v2 | DepthMaster | 0.972 | 0.050 |
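For readers unfamiliar with the two metrics in the table, the sketch below computes them using their standard definitions in the depth-estimation literature: absolute relative error is the mean of |pred − gt| / gt, and the Delta < 1.25 accuracy is the fraction of pixels where max(pred / gt, gt / pred) is below 1.25. The function names and the handling of invalid pixels (non-positive ground truth is skipped) are assumptions. The page does not describe the evaluation protocol, and affine-invariant methods typically align predictions to ground truth in scale and shift before scoring, which this sketch omits.

```python
def abs_rel(pred, gt):
    """Mean absolute relative error over pixels with valid (positive) ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def delta_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels whose prediction is within a ratio `threshold` of ground truth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    # A non-positive prediction cannot satisfy the ratio test, so it counts as a miss.
    hits = sum(1 for p, g in pairs if p > 0 and max(p / g, g / p) < threshold)
    return hits / len(pairs)


# Toy depth values (flattened), purely illustrative.
pred = [1.0, 2.0, 4.0, 3.0]
gt = [1.0, 2.0, 2.0, 4.0]
print(abs_rel(pred, gt))         # 0.3125
print(delta_accuracy(pred, gt))  # 0.5
```

Lower is better for absolute relative error; higher is better for the Delta < 1.25 accuracy.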