Single Image Depth Estimation Trained via Depth From Defocus Cues
Shir Gur · Lior Wolf

Abstract
Estimating depth from a single RGB image is a fundamental task in computer vision, which is most directly solved using supervised deep learning. In the field of unsupervised learning of depth from a single RGB image, depth is not given explicitly. Existing work in the field receives either a stereo pair, a monocular video, or multiple views, and trains a depth estimation network using losses that are based on structure-from-motion. In this work, we rely on depth-from-focus cues instead of different views. Learning is based on a novel Point Spread Function convolutional layer, which applies location-specific kernels that arise from the Circle of Confusion at each image location. We evaluate our method on data derived from five common datasets for depth estimation and light-field images, and present results that are on par with supervised methods on the KITTI and Make3D datasets and outperform unsupervised learning approaches. Since the phenomenon of depth from defocus is not dataset specific, we hypothesize that learning based on it would overfit less to the specific content of each dataset. Our experiments show that this is indeed the case, and an estimator learned on one dataset with our method provides better results on other datasets than directly supervised methods.
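To make the mechanism concrete, below is a minimal PyTorch sketch of such a defocus-rendering layer. It assumes a thin-lens Circle-of-Confusion model and approximates the per-location PSF with a small bank of Gaussian kernels blended per pixel; the function names (`coc_radius`, `render_defocus`), the camera constants, and the Gaussian/softmax approximation are all illustrative choices, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def coc_radius(depth, focal_len=0.05, aperture=0.01, focus_dist=2.0, px_per_m=1e4):
    # Thin-lens Circle-of-Confusion diameter on the sensor:
    #   c = A * |d - S| / d * f / (S - f)
    # with d the scene depth, S the in-focus distance, f the focal length,
    # and A the aperture diameter (all in meters); px_per_m is an
    # illustrative scale converting sensor meters to pixels.
    c = aperture * (depth - focus_dist).abs() / depth * focal_len / (focus_dist - focal_len)
    return c * px_per_m

def gaussian_kernel(sigma, ksize=9):
    # Isotropic 2D Gaussian PSF with the given standard deviation (pixels).
    ax = torch.arange(ksize, dtype=torch.float32) - (ksize - 1) / 2
    k = torch.exp(-(ax[None, :] ** 2 + ax[:, None] ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def render_defocus(img, depth, sigmas=(0.5, 1.0, 2.0, 4.0), ksize=9):
    # img: (B,3,H,W) all-in-focus image; depth: (B,1,H,W) depth in meters.
    coc = coc_radius(depth)                   # per-pixel blur scale, in pixels
    blurred = []
    for s in sigmas:                          # blur once per discrete PSF scale
        k = gaussian_kernel(s, ksize).to(img).expand(img.shape[1], 1, ksize, ksize)
        blurred.append(F.conv2d(img, k, padding=ksize // 2, groups=img.shape[1]))
    blurred = torch.stack(blurred)            # (S,B,3,H,W)
    # Soft per-pixel assignment of each CoC radius to the discrete PSF
    # scales, dominated by the nearest scale (0.25 is a temperature).
    levels = torch.tensor(sigmas).to(img).view(-1, 1, 1, 1, 1)
    w = torch.softmax(-(coc.unsqueeze(0) - levels).abs() / 0.25, dim=0)
    return (w * blurred).sum(dim=0)           # (B,3,H,W) refocused image

# Toy usage: during training, a depth network's prediction would drive this
# layer, and the rendered image would be compared to a real shallow
# depth-of-field photograph via a reconstruction loss.
img = torch.rand(1, 3, 64, 64)
depth = 1.0 + 9.0 * torch.rand(1, 1, 64, 64)  # depths in [1, 10] meters
refocused = render_defocus(img, depth)
```

Because the whole pipeline is differentiable, gradients of the reconstruction loss flow through the blur weights back into the predicted depth, which is what lets defocus alone supervise the depth network.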
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| monocular-depth-estimation-on-kitti-eigen | DeepLabV3+ (F10) | absolute relative error: 0.110 |
| monocular-depth-estimation-on-nyu-depth-v2 | DeepLabV3+ (F10) | RMSE: 0.575 |