MVSNet: Depth Inference for Unstructured Multi-view Stereo

Yao Yao; Zixin Luo; Shiwei Li; Tian Fang; Long Quan

Abstract

We present an end-to-end deep learning architecture for depth map inference from multi-view images. In the network, we first extract deep visual image features and then build a 3D cost volume on the reference camera frustum via differentiable homography warping. Next, we apply 3D convolutions to regularize the cost volume and regress an initial depth map, which is then refined with the reference image to generate the final output. Our framework flexibly adapts to arbitrary N-view inputs using a variance-based cost metric that maps multiple features into one cost feature. The proposed MVSNet is demonstrated on the large-scale indoor DTU dataset. With simple post-processing, our method not only significantly outperforms previous state-of-the-art methods but is also several times faster in runtime. We also evaluate MVSNet on the complex outdoor Tanks and Temples dataset, where our method ranked first before April 18, 2018 without any fine-tuning, showing the strong generalization ability of MVSNet.
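The variance-based cost metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes each view's warped feature volume has been flattened to a list of floats of equal length, and that the metric is the element-wise population variance across views (divide by N). The function name `variance_cost` and the flat-list layout are illustrative choices, not taken from the authors' code.

```python
def variance_cost(feature_volumes):
    """Collapse N per-view feature volumes into one cost volume.

    feature_volumes: list of N flat lists of floats, all the same length
    (a hypothetical layout chosen for this sketch).
    Returns the element-wise population variance across the N views.
    """
    n = len(feature_volumes)
    if n < 2:
        raise ValueError("need at least two views to measure feature variance")
    size = len(feature_volumes[0])
    if any(len(v) != size for v in feature_volumes):
        raise ValueError("all feature volumes must have the same length")

    # Mean feature value per element, taken across views.
    mean = [sum(v[i] for v in feature_volumes) / n for i in range(size)]
    # Mean squared deviation from that mean, again across views.
    return [
        sum((v[i] - mean[i]) ** 2 for v in feature_volumes) / n
        for i in range(size)
    ]


# The same function handles two views or three views without modification.
two_views = variance_cost([[1.0, 2.0], [3.0, 2.0]])
three_views = variance_cost([[0.0], [3.0], [6.0]])
```

Because the variance is computed across the view axis, the output has the same size regardless of N, which is what lets a single network accept an arbitrary number of input views. Elements where all views agree get zero cost.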

Code Repositories

YoYo000/MVSNet (tf)
xy-guo/MVSNet_pytorch (pytorch)
kwea123/MVSNet_pl (pytorch)
Skoltech-3D/sk3d_data

Benchmarks

Benchmark | Methodology | Metrics
3d-reconstruction-on-dtu | MVSNet | Acc: 0.396; Comp: 0.527; Overall: 0.462
point-clouds-on-tanks-and-temples | MVSNet | Mean F1 (Intermediate): 43.48
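
As a quick consistency check on the DTU figures above, the sketch below assumes the Overall score is the arithmetic mean of the accuracy (Acc) and completeness (Comp) values. The helper name `overall_score` is hypothetical.

```python
def overall_score(acc, comp):
    """Average of accuracy and completeness (assumed definition of Overall)."""
    return (acc + comp) / 2


# DTU values as listed for MVSNet.
dtu_overall = overall_score(0.396, 0.527)
# The result agrees with the listed 0.462 to within rounding of the inputs.
```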
