HyperAIHyperAI

Command Palette

Search for a command to run...

3 months ago

VMLoc: Variational Fusion For Learning-Based Multimodal Camera Localization

Kaichen Zhou Changhao Chen Bing Wang Muhamad Risqi U. Saputra Niki Trigoni Andrew Markham

VMLoc: Variational Fusion For Learning-Based Multimodal Camera Localization

Abstract

Recent learning-based approaches have achieved impressive results in the field of single-shot camera localization. However, how best to fuse multiple modalities (e.g., image and depth) and to deal with degraded or missing input are less well studied. In particular, we note that previous approaches towards deep fusion do not perform significantly better than models employing a single modality. We conjecture that this is because of the naive approaches to feature space fusion through summation or concatenation which do not take into account the different strengths of each modality. To address this, we propose an end-to-end framework, termed VMLoc, to fuse different sensor inputs into a common latent space through a variational Product-of-Experts (PoE) followed by attention-based fusion. Unlike previous multimodal variational works directly adapting the objective function of vanilla variational auto-encoder, we show how camera localization can be accurately estimated through an unbiased objective function based on importance weighting. Our model is extensively evaluated on RGB-D datasets and the results prove the efficacy of our model. The source code is available at https://github.com/kaichen-z/VMLoc.

Code Repositories

kaichen-z/vmloc
Official
pytorch
Mentioned in GitHub

Benchmarks

BenchmarkMethodologyMetrics
visual-localization-on-oxford-radar-robotcarVMLoc
Mean Translation Error: 15.11

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp
VMLoc: Variational Fusion For Learning-Based Multimodal Camera Localization | Papers | HyperAI