MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition

Jacek Komorowski Monika Wysoczańska Tomasz Trzcinski

Abstract

We introduce a discriminative multimodal descriptor based on a pair of sensor readings: a point cloud from a LiDAR and an image from an RGB camera. Our descriptor, named MinkLoc++, can be used for place recognition, re-localization, and loop closure in robotics or autonomous vehicle applications. We use a late fusion approach, where each modality is processed separately and the results are fused in the final part of the processing pipeline. The proposed method achieves state-of-the-art performance on standard place recognition benchmarks. We also identify a dominating modality problem that arises when training a multimodal descriptor. The problem manifests itself when the network focuses on the modality that overfits the training data more strongly. This drives the loss down during training but leads to suboptimal performance on the evaluation set. In this work we describe how to detect and mitigate this risk when using a deep metric learning approach to train a multimodal neural network. Our code is publicly available on the project website: https://github.com/jac99/MinkLocMultimodal.
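
To make the late fusion idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that each modality has already been reduced to a fixed-length descriptor and that fusion is done by concatenating the two L2-normalized descriptors; the abstract does not state the fusion operator, so concatenation is an assumption. The function names `l2_normalize`, `late_fusion`, and `nearest_place`, and the toy descriptor values, are hypothetical and stand in for the outputs of the point cloud and image branches.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def late_fusion(cloud_desc, image_desc):
    """Fuse two per-modality descriptors into one.

    Assumption: fusion is a concatenation of the normalized descriptors.
    Each modality is processed on its own before this step.
    """
    return l2_normalize(cloud_desc) + l2_normalize(image_desc)


def nearest_place(query, database):
    """Return the index of the database descriptor closest to the query."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return min(range(len(database)), key=lambda i: dist(query, database[i]))


# Toy descriptors standing in for network outputs (hypothetical values).
database = [
    late_fusion([1.0, 0.0], [0.0, 1.0]),  # place 0
    late_fusion([0.0, 1.0], [1.0, 0.0]),  # place 1
]
query = late_fusion([0.9, 0.1], [0.1, 0.9])
best = nearest_place(query, database)
```

Here `best` is the index of the retrieved place. Because each branch contributes its own slice of the fused vector, the same retrieval can also be run on a single slice, which is one simple way to compare how much each modality contributes on training versus evaluation data.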