5 months ago

SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition

Fan Zhaoxin ; Song Zhenbo ; Liu Hongyan ; Lu Zhiwu ; He Jun ; Du Xiaoyong


Abstract

Point cloud-based large scale place recognition is fundamental for many applications like Simultaneous Localization and Mapping (SLAM). Although many models have been proposed and have achieved good performance by learning short-range local features, long-range contextual properties have often been neglected. Moreover, the model size has also become a bottleneck for their wide applications. To overcome these challenges, we propose a super light-weight network model termed SVT-Net for large scale place recognition. Specifically, on top of the highly efficient 3D Sparse Convolution (SP-Conv), an Atom-based Sparse Voxel Transformer (ASVT) and a Cluster-based Sparse Voxel Transformer (CSVT) are proposed to learn both short-range local features and long-range contextual features in this model. Consisting of ASVT and CSVT, SVT-Net can achieve state-of-the-art on benchmark datasets in terms of both accuracy and speed with a super-light model size (0.9M). Meanwhile, two simplified versions of SVT-Net are introduced, which also achieve state-of-the-art and further reduce the model size to 0.8M and 0.4M respectively.

Benchmarks

Benchmark: 3d-place-recognition-on-oxford-robotcar
Methodology: SVT-Net
Metrics: AR@1: 93.7; AR@1%: 97.8
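
The page does not define the two metrics. Assuming AR@1 and AR@1% denote average recall when retrieving the top 1 candidate and the top 1% of database candidates, as is common in point cloud place recognition evaluation, the following is a minimal sketch of how such a recall could be computed from global descriptors. The function names, the Euclidean distance choice, and the input layout are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np


def recall_at_n(query_desc, db_desc, true_matches, n):
    """Fraction of queries with at least one true match among
    the n nearest database descriptors (hypothetical helper)."""
    # Pairwise Euclidean distances, shape (num_queries, num_db).
    diffs = query_desc[:, None, :] - db_desc[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    # Indices of the n closest database entries for each query.
    top_n = np.argsort(dists, axis=1)[:, :n]

    hits = 0
    evaluated = 0
    for q, matches in enumerate(true_matches):
        if not matches:
            continue  # queries without any ground-truth match are skipped
        evaluated += 1
        if set(top_n[q].tolist()) & set(matches):
            hits += 1
    return hits / evaluated if evaluated else 0.0


def recall_at_one_percent(query_desc, db_desc, true_matches):
    """Recall when retrieving the top 1% of the database (at least 1 item)."""
    n = max(1, int(round(len(db_desc) / 100.0)))
    return recall_at_n(query_desc, db_desc, true_matches, n)


# Toy data (made up for illustration): 4 database places, 2 queries.
db = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
queries = np.array([[0.1, 0.0], [9.0, 9.0]])
ground_truth = [[0], [2]]  # database indices that count as the same place

r1 = recall_at_n(queries, db, ground_truth, n=1)
r2 = recall_at_n(queries, db, ground_truth, n=2)
r1pct = recall_at_one_percent(queries, db, ground_truth)
```

In this toy setup the first query retrieves its true match at rank 1, while the second only finds it at rank 2, so recall improves as n grows. An average recall would then be the mean of such values over all evaluated query and database pairings, under the same assumption about the protocol.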
