5 months ago

Simple-BEV: What Really Matters for Multi-Sensor BEV Perception?

Adam W. Harley, Zhaoyuan Fang, Jie Li, Rares Ambrus, Katerina Fragkiadaki

Abstract

Building 3D perception systems for autonomous vehicles that do not rely on high-density LiDAR is a critical research problem because of the expense of LiDAR systems compared to cameras and other sensors. Recent research has developed a variety of camera-only methods, where features are differentiably "lifted" from the multi-camera images onto the 2D ground plane, yielding a "bird's eye view" (BEV) feature representation of the 3D space around the vehicle. This line of work has produced a variety of novel "lifting" methods, but we observe that other details in the training setups have shifted at the same time, making it unclear what really matters in top-performing methods. We also observe that using cameras alone is not a real-world constraint, considering that additional sensors like radar have been integrated into real vehicles for years already. In this paper, we first of all attempt to elucidate the high-impact factors in the design and training protocol of BEV perception models. We find that batch size and input resolution greatly affect performance, while lifting strategies have a more modest effect -- even a simple parameter-free lifter works well. Second, we demonstrate that radar data can provide a substantial boost to performance, helping to close the gap between camera-only and LiDAR-enabled systems. We analyze the radar usage details that lead to good performance, and invite the community to re-consider this commonly-neglected part of the sensor platform.
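
The abstract does not spell out how the "simple parameter-free lifter" works, so the following is only a minimal sketch of one plausible reading: project a 3D location into each camera with a pinhole model, bilinearly sample the image feature map at the projected pixel, and average over the cameras that actually see the point. Everything here is an assumption for illustration, including the function names (`project_point`, `bilinear_sample`, `lift_point`), the camera convention (x right, y down, z forward), and the use of scalar features instead of multi-channel tensors.

```python
import math

def project_point(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a 3D point given in camera coordinates.

    Assumed convention: x right, y down, z forward. Returns (u, v) in
    pixels, or None if the point lies behind the camera.
    """
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)

def bilinear_sample(feat, u, v):
    """Bilinearly sample a 2D feature map (list of rows) at pixel (u, v).

    Returns None when (u, v) falls outside the map.
    """
    h, w = len(feat), len(feat[0])
    if not (0 <= u <= w - 1 and 0 <= v <= h - 1):
        return None
    u0, v0 = int(math.floor(u)), int(math.floor(v))
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    top = feat[v0][u0] * (1 - du) + feat[v0][u1] * du
    bottom = feat[v1][u0] * (1 - du) + feat[v1][u1] * du
    return top * (1 - dv) + bottom * dv

def lift_point(points_per_cam, feats, intrinsics):
    """Average the sampled features over all cameras that see the point.

    points_per_cam: the same 3D location expressed in each camera's frame.
    feats: one 2D feature map per camera.
    intrinsics: one (fx, fy, cx, cy) tuple per camera.
    No learned parameters are involved. Returns 0.0 if no camera sees it.
    """
    samples = []
    for point, feat, (fx, fy, cx, cy) in zip(points_per_cam, feats, intrinsics):
        uv = project_point(point, fx, fy, cx, cy)
        if uv is None:
            continue
        value = bilinear_sample(feat, uv[0], uv[1])
        if value is not None:
            samples.append(value)
    return sum(samples) / len(samples) if samples else 0.0
```

In a full model this sampling would be repeated for every cell of a 3D grid around the vehicle and then reduced to the ground plane; the sketch shows only the per-point step.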

Code Repositories

valeoai/pointbev
pytorch
Mentioned in GitHub

Benchmarks

Benchmark: bird-s-eye-view-semantic-segmentation-on
Methodology: Simple-BEV
Metrics:
IoU veh - 224x480 - No vis filter - 100x100 at 0.5: 36.9
IoU veh - 224x480 - Vis filter. - 100x100 at 0.5: 43.0
IoU veh - 448x800 - No vis filter - 100x100 at 0.5: 40.9
IoU veh - 448x800 - Vis filter. - 100x100 at 0.5: 46.6

Benchmark: bird-s-eye-view-semantic-segmentation-on-lyft
Methodology: Simple-BEV (EfficientNet-b4)
Metrics:
IoU vehicle - 224x480 - Long: 44.5
IoU vehicle - 224x480 - Short: 70.4

Benchmark: bird-s-eye-view-semantic-segmentation-on-lyft
Methodology: Simple-BEV (ResNet-50)
Metrics:
IoU vehicle - 224x480 - Long: 43.6
IoU vehicle - 224x480 - Short: 70.7
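
For readers unfamiliar with the metric, the IoU figures listed above are intersection-over-union scores between a predicted and a ground-truth vehicle mask in the BEV grid, presumably reported as percentages. The snippet below is a minimal sketch of that computation on small binary masks; the function name `binary_iou` and the nested-list mask format are assumptions for illustration, not the benchmark's evaluation code.

```python
def binary_iou(pred, target):
    """Intersection-over-union between two binary masks.

    Both masks are nested lists of 0/1 values with the same shape.
    Returns NaN when both masks are empty, since the union is zero.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else float("nan")


# Toy 2x2 example: one overlapping cell out of three occupied cells.
pred = [[1, 1],
        [0, 0]]
target = [[1, 0],
          [1, 0]]
iou_percent = 100.0 * binary_iou(pred, target)
```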
