5 months ago

Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D

Jonah Philion, Sanja Fidler

Abstract

The goal of perception for autonomous vehicles is to extract semantic representations from multiple sensors and fuse these representations into a single "bird's-eye-view" coordinate frame for consumption by motion planning. We propose a new end-to-end architecture that directly extracts a bird's-eye-view representation of a scene given image data from an arbitrary number of cameras. The core idea behind our approach is to "lift" each image individually into a frustum of features for each camera, then "splat" all frustums into a rasterized bird's-eye-view grid. By training on the entire camera rig, we provide evidence that our model is able to learn not only how to represent images but how to fuse predictions from all cameras into a single cohesive representation of the scene while being robust to calibration error. On standard bird's-eye-view tasks such as object segmentation and map segmentation, our model outperforms all baselines and prior work. In pursuit of the goal of learning dense representations for motion planning, we show that the representations inferred by our model enable interpretable end-to-end motion planning by "shooting" template trajectories into a bird's-eye-view cost map output by our network. We benchmark our approach against models that use oracle depth from lidar. Project page with code: https://nv-tlabs.github.io/lift-splat-shoot
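The three steps named in the abstract can be illustrated with a small, dependency-free sketch. It is not the official implementation (see the repository listed below for that). It assumes that "lift" weights a per-pixel context vector by a softmax depth distribution, that "splat" sum-pools the resulting frustum features into grid cells, and that "shoot" picks the template trajectory with the lowest summed cost. The abstract does not spell out these details. The function names, the grid layout, and the toy inputs are all hypothetical, and the frustum points are assumed to be already expressed in the ego frame (camera intrinsics and extrinsics applied beforehand).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def lift(depth_logits, context):
    """Lift one pixel into D frustum features (assumed formulation).

    Each depth bin receives the context vector scaled by that bin's
    probability, i.e. the outer product of the two.
    """
    alpha = softmax(depth_logits)
    return [[a * c for c in context] for a in alpha]


def splat(points, features, grid_size, cell, channels):
    """Sum-pool frustum features into a square BEV grid centred on the ego.

    points   : list of (x, y) ego-frame positions in metres, one per feature
    features : list of feature vectors of length `channels`
    """
    half = grid_size * cell / 2.0
    bev = [[[0.0] * channels for _ in range(grid_size)]
           for _ in range(grid_size)]
    for (x, y), feat in zip(points, features):
        ix = math.floor((x + half) / cell)
        iy = math.floor((y + half) / cell)
        if 0 <= ix < grid_size and 0 <= iy < grid_size:  # drop out-of-grid points
            for k in range(channels):
                bev[iy][ix][k] += feat[k]
    return bev


def shoot(cost_map, templates):
    """Return the index of the template trajectory with the lowest summed cost.

    Each template is a list of (ix, iy) grid cells.
    """
    costs = [sum(cost_map[iy][ix] for ix, iy in traj) for traj in templates]
    return min(range(len(templates)), key=costs.__getitem__)


# Toy example: one pixel, two depth bins, two feature channels.
frustum_feats = lift([0.0, 0.0], [2.0, 4.0])
frustum_points = [(0.25, 0.25), (0.3, 0.3)]  # both land in the same cell
bev = splat(frustum_points, frustum_feats, grid_size=2, cell=1.0, channels=2)

# Toy cost map and two candidate trajectories given as (ix, iy) cells.
cost_map = [[0.0, 5.0],
            [0.0, 1.0]]
templates = [[(1, 0), (1, 1)], [(0, 0), (0, 1)]]
best = shoot(cost_map, templates)
```

In the toy example both frustum points fall into the same grid cell, so their features are summed there. Sum pooling is what lets features from any number of cameras accumulate into one shared grid in this sketch.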

Code Repositories

nv-tlabs/lift-splat-shoot
Official
pytorch
Mentioned in GitHub

Benchmarks

Benchmark: bird-s-eye-view-semantic-segmentation-on
Methodology: Lift-Splat-Shoot
Metrics: IoU ped - 224x480 - Vis filter. - 100x100 at 0.5: 15.0
