5 months ago

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

Christopher Choy, JunYoung Gwak, Silvio Savarese

Abstract

In many robotics and VR/AR applications, 3D-videos are readily-available sources of input (a continuous sequence of depth images, or LIDAR scans). However, those 3D-videos are processed frame-by-frame either through 2D convnets or 3D perception algorithms. In this work, we propose 4-dimensional convolutional neural networks for spatio-temporal perception that can directly process such 3D-videos using high-dimensional convolutions. For this, we adopt sparse tensors and propose the generalized sparse convolution that encompasses all discrete convolutions. To implement the generalized sparse convolution, we create an open-source auto-differentiation library for sparse tensors that provides extensive functions for high-dimensional convolutional neural networks. We create 4D spatio-temporal convolutional neural networks using the library and validate them on various 3D semantic segmentation benchmarks and proposed 4D datasets for 3D-video perception. To overcome challenges in the 4D space, we propose the hybrid kernel, a special case of the generalized sparse convolution, and the trilateral-stationary conditional random field that enforces spatio-temporal consistency in the 7D space-time-chroma space. Experimentally, we show that convolutional neural networks with only generalized 3D sparse convolutions can outperform 2D or 2D-3D hybrid methods by a large margin. Also, we show that on 3D-videos, 4D spatio-temporal convolutional neural networks are robust to noise, outperform 3D convolutional neural networks and are faster than the 3D counterpart in some cases.
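The generalized sparse convolution named in the abstract can be illustrated with a small sketch. It is not the authors' library and does not use its API. It stores only the occupied sites of a 4D (x, y, z, t) grid in a dictionary and, for each output site u, sums W_i · x_{u+i} over the kernel offsets i whose neighbor u+i is occupied. The helper names (`hypercube_offsets`, `hybrid_offsets`, `sparse_conv`) are hypothetical, and the hybrid kernel shape used here (cubic in space, cross-shaped in time) is an assumption, since the abstract does not spell it out.

```python
from itertools import product


def hypercube_offsets(dims, radius=1):
    """All offsets in {-radius..radius}^dims (a conventional dense kernel shape)."""
    return list(product(range(-radius, radius + 1), repeat=dims))


def hybrid_offsets(radius=1):
    """Assumed hybrid shape for (x, y, z, t): cubic in space, cross-shaped in time."""
    spatial = [(*o, 0) for o in hypercube_offsets(3, radius)]
    temporal = [(0, 0, 0, dt) for dt in range(-radius, radius + 1) if dt != 0]
    return spatial + temporal


def sparse_conv(features, weights, out_coords=None):
    """Sparse convolution over a {coordinate: feature vector} dictionary.

    features:   {coord tuple: [c_in floats]}; only occupied sites are stored.
    weights:    {offset tuple: c_out x c_in matrix as nested lists}.
    out_coords: sites to evaluate; defaults to the input sites.
    """
    if out_coords is None:
        out_coords = list(features)
    c_out = len(next(iter(weights.values())))
    out = {}
    for u in out_coords:
        acc = [0.0] * c_out
        for offset, w in weights.items():
            neighbor = tuple(a + b for a, b in zip(u, offset))
            x = features.get(neighbor)
            if x is None:  # empty site: skipped, so it adds no work
                continue
            for r in range(c_out):
                acc[r] += sum(w_rc * x_c for w_rc, x_c in zip(w[r], x))
        out[u] = acc
    return out


# Three occupied sites in (x, y, z, t), one input channel each.
feats = {(0, 0, 0, 0): [1.0], (1, 0, 0, 0): [2.0], (0, 0, 0, 1): [4.0]}

# Illustrative weights: every offset maps 1 channel to 1 channel with weight 1.
w_hybrid = {o: [[1.0]] for o in hybrid_offsets()}
w_full = {o: [[1.0]] for o in hypercube_offsets(4)}

out_hybrid = sparse_conv(feats, w_hybrid)
out_full = sparse_conv(feats, w_full)
```

With a kernel radius of 1, the assumed hybrid shape has 29 offsets against 81 for the full 4D hypercube. In this toy input the site (1, 0, 0, 0) does not see (0, 0, 0, 1) under the hybrid kernel, because that neighbor differs in both space and time, whereas the full hypercube kernel connects all three sites.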

Code Repositories

NVIDIA/MinkowskiEngine (PyTorch, mentioned in GitHub)
shwoo93/minkowskiengine (PyTorch, mentioned in GitHub)
buildingnet/buildingnet_dataset (PyTorch, mentioned in GitHub)
ldkong1205/Robo3D (PyTorch, mentioned in GitHub)
dkoh0207/lartpc_minkowski (PyTorch, mentioned in GitHub)
StanfordVL/MinkowskiEngine (official, PyTorch, mentioned in GitHub)
mit-han-lab/spvnas (PyTorch, mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
3d-semantic-segmentation-on-scannet-1 | MinkowskiNet | Top-1 IoU: 0.292; Top-3 IoU: 0.531
3d-semantic-segmentation-on-scannet200 | MinkUNet | test mIoU: 25.3; val mIoU: 25.0
3d-semantic-segmentation-on-scribblekitti | MinkowskiNet | mIoU: 55.0
3d-semantic-segmentation-on-stpls3d | MinkowskiNet | mIoU: 51.3
robust-3d-semantic-segmentation-on | MinkUNet-18 | mean Corruption Error (mCE): 100.00%
robust-3d-semantic-segmentation-on | MinkUNet-34 | mean Corruption Error (mCE): 100.61%
robust-3d-semantic-segmentation-on-nuscenes-c | MinkUNet-34 | mean Corruption Error (mCE): 96.37%
robust-3d-semantic-segmentation-on-nuscenes-c | MinkUNet-18 | mean Corruption Error (mCE): 100.00%
robust-3d-semantic-segmentation-on-wod-c | MinkUNet-18 | mean Corruption Error (mCE): 100.00%
robust-3d-semantic-segmentation-on-wod-c | MinkUNet-34 | mean Corruption Error (mCE): 96.21%
semantic-segmentation-on-s3dis | MinkowskiNet | Mean IoU: 65.4; Number of params: 37.9M
semantic-segmentation-on-s3dis-area5 | MinkowskiNet | mAcc: 71.7; mIoU: 65.4; Number of params: 37.9M
semantic-segmentation-on-scannet | MinkowskiNet | test mIoU: 73.4; val mIoU: 72.2
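
Most of the benchmark entries above report mean intersection-over-union (mIoU). As a reading aid, here is a minimal sketch of how that metric is commonly computed from per-point labels: per-class IoU is TP / (TP + FP + FN), averaged over classes. The function name `mean_iou` and the toy labels are illustrative, and individual benchmarks may differ in details such as ignored labels or accumulating counts over a whole dataset. The sketch returns a fraction in [0, 1], while several entries above are on a percentage scale.

```python
def mean_iou(pred, target, num_classes):
    """Mean of per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither pred nor target are skipped
    (an assumption of this sketch, not a benchmark rule).
    """
    ious = []
    for c in range(num_classes):
        tp = sum(p == c and t == c for p, t in zip(pred, target))
        fp = sum(p == c and t != c for p, t in zip(pred, target))
        fn = sum(p != c and t == c for p, t in zip(pred, target))
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six points, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```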
