Point Transformer V2: Grouped Vector Attention and Partition-based Pooling

Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao

Abstract

As a pioneering work exploring transformer architectures for 3D point cloud understanding, Point Transformer achieves impressive results on multiple highly competitive benchmarks. In this work, we analyze the limitations of Point Transformer and propose Point Transformer V2, a more powerful and efficient model whose new designs overcome those limitations. First, we propose grouped vector attention, which is more effective than the previous version of vector attention. Inheriting the advantages of both learnable weight encoding and multi-head attention, we present a highly effective implementation of grouped vector attention with a novel grouped weight encoding layer. We also strengthen the position information used by attention with an additional position encoding multiplier. Furthermore, we design novel, lightweight partition-based pooling methods that enable better spatial alignment and more efficient sampling. Extensive experiments show that our model outperforms its predecessor and achieves state-of-the-art results on several challenging 3D point cloud understanding benchmarks, including 3D point cloud segmentation on ScanNet v2 and S3DIS and 3D point cloud classification on ModelNet40. Our code will be available at https://github.com/Gofinge/PointTransformerV2.
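The two ideas named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names `grouped_vector_attention` and `grid_pool`, the single linear map `w_enc` standing in for the grouped weight encoding layer, the subtraction relation `q - k`, and the choice of max pooling for features with mean pooling for coordinates are all assumptions made for illustration. The position encoding multiplier is omitted.

```python
import numpy as np

def grouped_vector_attention(q, k, v, w_enc, groups):
    """Attend from one query point to M neighbours (illustrative only).

    q: (C,) query, k: (M, C) keys, v: (M, C) values,
    w_enc: (C, groups) hypothetical linear weight encoding.
    Each of the `groups` attention weights is shared by C // groups channels.
    """
    m, c = k.shape
    assert c % groups == 0, "channels must divide evenly into groups"
    relation = q[None, :] - k                      # (M, C), assumed relation
    logits = relation @ w_enc                      # (M, groups)
    logits = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=0, keepdims=True)  # softmax over neighbours
    v_grouped = v.reshape(m, groups, c // groups)  # split channels into groups
    out = (weights[:, :, None] * v_grouped).sum(axis=0)
    return out.reshape(c)

def grid_pool(points, feats, cell):
    """Pool points that fall into the same non-overlapping grid cell.

    points: (N, 3), feats: (N, C), cell: grid cell edge length.
    Returns pooled positions (cell means), pooled features (cell maxima),
    and the point-to-cell index, which could be reused for unpooling.
    """
    cells = np.floor(points / cell).astype(np.int64)
    _, inverse = np.unique(cells, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)                  # shape differs across NumPy versions
    n = int(inverse.max()) + 1
    counts = np.bincount(inverse, minlength=n)
    pooled_pos = np.zeros((n, points.shape[1]))
    np.add.at(pooled_pos, inverse, points)         # sum positions per cell
    pooled_pos /= counts[:, None]                  # mean position per cell
    pooled_feat = np.full((n, feats.shape[1]), -np.inf)
    np.maximum.at(pooled_feat, inverse, feats)     # max feature per cell
    return pooled_pos, pooled_feat, inverse
```

With `groups` equal to the channel count this reduces to a separate weight per channel, and with `groups = 1` to a single scalar weight per neighbour, which is one way to read the claim that the design sits between vector attention and multi-head attention. In `grid_pool`, every point belongs to exactly one cell, so no neighbour search or farthest-point sampling is needed in this sketch.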

Code Repositories

Pointcept/PointTransformerV2 (official, PyTorch, mentioned in GitHub)
Pointcept/Pointcept (official, PyTorch, mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
3d-point-cloud-classification-on-modelnet40 | PTv2 | Mean Accuracy: 91.6; Overall Accuracy: 94.2
3d-semantic-segmentation-on-nuscenes | PTv2 | mIoU: 82.6%
3d-semantic-segmentation-on-s3dis | PointTransformerV2 | mIoU (Area-5): 71.6
3d-semantic-segmentation-on-scannet-1 | PTv2 | Top-1 IoU: 0.427; Top-3 IoU: 0.665
3d-semantic-segmentation-on-semantickitti | PTv2 | test mIoU: 72.6%; val mIoU: 70.3%
lidar-semantic-segmentation-on-nuscenes | PTv2 | test mIoU: 0.826; val mIoU: 0.802
semantic-segmentation-on-s3dis-area5 | PTv2 | Number of params: N/A; mAcc: 78.0; mIoU: 72.6; oAcc: 91.6
semantic-segmentation-on-scannet | PTv2 | test mIoU: 75.2; val mIoU: 75.4
