3 months ago

Point Transformer V3: Simpler, Faster, Stronger

Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao

Abstract

This paper is not motivated to seek innovation within the attention mechanism. Instead, it focuses on overcoming the existing trade-offs between accuracy and efficiency within the context of point cloud processing, leveraging the power of scale. Drawing inspiration from recent advances in 3D large-scale representation learning, we recognize that model performance is more influenced by scale than by intricate design. Therefore, we present Point Transformer V3 (PTv3), which prioritizes simplicity and efficiency over the accuracy of certain mechanisms that are minor to the overall performance after scaling, such as replacing the precise neighbor search by KNN with an efficient serialized neighbor mapping of point clouds organized with specific patterns. This principle enables significant scaling, expanding the receptive field from 16 to 1024 points while remaining efficient (a 3x increase in processing speed and a 10x improvement in memory efficiency compared with its predecessor, PTv2). PTv3 attains state-of-the-art results on over 20 downstream tasks that span both indoor and outdoor scenarios. Further enhanced with multi-dataset joint training, PTv3 pushes these results to a higher level.
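The idea of swapping a KNN search for a serialized neighbor mapping can be illustrated with a minimal sketch. The abstract does not say which serialization pattern is used, so the code below assumes a z-order (Morton) curve as one plausible choice. The names `morton_code` and `serialize_into_patches`, and the parameters `voxel_size`, `patch_size`, and `bits`, are hypothetical and are not taken from the official code. Points are quantized to a voxel grid, sorted by their curve key, and cut into fixed-size patches that stand in for neighborhoods.

```python
import numpy as np


def morton_code(grid: np.ndarray, bits: int = 10) -> np.ndarray:
    """Interleave the bits of integer (x, y, z) grid coordinates into a z-order key."""
    grid = grid.astype(np.uint64)
    code = np.zeros(len(grid), dtype=np.uint64)
    for b in range(bits):
        for axis in range(3):
            # Take bit b of this axis and place it at position 3*b + axis.
            bit = (grid[:, axis] >> np.uint64(b)) & np.uint64(1)
            code |= bit << np.uint64(3 * b + axis)
    return code


def serialize_into_patches(points, voxel_size=0.05, patch_size=1024, bits=10):
    """Sort points along a z-order curve and group them into fixed-size patches.

    Assumption: this stands in for the "serialized neighbor mapping" described
    in the abstract; it is an illustration, not the authors' implementation.
    """
    points = np.asarray(points, dtype=np.float64)
    # Quantize to non-negative integer voxel coordinates.
    grid = np.floor((points - points.min(axis=0)) / voxel_size).astype(np.int64)
    if grid.max() >= (1 << bits):
        raise ValueError("grid exceeds the range covered by `bits`; raise bits or voxel_size")
    # A single sort replaces the per-point nearest-neighbor search.
    order = np.argsort(morton_code(grid, bits), kind="stable")
    # Pad with the last index so every patch has exactly patch_size entries.
    pad = (-len(order)) % patch_size
    if pad:
        order = np.concatenate([order, np.full(pad, order[-1])])
    return order.reshape(-1, patch_size)
```

Each row of the returned array holds the indices of points that are adjacent along the curve, so attention can be computed within a row without any explicit neighbor query. Points that are close on the curve are usually, but not always, close in space, which is the loss of precision the abstract accepts in exchange for speed.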

Code Repositories

pointcept/pointtransformerv3 (official, PyTorch, mentioned in GitHub)
Pointcept/Pointcept (official, PyTorch, mentioned in GitHub)
facebookresearch/SparseConvNet (PyTorch, mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
3d-semantic-segmentation-on-scannet-1 | PTv3 | Top-1 IoU: 0.458; Top-3 IoU: 0.697
3d-semantic-segmentation-on-scannet-1 | PTv3 + PPT | Top-1 IoU: 0.464; Top-3 IoU: 0.710
3d-semantic-segmentation-on-scannet200 | PTv3 + PPT | test mIoU: 39.3; val mIoU: 36.0
3d-semantic-segmentation-on-semantickitti | PPT+PTv3 | test mIoU: 75.5%; val mIoU: 72.3%
lidar-semantic-segmentation-on-nuscenes | PTv3 + PPT | test mIoU: 0.830; val mIoU: 0.812
semantic-segmentation-on-s3dis | PTv3 + PPT | Mean IoU: 80.8; Number of params: 24.1M; mAcc: 87.7; oAcc: 92.6
semantic-segmentation-on-s3dis-area5 | PTv3 + PPT | mAcc: 80.1; mIoU: 74.7; oAcc: 92.0
semantic-segmentation-on-scannet | PTv3 + PPT | test mIoU: 79.4; val mIoU: 78.6
