Self-positioning Point-based Transformer for Point Cloud Understanding
Jinyoung Park; Sanghyeok Lee; Sihyeon Kim; Yunyang Xiong; Hyunwoo J. Kim

Abstract
Transformers have shown superior performance on various computer vision tasks with their capabilities to capture long-range dependencies. Despite the success, it is challenging to directly apply Transformers to point clouds due to their quadratic cost in the number of points. In this paper, we present a Self-Positioning point-based Transformer (SPoTr), which is designed to capture both local and global shape contexts with reduced complexity. Specifically, this architecture consists of local self-attention and self-positioning point-based global cross-attention. The self-positioning points, adaptively located based on the input shape, consider both spatial and semantic information with disentangled attention to improve expressive power. With the self-positioning points, we propose a novel global cross-attention mechanism for point clouds, which improves the scalability of global self-attention by allowing the attention module to compute attention weights with only a small set of self-positioning points. Experiments show the effectiveness of SPoTr on three point cloud tasks: shape classification, part segmentation, and scene segmentation. In particular, our proposed model achieves an accuracy gain of 2.6% over the previous best models on shape classification with ScanObjectNN. We also provide qualitative analyses to demonstrate the interpretability of self-positioning points. The code of SPoTr is available at https://github.com/mlvlab/SPoTr.
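The scalability claim above hinges on replacing the N x N attention matrix of full self-attention with an N x M matrix, where M is a small set of self-positioning points. The following is a minimal sketch of that idea only, not the paper's exact formulation (which additionally uses disentangled spatial/semantic attention and adaptively locates the self-positioning points from the input shape); all names and shapes here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_cross_attention(x, sp):
    """Cross-attention between N input point features x (N, d) and
    M self-positioning point features sp (M, d), with M << N.
    The attention matrix is (N, M), so the cost is O(N*M) per layer
    rather than the O(N^2) of full self-attention."""
    d = x.shape[-1]
    attn = softmax(x @ sp.T / np.sqrt(d), axis=-1)  # (N, M) weights
    return attn @ sp                                # (N, d) output

# toy example: 1024 points attend to only 16 self-positioning points
rng = np.random.default_rng(0)
N, M, d = 1024, 16, 32
x = rng.standard_normal((N, d))
sp = rng.standard_normal((M, d))
out = global_cross_attention(x, sp)
print(out.shape)  # (1024, 32)
```

With M fixed (e.g. 16) the memory and compute of this module grow linearly in the number of input points, which is what lets a global attention pathway scale to dense point clouds.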
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-part-segmentation-on-shapenet-part | SPoTr | Class Average IoU: 85.4, Instance Average IoU: 87.2 |
| 3d-point-cloud-classification-on-scanobjectnn | SPoTr | Mean Accuracy: 86.8, Overall Accuracy: 88.6 |
| semantic-segmentation-on-s3dis-area5 | SPoTr | Number of params: N/A, mAcc: 76.4, mIoU: 70.8, oAcc: 90.7 |
| supervised-only-3d-point-cloud-classification | SPoTr | GFLOPs: 10.8, Number of params (M): 1.7, Overall Accuracy (PB_T50_RS): 88.6 |