Points to Patches: Enabling the Use of Self-Attention for 3D Shape Recognition
Axel Berg, Magnus Oskarsson, Mark O'Connor

Abstract
While the Transformer architecture has become ubiquitous in the machine learning field, its adaptation to 3D shape recognition is non-trivial. Due to its quadratic computational complexity, the self-attention operator quickly becomes inefficient as the set of input points grows larger. Furthermore, we find that the attention mechanism struggles to find useful connections between individual points on a global scale. In order to alleviate these problems, we propose a two-stage Point Transformer-in-Transformer (Point-TnT) approach which combines local and global attention mechanisms, enabling both individual points and patches of points to attend to each other effectively. Experiments on shape classification show that such an approach provides more useful features for downstream tasks than the baseline Transformer, while also being more computationally efficient. In addition, we extend our method to feature matching for scene reconstruction, showing that it can be used in conjunction with existing scene reconstruction pipelines.
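The two-stage idea in the abstract can be illustrated with a minimal NumPy sketch: attend within small local patches, summarize each patch into one token, and attend globally across those tokens. This is not the actual Point-TnT architecture (which uses learned projections, farthest point sampling, and neighborhood grouping); the patch splitting, mean pooling, and identity projections below are simplifying assumptions for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # x: (n, d) -> (n, d); single head, identity Q/K/V projections for brevity
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores, axis=-1) @ x

def two_stage_attention(points, num_patches):
    # points: (N, d). Stage 1: attention within each local patch.
    # (Hypothetical patching by array_split; Point-TnT groups by spatial proximity.)
    patches = np.array_split(points, num_patches)
    local = [self_attention(p) for p in patches]
    # Stage 2: pool each patch to one anchor token, then attend across patches.
    anchors = np.stack([p.mean(axis=0) for p in local])
    return self_attention(anchors)  # (num_patches, d) patch-level features

rng = np.random.default_rng(0)
pts = rng.standard_normal((64, 3))
feats = two_stage_attention(pts, num_patches=8)
print(feats.shape)  # (8, 3)
```

The efficiency gain comes from the attention cost: full attention over N points is O(N^2), whereas the two stages cost roughly k * (N/k)^2 locally plus k^2 globally for k patches, which is much smaller when k << N.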
Code Repositories
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-point-cloud-classification-on-modelnet40 | Point-TnT | Params: 3.9M; Overall Accuracy: 92.6% |
| 3d-point-cloud-classification-on-scanobjectnn | Point-TnT | FLOPs: 1.19G; Params: 3.9M; Mean Accuracy: 81.0%; Overall Accuracy: 83.5% |
| point-cloud-registration-on-3dmatch-benchmark | DIP + Point-TnT | Feature Matching Recall: 96.8% |