Points to Patches: Enabling the Use of Self-Attention for 3D Shape Recognition
Axel Berg, Magnus Oskarsson, Mark O'Connor

Abstract
While the Transformer architecture has become ubiquitous in the machine learning field, its adaptation to 3D shape recognition is non-trivial. Due to its quadratic computational complexity, the self-attention operator quickly becomes inefficient as the set of input points grows larger. Furthermore, we find that the attention mechanism struggles to find useful connections between individual points on a global scale. In order to alleviate these problems, we propose a two-stage Point Transformer-in-Transformer (Point-TnT) approach which combines local and global attention mechanisms, enabling both individual points and patches of points to attend to each other effectively. Experiments on shape classification show that such an approach provides more useful features for downstream tasks than the baseline Transformer, while also being more computationally efficient. In addition, we extend our method to feature matching for scene reconstruction, showing that it can be used in conjunction with existing scene reconstruction pipelines.
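The two-stage idea in the abstract can be illustrated with a minimal NumPy sketch: attend within small local patches, summarize each patch into one token, and attend globally across those tokens. This is not the actual Point-TnT architecture (which uses learned projections, farthest point sampling, and neighborhood grouping); the patch splitting, mean pooling, and identity projections below are simplifying assumptions for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # x: (n, d) -> (n, d); single head, identity Q/K/V projections for brevity
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores, axis=-1) @ x

def two_stage_attention(points, num_patches):
    # points: (N, d). Stage 1: attention within each local patch.
    # (Hypothetical patching by array_split; Point-TnT groups by spatial proximity.)
    patches = np.array_split(points, num_patches)
    local = [self_attention(p) for p in patches]
    # Stage 2: pool each patch to one anchor token, then attend across patches.
    anchors = np.stack([p.mean(axis=0) for p in local])
    return self_attention(anchors)  # (num_patches, d) patch-level features

rng = np.random.default_rng(0)
pts = rng.standard_normal((64, 3))
feats = two_stage_attention(pts, num_patches=8)
print(feats.shape)  # (8, 3)
```

The efficiency gain comes from the attention cost: full attention over N points is O(N^2), whereas the two stages cost roughly k * (N/k)^2 locally plus k^2 globally for k patches, which is much smaller when k << N.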
Code Repositories
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-point-cloud-classification-on-modelnet40 | Point-TnT | Params: 3.9M; Overall Accuracy: 92.6% |
| 3d-point-cloud-classification-on-scanobjectnn | Point-TnT | FLOPs: 1.19G; Params: 3.9M; Mean Accuracy: 81.0%; Overall Accuracy: 83.5% |
| point-cloud-registration-on-3dmatch-benchmark | DIP + Point-TnT | Feature Matching Recall: 96.8% |