Point-LGMask: Local and Global Contexts Embedding for Point Cloud Pre-training with Multi-Ratio Masking

Min Chen, Yixue Hao, Long Hu, Qiao Yu, Jinfeng Xu, Xianzhi Li, Yuan Tang

Abstract

Self-supervised learning has achieved great success in both natural language processing and 2D vision, where masked modeling is a popular pre-training scheme. However, extending masking to 3D point cloud understanding, which must combine local and global features, poses a new challenge. In our work, we present Point-LGMask, a novel method that embeds both local and global contexts with multi-ratio masking, an approach that is quite effective for self-supervised feature learning of point clouds but has been overlooked by existing pre-training works. Specifically, to avoid fitting to a fixed masking ratio, we first propose multi-ratio masking, which prompts the encoder to fully explore representative features through tasks of different difficulties. Next, to encourage the embedding of both local and global features, we formulate a compound loss, which consists of (i) a global representation contrastive loss that encourages the cluster assignments of the masked point clouds to be consistent with those of the complete input, and (ii) a local point cloud prediction loss that encourages accurate prediction of masked points. Equipped with Point-LGMask, our learned representations transfer well to various downstream tasks, including few-shot classification, shape classification, object part segmentation, as well as real-world scene-based 3D object detection and 3D semantic segmentation. In particular, our model outperforms the second-best pre-training method by over 4% on the difficult few-shot classification task using the real-captured ScanObjectNN dataset. Point-LGMask also achieves gains of 0.4% AP25 and 0.8% AP50 on the 3D object detection task over the second-best method, and gains of 0.4% mAcc and 0.5% mIoU on 3D semantic segmentation. Code has been released at https://github.com/TangYuan96/Point-LGMask
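The two ideas in the abstract, multi-ratio masking and a compound global-plus-local loss, can be illustrated with a minimal sketch. This is not the released implementation: the function names, the cross-entropy form of the global term, the Chamfer distance used for the local term, and the `weight` parameter are all assumptions chosen for illustration. The abstract only states that the global term aligns cluster assignments and the local term rewards accurate prediction of masked points.

```python
import math
import random


def multi_ratio_masks(num_patches, ratios, rng):
    """Return one (visible, masked) index split per masking ratio.

    Each ratio yields a pretext task of different difficulty over the
    same set of point patches (hypothetical helper, not the authors' API).
    """
    splits = []
    for ratio in ratios:
        num_masked = int(round(ratio * num_patches))
        order = list(range(num_patches))
        rng.shuffle(order)
        masked = sorted(order[:num_masked])
        visible = sorted(order[num_masked:])
        splits.append((visible, masked))
    return splits


def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy between two cluster-assignment distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))


def chamfer_distance(pred_points, true_points):
    """Symmetric Chamfer distance with squared L2 (assumed local loss)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def one_way(src, dst):
        return sum(min(sq_dist(s, d) for d in dst) for s in src) / len(src)

    return one_way(pred_points, true_points) + one_way(true_points, pred_points)


def compound_loss(full_assign, masked_assign, pred_points, true_points, weight=1.0):
    """Global consistency term plus weighted local prediction term.

    full_assign:   cluster assignment of the complete input
    masked_assign: cluster assignment predicted from the masked input
    weight:        hypothetical balancing factor between the two terms
    """
    global_term = cross_entropy(full_assign, masked_assign)
    local_term = chamfer_distance(pred_points, true_points)
    return global_term + weight * local_term


if __name__ == "__main__":
    rng = random.Random(0)
    for visible, masked in multi_ratio_masks(8, (0.25, 0.5, 0.75), rng):
        print(len(visible), "visible /", len(masked), "masked")
```

In a training loop following this sketch, each split would be encoded separately and the compound loss summed over all ratios, so the encoder never sees only a single fixed masking ratio.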

Benchmarks

Benchmark	Methodology	Metrics
3d-point-cloud-classification-on-scanobjectnn	Point-LGMask	OBJ-BG (OA): 89.8; OBJ-ONLY (OA): 89.3; Overall Accuracy: 85.3
few-shot-3d-point-cloud-classification-on-1	Point-LGMask	Overall Accuracy: 97.4; Standard Deviation: 2.0
few-shot-3d-point-cloud-classification-on-2	Point-LGMask	Overall Accuracy: 98.1; Standard Deviation: 1.4
few-shot-3d-point-cloud-classification-on-3	Point-LGMask	Overall Accuracy: 92.6; Standard Deviation: 4.3
few-shot-3d-point-cloud-classification-on-4	Point-LGMask	Overall Accuracy: 95.1; Standard Deviation: 3.4
