3 months ago

CrossMoCo: Multi-modal Momentum Contrastive Learning for Point Cloud

Nizar Bouguila, Zachary Patterson, Sneha Paul

Abstract

A point cloud is a form of 3D geometric data that lacks a specific structure and is permutation-invariant. Applications of point clouds have recently gained significant attention in vision tasks. However, most existing works on point clouds rely on supervised learning with large labelled datasets, which are costly and laborious to collect. To this end, unsupervised learning, for example self-supervised learning, has shown promising performance in various 2D computer vision tasks and holds potential for 3D computer vision applications. In this study, we introduce a novel self-supervised method called CrossMoCo, which learns representations of unlabelled point cloud data in a multi-modal setup that also utilizes 2D rendered images of the point clouds. CrossMoCo outperforms existing methods for multi-modal self-supervised learning on point clouds by introducing two new concepts: momentum contrastive learning with more negative samples and multiple-view intra-modal contrastive learning. The first component learns from an online encoder and a momentum encoder with a large number of negative samples, which provides consistent learning signals. The second component enforces consistency between different views of samples from the same modality, thereby improving the multi-modal representation. We conduct extensive studies on two popular benchmark datasets (ModelNet40 and ScanObjectNN) for linear classification and few-shot learning tasks. Our results demonstrate that CrossMoCo achieves superior performance over existing methods for both tasks on both datasets, achieving up to 4.36% improvement on linear classification and up to 9.2% on few-shot tasks. Our code is available at https://github.com/snehaputul/CrossMoCo.
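
The first component described above, an online encoder paired with a momentum encoder and contrasted against many negatives, can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked above). The function names, the momentum coefficient `m=0.999`, and the temperature `0.07` are assumptions chosen for illustration, and the loss shown is the generic InfoNCE form commonly used in momentum contrastive learning, which the abstract does not spell out.

```python
import math


def momentum_update(online_params, momentum_params, m=0.999):
    """Exponential moving average: momentum <- m * momentum + (1 - m) * online.

    Both arguments are flat lists of floats standing in for encoder weights.
    The value of m is an assumed default, not taken from the abstract.
    """
    return [m * pk + (1.0 - m) * pq
            for pq, pk in zip(online_params, momentum_params)]


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive_key, negative_keys, temperature=0.07):
    """Generic InfoNCE loss for one query.

    query:          embedding from the online encoder
    positive_key:   embedding of the matching sample from the momentum encoder
    negative_keys:  embeddings of other samples (e.g. drawn from a queue)
    """
    q = l2_normalize(query)
    # Index 0 holds the positive logit; the rest are negatives.
    logits = [dot(q, l2_normalize(positive_key)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negative_keys]
    # Numerically stable log-sum-exp over all logits.
    mx = max(logits)
    log_denominator = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return log_denominator - logits[0]
```

In this sketch, only the online parameters would receive gradients; the momentum parameters change solely through `momentum_update`, so the keys they produce drift slowly. That slow drift is the usual motivation for a momentum encoder when a large pool of negatives is reused across training steps.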

Benchmarks

Benchmark	Methodology	Metrics
3d-object-classification-on-modelnet40	CrossMoCo	Classification Accuracy: 91.49
3d-point-cloud-classification-on-modelnet40	CrossMoCo	Overall Accuracy: 91.49
3d-point-cloud-linear-classification-on	CrossMoCo	Overall Accuracy: 91.49
3d-point-cloud-linear-classification-on-1	CrossMoCo	Overall Accuracy: 86.06
few-shot-3d-point-cloud-classification-on-1	CrossMoCo	Overall Accuracy: 93.8 Standard Deviation: 4.5
few-shot-3d-point-cloud-classification-on-2	CrossMoCo	Overall Accuracy: 96.8 Standard Deviation: 1.7
few-shot-3d-point-cloud-classification-on-3	CrossMoCo	Overall Accuracy: 88.7 Standard Deviation: 3.9
few-shot-3d-point-cloud-classification-on-4	CrossMoCo	Overall Accuracy: 91.0 Standard Deviation: 3.4
few-shot-3d-point-cloud-classification-on-6	CrossMoCo	Overall Accuracy: 69.6
few-shot-3d-point-cloud-classification-on-7	CrossMoCo	Overall Accuracy: 78.1
few-shot-3d-point-cloud-classification-on-8	CrossMoCo	Overall Accuracy: 84.0
few-shot-3d-point-cloud-classification-on-9	CrossMoCo	Overall Accuracy: 87.6
