PointCLIP: Point Cloud Understanding by CLIP

Zhang Renrui ; Guo Ziyu ; Zhang Wei ; Li Kunchang ; Miao Xupeng ; Cui Bin ; Qiao Yu ; Gao Peng ; Li Hongsheng

Abstract

Recently, zero-shot and few-shot learning via Contrastive Vision-Language Pre-training (CLIP) have shown inspirational performance on 2D visual recognition, which learns to match images with their corresponding texts in open-vocabulary settings. However, it remains underexplored whether CLIP, pre-trained on large-scale image-text pairs in 2D, can be generalized to 3D recognition. In this paper, we show that such a setting is feasible by proposing PointCLIP, which conducts alignment between CLIP-encoded point clouds and 3D category texts. Specifically, we encode a point cloud by projecting it into multi-view depth maps without rendering, and aggregate the view-wise zero-shot predictions to achieve knowledge transfer from 2D to 3D. On top of that, we design an inter-view adapter to better extract the global feature and adaptively fuse the few-shot knowledge learned from 3D into CLIP pre-trained in 2D. By fine-tuning only the lightweight adapter in few-shot settings, the performance of PointCLIP can be largely improved. In addition, we observe a complementary property between PointCLIP and classical 3D-supervised networks. By simple ensembling, PointCLIP boosts the baseline's performance and even surpasses state-of-the-art models. Therefore, PointCLIP is a promising alternative for effective 3D point cloud understanding via CLIP under low resource cost and data regime. We conduct thorough experiments on the widely adopted ModelNet10, ModelNet40 and the challenging ScanObjectNN to demonstrate the effectiveness of PointCLIP. The code is released at https://github.com/ZrrSkywalker/PointCLIP.
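
The zero-shot pipeline described in the abstract (project the point cloud into several depth maps, score each view against the category texts, then aggregate the view-wise predictions) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration and is not taken from the released code: the orthographic projection, the 8x8 grid, the rotation about the y-axis, the weighted-sum aggregation, and the function names `rotate_y`, `depth_map`, `cosine_logits` and `aggregate_views` are all hypothetical. The CLIP encoders are not included; the feature vectors below are hand-written placeholders standing in for encoded views and encoded category texts.

```python
import math

def rotate_y(points, angle):
    """Rotate (x, y, z) points about the y-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]

def depth_map(points, size=8):
    """Orthographic projection onto a size x size grid (a simplification).

    The viewer looks along +z, so a smaller z means a nearer point. Each
    pixel keeps the largest nearness in [0, 1]; empty pixels stay 0.0.
    """
    grid = [[0.0] * size for _ in range(size)]
    for x, y, z in points:
        col = min(size - 1, max(0, int((x + 1) / 2 * size)))
        row = min(size - 1, max(0, int((1 - y) / 2 * size)))
        nearness = max(0.0, min(1.0, (1 - z) / 2))
        grid[row][col] = max(grid[row][col], nearness)
    return grid

def cosine_logits(view_feature, text_features):
    """Cosine similarity between one view feature and each text feature."""
    def norm(v):
        return math.sqrt(sum(a * a for a in v)) or 1.0
    return [
        sum(a * b for a, b in zip(view_feature, t)) / (norm(view_feature) * norm(t))
        for t in text_features
    ]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_views(view_logits, view_weights):
    """Fuse per-view logits with a weighted sum (assumed), then softmax."""
    n_classes = len(view_logits[0])
    fused = [
        sum(w * logits[k] for w, logits in zip(view_weights, view_logits))
        for k in range(n_classes)
    ]
    return softmax(fused)

# Toy point cloud inside [-1, 1]^3, projected from three assumed viewpoints.
points = [(0.0, 0.0, -0.5), (0.5, 0.5, 0.5), (-0.5, 0.5, 0.0)]
views = [depth_map(rotate_y(points, a)) for a in (0.0, math.pi / 2, math.pi)]

# Placeholder features: a real pipeline would encode each depth map and each
# category prompt with CLIP's visual and text encoders instead.
text_features = [[1.0, 0.0], [0.0, 1.0]]               # two hypothetical classes
view_features = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]   # one vector per view

view_logits = [cosine_logits(f, text_features) for f in view_features]
probs = aggregate_views(view_logits, view_weights=[1.0, 1.0, 1.0])
predicted_class = probs.index(max(probs))
```

The per-view weights are left as an explicit argument so that more informative viewpoints can be emphasised; equal weights reduce the aggregation to a plain sum of view-wise logits.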

Code Repositories

zrrskywalker/pointclip (Official; PyTorch; mentioned in GitHub)
pku-dair/hetu (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
3d-open-vocabulary-instance-segmentation-on-3 | PointCLIP | AP50: 02.6
training-free-3d-part-segmentation-on | PointCLIP | Need 3D Data?: No; mIoU: 31.0
training-free-3d-point-cloud-classification | PointCLIP | Accuracy (%): 20.2; Need 3D Data?: No
training-free-3d-point-cloud-classification-1 | PointCLIP | Accuracy (%): 15.4; Need 3D Data?: No
zero-shot-transfer-3d-point-cloud | PointCLIP | Accuracy (%): 20.18
zero-shot-transfer-3d-point-cloud-1 | PointCLIP | Accuracy (%): 30.23
zero-shot-transfer-3d-point-cloud-2 | PointCLIP | OBJ_BG Accuracy (%): 21.34; OBJ_ONLY Accuracy (%): 19.28; PB_T50_RS Accuracy (%): 15.38
