5 months ago

PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning

Zhu Xiangyang ; Zhang Renrui ; He Bowei ; Guo Ziyu ; Zeng Ziyao ; Qin Zipeng ; Zhang Shanghang ; Gao Peng

Abstract

Large-scale pre-trained models have shown promising open-world performance for both vision and language tasks. However, their transfer capacity on 3D point clouds is still limited and confined to the classification task. In this paper, we first combine CLIP and GPT into a unified 3D open-world learner, named PointCLIP V2, which fully unleashes their potential for zero-shot 3D classification, segmentation, and detection. To better align 3D data with the pre-trained language knowledge, PointCLIP V2 contains two key designs. For the visual end, we prompt CLIP via a shape projection module to generate more realistic depth maps, narrowing the domain gap between projected point clouds and natural images. For the textual end, we prompt the GPT model to generate 3D-specific text as the input of CLIP's textual encoder. Without any training in 3D domains, our approach significantly surpasses PointCLIP by +42.90%, +40.44%, and +28.75% accuracy on three datasets for zero-shot 3D classification. On top of that, V2 can be extended to few-shot 3D classification, zero-shot 3D part segmentation, and 3D object detection in a simple manner, demonstrating our generalization ability for unified 3D open-world learning.
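The zero-shot classification step described in the abstract can be illustrated with a minimal sketch. It assumes that visual features for several projected depth-map views and one text feature per class have already been extracted by CLIP's encoders; the function names, the equal-weight averaging over views, and the logit scale of 100 are illustrative assumptions, not the authors' implementation.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_logits(view_feature, text_features):
    # Cosine similarity between one view feature and every class text feature.
    v = l2_normalize(view_feature)
    return [sum(a * b for a, b in zip(v, l2_normalize(t))) for t in text_features]

def softmax(logits):
    # Numerically stable softmax.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_classify(view_features, text_features, class_names, scale=100.0):
    # Hypothetical helper: average per-view similarities with equal weights
    # (an assumption), scale them, and pick the most probable class.
    summed = [0.0] * len(class_names)
    for feat in view_features:
        for i, logit in enumerate(cosine_logits(feat, text_features)):
            summed[i] += logit
    mean_logits = [scale * s / len(view_features) for s in summed]
    probs = softmax(mean_logits)
    best = max(range(len(class_names)), key=lambda i: probs[i])
    return class_names[best], probs

# Toy features standing in for CLIP encoder outputs (made-up numbers).
class_names = ["chair", "airplane"]
text_features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
view_features = [[0.9, 0.1, 0.0], [0.8, 0.3, 0.1]]
label, probs = zero_shot_classify(view_features, text_features, class_names)
```

In this toy setup both views lie closer to the "chair" text feature, so that class receives the highest probability. No 3D training is involved: the prediction depends only on the similarity between pre-computed image and text features.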

Code Repositories

zrrskywalker/pointclip (PyTorch; mentioned in GitHub)
yangyangyang127/pointclip_v2 (official; PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
3d-open-vocabulary-instance-segmentation-on-3 | PointCLIPV2 | AP50: 03.1
training-free-3d-part-segmentation-on | PointCLIP V2 | Need 3D Data?: No; mIoU: 48.4
training-free-3d-point-cloud-classification | PointCLIP V2 | Accuracy (%): 64.2; Need 3D Data?: No
training-free-3d-point-cloud-classification-1 | PointCLIP V2 | Accuracy (%): 35.4; Need 3D Data?: No
zero-shot-transfer-3d-point-cloud | PointCLIP V2 | Accuracy (%): 64.22
zero-shot-transfer-3d-point-cloud-1 | PointCLIP V2 | Accuracy (%): 73.13
zero-shot-transfer-3d-point-cloud-2 | PointCLIP V2 | OBJ_BG Accuracy (%): 41.22; OBJ_ONLY Accuracy (%): 50.09; PB_T50_RS Accuracy (%): 35.36
