OpenScene: 3D Scene Understanding with Open Vocabularies

Songyou Peng, Kyle Genova, Chiyu Max Jiang, Andrea Tagliasacchi, Marc Pollefeys, Thomas Funkhouser

Abstract

Traditional 3D scene understanding approaches rely on labeled 3D datasets to train a model for a single task with supervision. We propose OpenScene, an alternative approach where a model predicts dense features for 3D scene points that are co-embedded with text and image pixels in CLIP feature space. This zero-shot approach enables task-agnostic training and open-vocabulary queries. For example, to perform SOTA zero-shot 3D semantic segmentation it first infers CLIP features for every 3D point and later classifies them based on similarities to embeddings of arbitrary class labels. More interestingly, it enables a suite of open-vocabulary scene understanding applications that have never been done before. For example, it allows a user to enter an arbitrary text query and then see a heat map indicating which parts of a scene match. Our approach is effective at identifying objects, materials, affordances, activities, and room types in complex 3D scenes, all using a single model trained without any labeled 3D data.
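
The classification and heat-map steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`normalize`, `cosine_scores`, `classify_points`) are hypothetical, cosine similarity is assumed as the similarity measure, and the 3-dimensional toy vectors stand in for real CLIP features, which would come from the trained 3D model and a CLIP text encoder.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def cosine_scores(point_features, text_embedding):
    """Cosine similarity of every point feature to one text embedding.

    The resulting list has one score per point and can be rendered as a
    heat map over the scene for an arbitrary text query.
    """
    t = normalize(text_embedding)
    return [sum(a * b for a, b in zip(normalize(p), t)) for p in point_features]

def classify_points(point_features, label_embeddings):
    """Assign each point the label whose embedding is most similar to it."""
    labels = list(label_embeddings)
    per_label = [cosine_scores(point_features, label_embeddings[name]) for name in labels]
    predictions = []
    for i in range(len(point_features)):
        best = max(range(len(labels)), key=lambda k: per_label[k][i])
        predictions.append(labels[best])
    return predictions

# Toy stand-ins (assumption): real features would be high-dimensional CLIP vectors.
point_features = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.9]]
label_embeddings = {
    "chair": [1.0, 0.0, 0.0],
    "table": [0.0, 1.0, 0.0],
    "floor": [0.0, 0.0, 1.0],
}

predicted = classify_points(point_features, label_embeddings)
heat_map = cosine_scores(point_features, label_embeddings["chair"])
```

Because the label set is only a dictionary of text embeddings, changing the vocabulary at query time requires no retraining in this sketch, which mirrors the open-vocabulary property the abstract describes.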

Code Repositories

pengsongyou/openscene (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark: 3d-open-vocabulary-instance-segmentation-on-1
Methodology: OpenScene + Mask3D
Metrics: mAP: 10.9
