ODIN: A Single Model for 2D and 3D Segmentation

Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki

Abstract

State-of-the-art models on contemporary 3D segmentation benchmarks like ScanNet consume and label dataset-provided 3D point clouds, obtained through post processing of sensed multiview RGB-D images. They are typically trained in-domain, forego large-scale 2D pre-training and outperform alternatives that featurize the posed RGB-D multiview images instead. The gap in performance between methods that consume posed images versus post-processed 3D point clouds has fueled the belief that 2D and 3D perception require distinct model architectures. In this paper, we challenge this view and propose ODIN (Omni-Dimensional INstance segmentation), a model that can segment and label both 2D RGB images and 3D point clouds, using a transformer architecture that alternates between 2D within-view and 3D cross-view information fusion. Our model differentiates 2D and 3D feature operations through the positional encodings of the tokens involved, which capture pixel coordinates for 2D patch tokens and 3D coordinates for 3D feature tokens. ODIN achieves state-of-the-art performance on ScanNet200, Matterport3D and AI2THOR 3D instance segmentation benchmarks, and competitive performance on ScanNet, S3DIS and COCO. It outperforms all previous works by a wide margin when the sensed 3D point cloud is used in place of the point cloud sampled from 3D mesh. When used as the 3D perception engine in an instructable embodied agent architecture, it sets a new state-of-the-art on the TEACh action-from-dialogue benchmark. Our code and checkpoints can be found at the project website (https://odin-seg.github.io).
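The abstract says that 2D patch tokens carry pixel coordinates while 3D feature tokens carry 3D coordinates. The sketch below illustrates one common way to obtain both kinds of coordinate from a posed RGB-D pixel, using a standard pinhole camera model. It is not taken from the ODIN code base: the function names, the `(fx, fy, cx, cy)` intrinsics tuple, and the 4x4 camera-to-world pose layout are assumptions made for illustration only.

```python
from typing import Sequence, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def unproject_pixel(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Lift pixel (u, v) with metric depth into camera-frame 3D coordinates
    under a pinhole model (hypothetical helper, not the ODIN API)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def camera_to_world(point: Vec3, pose: Sequence[Sequence[float]]) -> Vec3:
    """Apply a 4x4 camera-to-world matrix (rotation plus translation) to a point."""
    x, y, z = point
    wx, wy, wz = (
        pose[r][0] * x + pose[r][1] * y + pose[r][2] * z + pose[r][3]
        for r in range(3)
    )
    return (wx, wy, wz)


def token_coordinates(u: float, v: float, depth: float,
                      intrinsics: Tuple[float, float, float, float],
                      pose: Sequence[Sequence[float]]) -> Tuple[Vec2, Vec3]:
    """Return the two coordinates a token could be encoded with:
    its pixel position (2D within-view) and its world position (3D cross-view)."""
    fx, fy, cx, cy = intrinsics
    coord_2d = (float(u), float(v))
    coord_3d = camera_to_world(unproject_pixel(u, v, depth, fx, fy, cx, cy), pose)
    return coord_2d, coord_3d
```

In a design like the one described, the 2D coordinate would feed the positional encoding for within-view fusion, and the 3D coordinate would feed the encoding for cross-view fusion. For a single RGB image without depth or pose, only the 2D coordinate is available.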

Code Repositories

ayushjain1144/odin (Official, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| 3d-instance-segmentation-on-scannet200 | ODIN | mAP: 31.5; mAP@25: 53.1; mAP@50: 45.3 |
| 3d-instance-segmentation-on-scannetv2 | ODIN | mAP: 50.0; mAP@50: 71.0; mAP@25: 83.6 |
| 3d-semantic-segmentation-on-scannet200 | ODIN | test mIoU: 36.8; val mIoU: 40.5 |
| semantic-segmentation-on-scannet | ODIN | test mIoU: 74.4; val mIoU: 77.8 |
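The mIoU figures reported above are mean intersection-over-union scores. As a minimal sketch of the standard definition, the snippet below computes mIoU from per-point predicted and ground-truth class labels. The function name `mean_iou` and the `ignore_label` convention are assumptions for illustration. Official benchmark evaluation scripts may differ in details such as which classes are evaluated and how unlabeled points are treated.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], gt: Sequence[int],
             num_classes: int, ignore_label: int = -1) -> float:
    """Mean IoU over classes that appear in the prediction or ground truth.

    Points whose ground-truth label equals `ignore_label` are skipped
    (an assumed convention, not taken from any specific benchmark).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, g in zip(pred, gt):
        if g == ignore_label:
            continue
        if p == g:
            # Correct point: counts once toward both intersection and union.
            intersection[g] += 1
            union[g] += 1
        else:
            # Wrong point: enlarges the union of the true class and,
            # if it is a valid class, of the predicted class as well.
            union[g] += 1
            if 0 <= p < num_classes:
                union[p] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

For example, with predictions `[0, 0, 1, 1]` and ground truth `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Benchmark tables usually report this value multiplied by 100.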
