Maksim Kolodiazhnyi, Anna Vorontsova, Matvey Skripkin, Danila Rukhovich, Anton Konushin

Abstract
Growing customer demand for smart solutions in robotics and augmented reality has attracted considerable attention to 3D object detection from point clouds. Yet existing indoor datasets, taken individually, are too small and insufficiently diverse to train a powerful and general 3D object detection model. Meanwhile, more general approaches that rely on foundation models are still inferior in quality to those based on supervised training for a specific task. In this work, we propose UniDet3D, a simple yet effective 3D object detection model that is trained on a mixture of indoor datasets and is capable of working in various indoor environments. By unifying different label spaces, UniDet3D enables learning a strong representation across multiple datasets through a supervised joint training scheme. The proposed network architecture is built upon a vanilla transformer encoder, making it easy to run, customize, and extend the prediction pipeline for practical use. Extensive experiments demonstrate that UniDet3D obtains significant gains over existing 3D object detection methods on 6 indoor benchmarks: ScanNet (+1.1 mAP50), ARKitScenes (+19.4 mAP25), S3DIS (+9.1 mAP50), MultiScan (+9.3 mAP50), 3RScan (+3.2 mAP50), and ScanNet++ (+2.7 mAP50). Code is available at https://github.com/filapro/unidet3d.
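The abstract does not spell out how the label spaces are unified. As a rough illustration only, the sketch below merges per-dataset class lists by normalized class name and builds a per-dataset map from local class indices to unified ones. The function name `build_unified_label_space`, the toy dataset names, and the name-matching rule are all assumptions made for this example, not the authors' procedure.

```python
from typing import Dict, List, Tuple


def build_unified_label_space(
    dataset_classes: Dict[str, List[str]],
) -> Tuple[List[str], Dict[str, List[int]]]:
    """Merge per-dataset class lists into one label space (hypothetical helper).

    Assumption: two classes are the same category if their names match
    after stripping whitespace and lowercasing.
    """
    # Unified label space: sorted union of all normalized class names.
    unified = sorted(
        {name.strip().lower() for names in dataset_classes.values() for name in names}
    )
    index_of = {name: i for i, name in enumerate(unified)}

    # For each dataset, position k holds the unified index of local class k.
    mappings = {
        dataset: [index_of[name.strip().lower()] for name in names]
        for dataset, names in dataset_classes.items()
    }
    return unified, mappings


# Toy class lists, invented for illustration (not the real benchmark taxonomies).
toy = {
    "dataset_a": ["chair", "table", "sofa"],
    "dataset_b": ["Table", "bed", "chair"],
}
unified, mappings = build_unified_label_space(toy)
print(unified)
print(mappings)
```

With a mapping of this kind, ground-truth labels from every dataset can be remapped to shared indices before joint training, so a single classification head covers all datasets.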
Benchmarks
| Benchmark | Method | mAP@0.25 | mAP@0.5 |
|---|---|---|---|
| 3d-object-detection-on-3rscan | UniDet3D | 64.7 | 48.6 |
| 3d-object-detection-on-arkitscenes | UniDet3D | 61.3 | 47.1 |
| 3d-object-detection-on-multiscan | UniDet3D | 64.2 | 51.6 |
| 3d-object-detection-on-s3dis | UniDet3D | 75.2 | 60.8 |
| 3d-object-detection-on-scannet-1 | UniDet3D | 26.4 | 17.2 |
| 3d-object-detection-on-scannetv2 | UniDet3D | 77.5 | 66.1 |
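The two metrics differ only in the 3D intersection-over-union (IoU) threshold at which a predicted box counts as matching a ground-truth box: 0.25 for mAP@0.25 and 0.5 for mAP@0.5. The sketch below computes IoU for axis-aligned boxes given as `(xmin, ymin, zmin, xmax, ymax, zmax)`. The box format and the function name `iou_3d_axis_aligned` are assumptions for illustration; some benchmarks use oriented boxes, which need a more involved intersection computation.

```python
from typing import Sequence


def iou_3d_axis_aligned(a: Sequence[float], b: Sequence[float]) -> float:
    """IoU of two axis-aligned 3D boxes (xmin, ymin, zmin, xmax, ymax, zmax)."""
    intersection = 1.0
    for axis in range(3):
        low = max(a[axis], b[axis])
        high = min(a[axis + 3], b[axis + 3])
        if high <= low:
            # No overlap along this axis means no overlap at all.
            return 0.0
        intersection *= high - low

    volume_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    volume_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return intersection / (volume_a + volume_b - intersection)


# Two 2x2x2 boxes shifted by 1 along x: intersection 4, union 12, IoU = 1/3.
prediction = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
ground_truth = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)
iou = iou_3d_axis_aligned(prediction, ground_truth)
print(iou >= 0.25)  # counts as a match for mAP@0.25
print(iou >= 0.5)   # does not count as a match for mAP@0.5
```

The same prediction can therefore be a true positive under mAP@0.25 and a false positive under mAP@0.5, which is why the mAP@0.5 column is consistently lower.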