Wang Xudong; Li Shufan; Kallidromitis Konstantinos; Kato Yusuke; Kozuka Kazuki; Darrell Trevor

Abstract
Open-vocabulary image segmentation aims to partition an image into semantic regions according to arbitrary text descriptions. However, complex visual scenes can be naturally decomposed into simpler parts and abstracted at multiple levels of granularity, introducing inherent segmentation ambiguity. Unlike existing methods that typically sidestep this ambiguity and treat it as an external factor, our approach actively incorporates a hierarchical representation encompassing different semantic-levels into the learning process. We propose a decoupled text-image fusion mechanism and representation learning modules for both "things" and "stuff". Additionally, we systematically examine the differences that exist in the textual and visual features between these types of categories. Our resulting model, named HIPIE, tackles HIerarchical, oPen-vocabulary, and unIvErsal segmentation tasks within a unified framework. Benchmarked on over 40 datasets, e.g., ADE20K, COCO, Pascal-VOC Part, RefCOCO/RefCOCOg, ODinW and SeginW, HIPIE achieves the state-of-the-art results at various levels of image comprehension, including semantic-level (e.g., semantic segmentation), instance-level (e.g., panoptic/referring segmentation and object detection), as well as part-level (e.g., part/subpart segmentation) tasks. Our code is released at https://github.com/berkeley-hipie/HIPIE.
Benchmarks
| Benchmark | Model | Metrics |
|---|---|---|
| image-segmentation-on-pascal-panoptic-parts | HIPIE (ResNet-50) | mIoUPartS: 57.2 |
| image-segmentation-on-pascal-panoptic-parts | HIPIE (ViT-H) | mIoUPartS: 63.8 |
| panoptic-segmentation-on-coco-minival | HIPIE (ViT-H, single-scale) | PQ: 58.1, mIoU: 66.8 |
| referring-expression-segmentation-on-refcoco | HIPIE | Overall IoU: 82.8 |
| referring-expression-segmentation-on-refcoco-3 | HIPIE | Overall IoU: 73.9 |
| zero-shot-segmentation-on-segmentation-in-the | HIPIE | Mean AP: 41.6 |
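
The mIoU and Overall IoU columns above are both intersection-over-union scores, but they aggregate differently. The sketch below illustrates the usual definitions on toy data. It is not taken from the HIPIE code base: the function names `mean_iou` and `overall_iou`, the flat-list mask format, and the per-image aggregation are assumptions made for illustration. Benchmark toolkits normally accumulate intersections and unions over a whole dataset before averaging.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes that occur in pred or target.

    pred and target are flat lists of integer class labels, one per pixel
    (a hypothetical format chosen to keep the example dependency-free).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            continue  # class absent from both maps; leave it out of the mean
        ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def overall_iou(pairs):
    """Cumulative intersection divided by cumulative union over all samples.

    pairs is a list of (pred_mask, target_mask) tuples of flat 0/1 lists.
    Large masks contribute more than small ones, unlike a per-sample mean.
    """
    inter = 0
    union = 0
    for pred_mask, target_mask in pairs:
        inter += sum(1 for p, t in zip(pred_mask, target_mask) if p and t)
        union += sum(1 for p, t in zip(pred_mask, target_mask) if p or t)
    return inter / union if union else 0.0


# Toy 2x3 label map, flattened row by row.
pred = [0, 0, 1, 0, 1, 1]
target = [0, 1, 1, 0, 1, 1]
print(mean_iou(pred, target, num_classes=2))  # (2/3 + 3/4) / 2

# Two toy referring-segmentation samples with binary masks.
samples = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),  # intersection 1, union 2
    ([0, 1, 1, 1], [0, 1, 1, 1]),  # intersection 3, union 3
]
print(overall_iou(samples))  # (1 + 3) / (2 + 3) = 0.8
```

In the second example the per-sample IoUs are 0.5 and 1.0, so a plain average would give 0.75, while the cumulative form gives 0.8. That difference is why the two numbers should not be compared across metrics.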