Lorenzo Porzi; Samuel Rota Bulò; Aleksander Colovic; Peter Kontschieder

Abstract
In this work we introduce a novel, CNN-based architecture that can be trained end-to-end to deliver seamless scene segmentation results. Our goal is to predict consistent semantic segmentation and detection results by means of a panoptic output format, going beyond the simple combination of independently trained segmentation and detection models. The proposed architecture takes advantage of a novel segmentation head that seamlessly integrates multi-scale features generated by a Feature Pyramid Network with contextual information conveyed by a light-weight DeepLab-like module. As an additional contribution, we review the panoptic metric and propose an alternative that overcomes its limitations when evaluating non-instance categories. Our proposed network architecture yields state-of-the-art results on three challenging street-level datasets: Cityscapes, Indian Driving Dataset, and Mapillary Vistas.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| panoptic-segmentation-on-indian-driving-1 | Seamless | PQ: 48.5 |
| panoptic-segmentation-on-kitti-panoptic-1 | Seamless | PQ: 42.2 |
| semantic-segmentation-on-densepass | Seamless (Mapillary) | mIoU: 34.14% |
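
For readers unfamiliar with the PQ figures above, here is a minimal sketch of the standard panoptic quality computation for a single class, as commonly defined in the panoptic segmentation literature. It is not the alternative metric proposed in the abstract, and it is not taken from the authors' code. The helper names `iou` and `pq_for_class`, and the representation of segments as sets of pixel indices, are assumptions made for illustration. Segments within a prediction or a ground truth are assumed not to overlap, so a match with IoU above 0.5 is unique.

```python
from typing import List, Set


def iou(a: Set[int], b: Set[int]) -> float:
    """Intersection over union of two segments given as sets of pixel indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def pq_for_class(pred_segments: List[Set[int]], gt_segments: List[Set[int]]) -> float:
    """Standard PQ for one class (illustrative sketch, hypothetical helper).

    A predicted and a ground-truth segment match when their IoU exceeds 0.5.
    PQ = (sum of matched IoUs) / (TP + 0.5 * FP + 0.5 * FN).
    """
    matched_ious: List[float] = []
    used_pred: Set[int] = set()

    for gt in gt_segments:
        for idx, pred in enumerate(pred_segments):
            if idx in used_pred:
                continue
            value = iou(pred, gt)
            if value > 0.5:
                matched_ious.append(value)
                used_pred.add(idx)
                break  # at most one match per ground-truth segment

    tp = len(matched_ious)
    fp = len(pred_segments) - tp  # predictions with no matching ground truth
    fn = len(gt_segments) - tp    # ground-truth segments that were missed
    denom = tp + 0.5 * fp + 0.5 * fn
    return sum(matched_ious) / denom if denom else 0.0


# Toy example: one good match, one spurious prediction, one missed segment.
gt = [{0, 1, 2, 3}, {10, 11}]
pred = [{0, 1, 2}, {20, 21}]
score = pq_for_class(pred, gt)
```

The overall PQ reported by benchmarks is typically the average of this per-class value over all classes.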