Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes
Hong Yuanduo ; Pan Huihui ; Sun Weichao ; Jia Yisong

Abstract
Semantic segmentation is a key technology for autonomous vehicles to understand the surrounding scenes. The appealing performance of contemporary models usually comes at the expense of heavy computation and lengthy inference time, which is intolerable for self-driving. Using light-weight architectures (encoder-decoder or two-pathway) or reasoning on low-resolution images, recent methods realize very fast scene parsing, even running at more than 100 FPS on a single 1080Ti GPU. However, there is still a significant gap in performance between these real-time methods and the models based on dilation backbones. To tackle this problem, we propose a family of efficient backbones specially designed for real-time semantic segmentation. The proposed deep dual-resolution networks (DDRNets) are composed of two deep branches between which multiple bilateral fusions are performed. Additionally, we design a new contextual information extractor named Deep Aggregation Pyramid Pooling Module (DAPPM) to enlarge effective receptive fields and fuse multi-scale context based on low-resolution feature maps. Our method achieves a new state-of-the-art trade-off between accuracy and speed on both the Cityscapes and CamVid datasets. In particular, on a single 2080Ti GPU, DDRNet-23-slim yields 77.4% mIoU at 102 FPS on the Cityscapes test set and 74.7% mIoU at 230 FPS on the CamVid test set. With widely used test augmentation, our method is superior to most state-of-the-art models and requires much less computation. Code and trained models are available online.
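
The accuracy figures in the abstract are mean intersection-over-union (mIoU) scores. As a reading aid, here is a minimal sketch of how that metric is commonly computed. It is not the authors' evaluation code. The function name `mean_iou`, the flat-list inputs, and the default ignore label of 255 are assumptions made for illustration, and predictions are assumed to lie in `[0, num_classes)`.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over classes, from flat sequences of integer labels.

    Hypothetical helper for illustration: per class,
    IoU = TP / (TP + FP + FN), averaged over the classes that occur
    in either the predictions or the ground truth.
    """
    inter = [0] * num_classes  # true positives per class
    union = [0] * num_classes  # TP + FP + FN per class
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # unlabeled pixels do not count (assumed convention)
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[t] += 1  # false negative for the true class
            union[p] += 1  # false positive for the predicted class
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes; the last pixel is ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
print(f"mIoU = {mean_iou(pred, target, num_classes=3):.4f}")
```

Benchmark scores are typically accumulated over every pixel of the whole evaluation set before the per-class ratio is taken, which is what passing all pixels to a single call would emulate here.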
Benchmarks
| Benchmark | Model | mIoU (%) | FPS | Time (ms) |
|---|---|---|---|---|
| all-day-semantic-segmentation-on-all-day | DDR-Net | 68.6 | - | - |
| real-time-semantic-segmentation-on-camvid | DDRNet-23-slim | 74.7 | 230 (2080Ti) | 4.3 |
| real-time-semantic-segmentation-on-camvid | DDRNet-23 (Cityscapes-Pretrained) | 80.6 | 94 (2080Ti) | 10.6 |
| real-time-semantic-segmentation-on-cityscapes | DDRNet-23-slim | 77.4 | 101.6 (2080Ti) | 9.8 |
| real-time-semantic-segmentation-on-cityscapes-1 | DDRNet23-slim | 77.4 | 101.6 | - |
| real-time-semantic-segmentation-on-cityscapes-1 | DDRNet23 | 79.4 | 37.1 | - |
| semantic-segmentation-on-camvid | DDRNet23 | 80.6 | - | - |
| semantic-segmentation-on-cityscapes | DDRNet-39 1.5x | 82.4 (class) | - | - |
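
The frame rates and per-frame times in the table are two views of the same measurement: latency in milliseconds is 1000 divided by FPS. The short sketch below checks the three rows that report both values. The helper name `latency_ms` is hypothetical, and the numbers are copied from the table rather than measured.

```python
def latency_ms(fps: float) -> float:
    """Average per-frame latency in milliseconds for a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# (model, dataset, reported FPS) for the rows that also list a time in ms
reported = [
    ("DDRNet-23-slim", "CamVid", 230.0),
    ("DDRNet-23", "CamVid", 94.0),
    ("DDRNet-23-slim", "Cityscapes", 101.6),
]

for model, dataset, fps in reported:
    print(f"{model} on {dataset}: {latency_ms(fps):.1f} ms per frame")
```

Rounded to one decimal place, these agree with the 4.3 ms, 10.6 ms, and 9.8 ms entries listed in the table.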