Ping Hu, Fabian Caba Heilbron, Oliver Wang, Zhe Lin, Stan Sclaroff, Federico Perazzi

Abstract
We present TDNet, a temporally distributed network designed for fast and accurate video semantic segmentation. We observe that features extracted from a certain high-level layer of a deep CNN can be approximated by composing features extracted from several shallower sub-networks. Leveraging the inherent temporal continuity in videos, we distribute these sub-networks over sequential frames. As a result, at each time step, we only need to perform a lightweight computation to extract a sub-feature group from a single sub-network. The full features used for segmentation are then recomposed by a novel attention propagation module that compensates for geometric deformation between frames. A grouped knowledge distillation loss is also introduced to further improve the representation power at both the full-feature and sub-feature levels. Experiments on Cityscapes, CamVid, and NYUD-v2 demonstrate that our method achieves state-of-the-art accuracy with significantly faster speed and lower latency.
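The scheduling idea in the abstract (one sub-network per frame, with full features recomposed from the most recent sub-features) can be illustrated with a toy sketch. This is not the authors' implementation: the class `TemporallyDistributedExtractor`, the helper `make_subnet`, and the element-wise averaging used for recomposition are all hypothetical stand-ins. In particular, plain averaging replaces the attention propagation module, so this sketch does not compensate for motion between frames.

```python
from collections import deque


def make_subnet(index):
    """Hypothetical stand-in for one shallow sub-network."""
    def subnet(frame):
        # Toy computation: scale each value. A real sub-network would be a CNN.
        return [(index + 1) * x for x in frame]
    return subnet


class TemporallyDistributedExtractor:
    """Runs one sub-network per frame, cycling through them over time."""

    def __init__(self, num_subnets):
        self.subnets = [make_subnet(i) for i in range(num_subnets)]
        # Keep only the most recent sub-feature from each of the last
        # num_subnets time steps.
        self.buffer = deque(maxlen=num_subnets)
        self.step = 0

    def process(self, frame):
        # Lightweight per-frame work: a single sub-network is evaluated.
        subnet = self.subnets[self.step % len(self.subnets)]
        self.buffer.append(subnet(frame))
        self.step += 1
        # Assumption: element-wise mean as a placeholder for the
        # attention propagation module described in the abstract.
        count = len(self.buffer)
        return [sum(values) / count for values in zip(*self.buffer)]


if __name__ == "__main__":
    extractor = TemporallyDistributedExtractor(num_subnets=2)
    for frame in ([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]):
        print(extractor.process(frame))
```

The per-frame cost in this sketch is that of a single sub-network, while the returned features draw on the last `num_subnets` frames. That is the trade-off the abstract describes, under the simplifying assumption that consecutive frames are similar enough to combine directly.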
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| real-time-semantic-segmentation-on-camvid | TD4-PSP18 | Frame (fps): 25 (Titan X); Time (ms): 40; mIoU: 72.6 |
| real-time-semantic-segmentation-on-camvid | TD2-PSP50 | Frame (fps): 11 (Titan X); Time (ms): 90; mIoU: 76.0 |
| real-time-semantic-segmentation-on-cityscapes | TD4-BISE18 | Frame (fps): 47.6 (Titan X); Time (ms): 21; mIoU: 74.9% |
| real-time-semantic-segmentation-on-nyu-depth-1 | TD2-PSP50 | Speed (ms/f): 35; mIoU: 43.5 |
| real-time-semantic-segmentation-on-nyu-depth-1 | TD4-PSP18 | Speed (ms/f): 19; mIoU: 37.4 |
| semantic-segmentation-on-nyu-depth-v2 | TD2-PSP50 | Mean IoU: 43.5 |
| semantic-segmentation-on-nyu-depth-v2 | TD4-PSP18 | Mean IoU: 37.4 |
| semantic-segmentation-on-urbanlf | TDNet (ResNet-50) | mIoU (Real): 76.48; mIoU (Syn): 74.71 |
| video-semantic-segmentation-on-camvid | TDNet-50 | Mean IoU: 76.2 |
| video-semantic-segmentation-on-cityscapes-val | TDNet-50 [9] | mIoU: 79.9 |