
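Abstract
This paper focuses on the challenging task of real-time semantic segmentation. Although the task has broad practical applications, its core difficulty is substantially reducing the computation needed for pixel-wise label inference. To address this, we propose an image cascade network (ICNet) that incorporates multi-resolution branches under proper label guidance. We provide an in-depth analysis of the framework and introduce the cascade feature fusion unit to obtain fast, high-quality segmentation. The system runs inference in real time on a single GPU and achieves good evaluation results on challenging datasets such as Cityscapes, CamVid, and COCO-Stuff.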
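The abstract describes combining branches that operate at different resolutions. The toy sketch below illustrates only the general coarse-to-fine idea: a coarse map is upsampled and merged with a finer one, stage by stage. All names here (`downsample2x`, `upsample2x`, `fuse`, `cascade`), the nearest-neighbour upsampling, and the sum-plus-ReLU merge are illustrative assumptions; raw pixel values stand in for learned features, and this is not the ICNet reference implementation or its cascade feature fusion unit.

```python
# Toy coarse-to-fine fusion over three resolutions (full, 1/2, 1/4).
# Assumption: plain 2-D lists of floats stand in for learned feature maps.

def downsample2x(img):
    """Stride-2 subsampling: keep every second row and column."""
    return [row[::2] for row in img[::2]]

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a 2-D grid."""
    out = []
    for row in feat:
        wide = [v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out

def fuse(low, high):
    """Upsample the coarse map, add the finer map, then apply ReLU."""
    up = upsample2x(low)
    return [[max(0.0, a + b) for a, b in zip(r_up, r_hi)]
            for r_up, r_hi in zip(up, high)]

def cascade(image):
    """Fuse quarter -> half -> full resolution, coarse to fine."""
    half = downsample2x(image)
    quarter = downsample2x(half)
    fused_half = fuse(quarter, half)
    return fuse(fused_half, image)

if __name__ == "__main__":
    image = [[1.0] * 4 for _ in range(4)]
    for row in cascade(image):
        print(row)
```

In a real network each branch would apply convolutions of a different depth before fusion, so the cheapest processing happens at full resolution and the heaviest at the coarsest scale; the sketch omits that entirely.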
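Code Repositories
| Repository | Framework | Note |
|---|---|---|
| victorpham1997/Local_ICNet | TensorFlow | Mentioned in GitHub |
| Mind23-2/MindCode-53 | MindSpore | Mentioned in GitHub |
| oandrienko/fast-semantic-segmentation | TensorFlow | Mentioned in GitHub |
| mattangus/fast-semantic-segmentation | TensorFlow | Mentioned in GitHub |
| liminn/icnet-pytorch | PyTorch | Mentioned in GitHub |
| hellochick/ICNet-tensorflow | TensorFlow | Mentioned in GitHub |
| zh320/realtime-semantic-segmentation-pytorch | PyTorch | Mentioned in GitHub |
| yfjcode/ICNet | MindSpore | |
| osmr/imgclsmob | MXNet | Mentioned in GitHub |
| daheyinyin/ICNet | MindSpore | Mentioned in GitHub |
| lisilin013/ICNet-tensorflow-ros | TensorFlow | Mentioned in GitHub |
| GuangyanZhang/SCNN-Deeplabv3-bisenet-icnet | PaddlePaddle | Mentioned in GitHub |
| Mind23-2/MindCode-3/tree/main/ICNet | MindSpore | |
| lyqcom/icnet | MindSpore | Mentioned in GitHub |
| pooruss/ICNet-Paddle2.2.0rc | PaddlePaddle | Mentioned in GitHub |
| Bigpingping97/ICNet | MindSpore | Mentioned in GitHub |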
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| dichotomous-image-segmentation-on-dis-te1 | ICNet | E-measure: 0.784; HCE: 234; MAE: 0.095; S-Measure: 0.716; max F-Measure: 0.631; weighted F-measure: 0.535 |
| dichotomous-image-segmentation-on-dis-te2 | ICNet | E-measure: 0.826; HCE: 512; MAE: 0.095; S-Measure: 0.759; max F-Measure: 0.716; weighted F-measure: 0.627 |
| dichotomous-image-segmentation-on-dis-te3 | ICNet | E-measure: 0.852; HCE: 1001; MAE: 0.091; S-Measure: 0.780; max F-Measure: 0.752; weighted F-measure: 0.664 |
| dichotomous-image-segmentation-on-dis-te4 | ICNet | E-measure: 0.837; HCE: 3690; MAE: 0.099; S-Measure: 0.776; max F-Measure: 0.749; weighted F-measure: 0.663 |
| dichotomous-image-segmentation-on-dis-vd | ICNet | E-measure: 0.811; HCE: 1503; MAE: 0.102; S-Measure: 0.747; max F-Measure: 0.697; weighted F-measure: 0.609 |
| real-time-semantic-segmentation-on-camvid | ICNet | Frame (fps): 27.8; Time (ms): 36; mIoU: 67.1% |
| real-time-semantic-segmentation-on-cityscapes | ICNet | Frame (fps): 30.3; Time (ms): 33; mIoU: 70.6% |
| semantic-segmentation-on-bdd100k-val | ICNet | mIoU: 52.4 (39.5 fps) |
| semantic-segmentation-on-cityscapes | ICNet | Mean IoU (class): 70.6% |
| semantic-segmentation-on-trans10k | ICNet | GFLOPs: 10.64; mIoU: 23.39% |
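
The real-time rows report both frame rate and per-frame time, and several rows report mIoU. As a reading aid, the sketch below shows how the two speed figures relate (fps = 1000 / latency in ms) and how a mean IoU is conventionally computed from a confusion matrix. The function names and the tiny two-class confusion matrix are hypothetical, chosen only for illustration; the benchmark entries above were not produced by this code.

```python
# Hypothetical helpers for reading the speed and mIoU columns.

def fps_from_latency_ms(latency_ms):
    """Frames per second implied by a per-frame latency in milliseconds."""
    return 1000.0 / latency_ms

def mean_iou(conf):
    """Mean intersection-over-union from a square confusion matrix.

    conf[i][j] counts pixels of true class i predicted as class j.
    Classes that never appear (zero denominator) are skipped.
    """
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                       # true c, predicted other
        fp = sum(conf[r][c] for r in range(n)) - tp  # predicted c, true other
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)

if __name__ == "__main__":
    print(round(fps_from_latency_ms(36), 1))  # CamVid row: 36 ms
    print(round(fps_from_latency_ms(33), 1))  # Cityscapes row: 33 ms
    print(mean_iou([[3, 1], [1, 3]]))         # illustrative 2-class matrix
```

The reported 27.8 fps and 30.3 fps are consistent with the listed 36 ms and 33 ms under this conversion.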