
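Abstract
We present a novel, real-time semantic segmentation network in which the encoder both encodes the input and generates the parameters (weights) of the decoder. Furthermore, to allow maximal adaptivity, the weights of each decoder block vary spatially. For this purpose, we design a new type of hypernetwork composed of three parts: a nested U-Net for drawing higher-level context features; a multi-headed weight-generating module that generates the weights of each decoder block immediately before they are consumed, for efficient memory utilization; and a primary network built from novel dynamic patch-wise convolutions. Despite the use of less conventional blocks, our architecture obtains real-time performance. In terms of the runtime vs. accuracy trade-off, we surpass state-of-the-art (SotA) results on popular semantic segmentation benchmarks: PASCAL VOC 2012 (val. set), and real-time semantic segmentation on Cityscapes and CamVid. The code is available at https://nirkin.com/hyperseg.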
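
The idea of spatially varying decoder weights can be illustrated with a small NumPy sketch. It is not the authors' implementation (the official code is in PyTorch): the function names `generate_weights` and `patchwise_pointwise_conv`, the linear projection used as a stand-in weight generator, and the restriction to 1x1 (pointwise) kernels are all assumptions made for illustration. The sketch splits a feature map into a grid of patches and applies a different weight matrix to each patch, with the weights produced from per-patch context features.

```python
import numpy as np


def generate_weights(context, proj, c_out, c_in):
    """Hypothetical weight-generator head (an assumption, not the paper's module).

    context: (Gh, Gw, D) context features, one vector per grid cell.
    proj:    (D, c_out * c_in) linear projection.
    Returns per-cell pointwise kernels of shape (Gh, Gw, c_out, c_in).
    """
    gh, gw, _ = context.shape
    return (context @ proj).reshape(gh, gw, c_out, c_in)


def patchwise_pointwise_conv(x, weights):
    """Apply a different 1x1 convolution to each patch of the input.

    x:       (C_in, H, W) feature map.
    weights: (Gh, Gw, C_out, C_in), one kernel per grid cell.
    Returns a (C_out, H, W) feature map.
    """
    c_in, h, w = x.shape
    gh, gw, c_out, w_c_in = weights.shape
    if w_c_in != c_in:
        raise ValueError("weights and input disagree on C_in")
    if h % gh or w % gw:
        raise ValueError("H and W must be divisible by the grid size")
    ph, pw = h // gh, w // gw
    # View the map as a Gh x Gw grid of (ph x pw) patches.
    patches = x.reshape(c_in, gh, ph, gw, pw)
    # For grid cell (g, k), multiply every pixel by that cell's matrix.
    out = np.einsum("gkoc,cgpkq->ogpkq", weights, patches)
    return out.reshape(c_out, h, w)


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4, 6))        # C_in=3, H=4, W=6
context = rng.standard_normal((2, 3, 5))  # 2x3 grid, D=5
proj = rng.standard_normal((5, 2 * 3))    # maps D -> C_out * C_in
w = generate_weights(context, proj, c_out=2, c_in=3)
y = patchwise_pointwise_conv(x, w)
print(y.shape)  # (2, 4, 6)
```

In this sketch the weights exist only for the block that is about to use them, which mirrors the memory argument in the abstract: a block's weights are generated right before they are consumed, so they never need to be stored for all blocks at once.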
Code Repositories
YuvalNirkin/hyperseg
Official
pytorch
Mentioned in GitHub
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| dichotomous-image-segmentation-on-dis-te1 | HySM | E-measure: 0.803; HCE: 205; MAE: 0.082; S-Measure: 0.761; max F-Measure: 0.695; weighted F-measure: 0.597 |
| dichotomous-image-segmentation-on-dis-te2 | HySM | E-measure: 0.832; HCE: 451; MAE: 0.085; S-Measure: 0.794; max F-Measure: 0.759; weighted F-measure: 0.667 |
| dichotomous-image-segmentation-on-dis-te3 | HySM | E-measure: 0.857; HCE: 887; MAE: 0.079; S-Measure: 0.811; max F-Measure: 0.792; weighted F-measure: 0.701 |
| dichotomous-image-segmentation-on-dis-te4 | HySM | E-measure: 0.842; HCE: 3331; MAE: 0.091; S-Measure: 0.802; max F-Measure: 0.782; weighted F-measure: 0.693 |
| dichotomous-image-segmentation-on-dis-vd | HySM | E-measure: 0.814; HCE: 1324; MAE: 0.096; S-Measure: 0.773; max F-Measure: 0.734; weighted F-measure: 0.640 |
| real-time-semantic-segmentation-on-camvid | HyperSeg-S | Frame (fps): 38.0; Time (ms): 26.3; mIoU: 78.4 |
| real-time-semantic-segmentation-on-camvid | HyperSeg-L | Frame (fps): 16.6; Time (ms): 60.2; mIoU: 79.1 |
| real-time-semantic-segmentation-on-cityscapes | HyperSeg-M | Frame (fps): 36.9; Time (ms): 27.1; mIoU: 75.8% |
| semantic-segmentation-on-pascal-voc-2012-val | HyperSeg-L | mIoU: 80.61% |
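
The frame rate and time columns in the real-time rows appear to be two views of the same measurement. Assuming the reported time is the per-image latency and equals the reciprocal of the frame rate (an assumption, since the table does not state how either was measured), a short check reproduces the listed values. The helper name `latency_ms` is hypothetical.

```python
def latency_ms(fps: float) -> float:
    """Per-image latency in milliseconds for a given frame rate."""
    return 1000.0 / fps


# Frame rates copied from the real-time rows of the table above.
reported = [
    ("HyperSeg-S, CamVid", 38.0),
    ("HyperSeg-L, CamVid", 16.6),
    ("HyperSeg-M, Cityscapes", 36.9),
]

for name, fps in reported:
    print(f"{name}: {latency_ms(fps):.1f} ms")
```

Rounded to one decimal place, the results are 26.3, 60.2 and 27.1 ms, matching the Time (ms) entries.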