
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

Abstract

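We present SegFormer, a simple, efficient, and powerful semantic segmentation framework that combines Transformers with a lightweight multilayer perceptron (MLP) decoder. SegFormer has two notable advantages: 1) It uses a novel hierarchically structured Transformer encoder that outputs multi-scale feature representations. This design needs no positional encoding, which avoids the performance drop caused by interpolating positional codes when the test resolution differs from the training resolution. 2) It avoids complex decoder structures. The proposed MLP decoder aggregates features from different levels, combining local and global attention to produce strongly expressive representations. We show that this simple, lightweight design is the key to efficient semantic segmentation with Transformers. We further scale the approach into a series of models, from SegFormer-B0 to SegFormer-B5, whose accuracy and efficiency are significantly better than those of previous methods. For example, SegFormer-B4 reaches 50.3% mIoU (mean intersection over union) on ADE20K with only 64M parameters, making it 5x smaller than the previous best method while improving accuracy by 2.2 percentage points. Our best model, SegFormer-B5, achieves 84.0% mIoU on the Cityscapes validation set and shows excellent zero-shot robustness on Cityscapes-C. Code will be released at: github.com/NVlabs/SegFormer.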
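The decoder idea in the abstract (bring multi-scale encoder features to a common shape, then fuse them) can be illustrated with a small pure-Python sketch. It is not the authors' implementation. The four steps (per-scale linear projection, upsampling to the finest scale, concatenation plus a fusing linear layer, and a per-pixel classifier) and the feature strides of 4, 8, 16, and 32 follow the published SegFormer paper rather than the abstract text itself. The channel widths, the random weights, the nearest-neighbour upsampling (the paper uses bilinear interpolation), and the omission of normalization and activation layers are simplifying assumptions made here. All function names are hypothetical.

```python
import random


def linear(feat, weight):
    """Per-pixel linear layer. feat: H x W x C_in nested lists, weight: C_in x C_out."""
    c_out = len(weight[0])
    return [[[sum(px[i] * weight[i][o] for i in range(len(px))) for o in range(c_out)]
             for px in row] for row in feat]


def upsample_nearest(feat, out_h, out_w):
    """Nearest-neighbour resize of an H x W x C map (simplification of bilinear)."""
    h, w = len(feat), len(feat[0])
    return [[feat[y * h // out_h][x * w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def random_weight(c_in, c_out, rng):
    """Stand-in for learned weights; real models would train these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(c_out)] for _ in range(c_in)]


def mlp_decode(features, embed_dim, num_classes, rng):
    """Fuse multi-scale features (finest first) into per-pixel class logits."""
    out_h, out_w = len(features[0]), len(features[0][0])
    unified = []
    for feat in features:
        c_in = len(feat[0][0])
        projected = linear(feat, random_weight(c_in, embed_dim, rng))  # 1) unify channels
        unified.append(upsample_nearest(projected, out_h, out_w))      # 2) upsample to finest scale
    # 3) concatenate along the channel axis, then fuse with one linear layer
    concat = [[sum((u[y][x] for u in unified), []) for x in range(out_w)]
              for y in range(out_h)]
    fused = linear(concat, random_weight(embed_dim * len(features), embed_dim, rng))
    # 4) per-pixel classifier
    return linear(fused, random_weight(embed_dim, num_classes, rng))


def make_feature(h, w, c, rng):
    """Random H x W x C feature map standing in for one encoder stage output."""
    return [[[rng.random() for _ in range(c)] for _ in range(w)] for _ in range(h)]


rng = random.Random(0)
# Hypothetical 64x64 input; encoder stages at strides 4, 8, 16, 32 with toy channel widths.
features = [make_feature(64 // s, 64 // s, c, rng)
            for s, c in zip((4, 8, 16, 32), (8, 16, 32, 64))]
logits = mlp_decode(features, embed_dim=16, num_classes=3, rng=rng)
print(len(logits), len(logits[0]), len(logits[0][0]))  # 16 16 3
```

The logits come out at 1/4 of the input resolution. A full pipeline would upsample them to the input size before taking the per-pixel argmax.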

Benchmarks

| Benchmark | Method | Metrics |
|---|---|---|
| crack-segmentation-on-crackvision12k | SegFormer | mIoU: 0.57969 |
| semantic-segmentation-on-ade20k | SegFormer-B5 | Params (M): 84.7; Validation mIoU: 51.8 |
| semantic-segmentation-on-ade20k | SegFormer-B4 | Params (M): 64.1; Validation mIoU: 51.1 |
| semantic-segmentation-on-ade20k | SegFormer-B0 | Params (M): 3.8; Validation mIoU: 37.4 |
| semantic-segmentation-on-ade20k-val | SegFormer-B5 (MS, 87M #Params, ImageNet-1K pretrain) | mIoU: 51.8 |
| semantic-segmentation-on-cityscapes | SegFormer (MiT-B5, Mapillary) | Mean IoU (class): 83.1% |
| semantic-segmentation-on-cityscapes-val | SegFormer (MiT-B5, Mapillary) | mIoU: 84.0 |
| semantic-segmentation-on-cityscapes-val | SegFormer-B0 | Validation mIoU: 76.2 |
| semantic-segmentation-on-coco-stuff-full | SegFormer-B5 (Single Scale) | Mean IoU (class): 46.7 |
| semantic-segmentation-on-dada-seg | SegFormer (MiT-B3) | mIoU: 27.0 |
| semantic-segmentation-on-dada-seg | SegFormer (MiT-B1) | mIoU: 16.6 |
| semantic-segmentation-on-dada-seg | SegFormer (MiT-B2) | mIoU: 21.2 |
| semantic-segmentation-on-ddd17 | SegFormer-B2 | mIoU: 71.05 |
| semantic-segmentation-on-deliver-1 | SegFormer | mIoU: 57.20 |
| semantic-segmentation-on-densepass | SegFormer (MiT-B1) | mIoU: 38.5% |
| semantic-segmentation-on-densepass | SegFormer (MiT-B2) | mIoU: 42.4% |
| semantic-segmentation-on-dsec | SegFormer-B2 | mIoU: 71.99 |
| semantic-segmentation-on-eventscape | SegFormer-B2 | mIoU: 58.69 |
| semantic-segmentation-on-eventscape | SegFormer-B4 | mIoU: 59.86 |
| semantic-segmentation-on-potsdam | SegFormer-B1 | mIoU: 84.37 |
| semantic-segmentation-on-potsdam | SegFormer-B0 | mIoU: 83.67 |
| semantic-segmentation-on-potsdam | SegFormer-B2 | mIoU: 84.65 |
| semantic-segmentation-on-selma | SegFormer | mIoU: 77.2 |
| semantic-segmentation-on-spectralwaste | SegFormer (HYPER3) | mIoU: 53.5 |
| semantic-segmentation-on-spectralwaste | SegFormer (HYPER) | mIoU: 54.3 |
| semantic-segmentation-on-spectralwaste | SegFormer (RGB) | mIoU: 48.4 |
| semantic-segmentation-on-synpass | SegFormer | mIoU: 37.24% |
| semantic-segmentation-on-synthetic-bathing | SegFormer | mIoU: 86.86 |
| semantic-segmentation-on-uplight | SegFormer-B2 (RGB) | mIoU: 89.60 |
| semantic-segmentation-on-urbanlf | SegFormer | mIoU (Real): 82.20; mIoU (Syn): 78.53 |
| semantic-segmentation-on-us3d | SegFormer-B0 | mIoU: 71.80 |
| semantic-segmentation-on-us3d | SegFormer-B2 | mIoU: 75.14 |
| semantic-segmentation-on-us3d | SegFormer-B1 | mIoU: 74.19 |
| semantic-segmentation-on-vaihingen | SegFormer-B0 | mIoU: 75.57 |
| semantic-segmentation-on-vaihingen | SegFormer-B2 | mIoU: 76.69 |
| semantic-segmentation-on-vaihingen | SegFormer-B1 | mIoU: 76.92 |
| semantic-segmentation-on-zju-rgb-p | SegFormer-B2 (RGB) | mIoU: 89.6 |
| thermal-image-segmentation-on-mfn-dataset | SegFormer (B2) | mIoU: 53.2 |
| thermal-image-segmentation-on-mfn-dataset | SegFormer (B4) | mIoU: 54.8 |
| thermal-image-segmentation-on-rgb-t-glass | SegFormer | MAE: 0.053 |
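
Most entries above report mIoU, the per-class intersection over union averaged across classes. The listed values mix scales: the CrackVision12K entry is a fraction (0.57969), while most others are percentages. The following is a minimal sketch of the metric for a single prediction/label pair, written in plain Python. The function name, the `ignore_index` parameter, and the choice to skip classes absent from both maps are assumptions for illustration. Individual benchmarks may differ in how they handle ignored labels, and they typically accumulate counts over the whole dataset rather than per image.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU over classes for two 2-D label maps given as nested lists.

    Classes that appear in neither map are skipped (assumed convention).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if t == ignore_index:
                continue  # pixel excluded from evaluation
            if p == t:
                intersection[t] += 1
                union[t] += 1
            else:
                # A mismatched pixel enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious)


pred = [[0, 0],
        [1, 1]]
target = [[0, 1],
          [1, 1]]
score = mean_iou(pred, target, num_classes=2)
print(round(score, 4))  # 0.5833 (class 0: 1/2, class 1: 2/3)
```

Multiplying the result by 100 gives the percentage form used by most rows in the table.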
