Semantic Segmentation on ADE20K val

Evaluation Metric

mIoU
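
As a rough illustration of the metric, mIoU (mean Intersection over Union) averages the per-class ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The sketch below is a minimal pure-Python version under stated assumptions: the function name `mean_iou`, the flattened pixel lists, and the `ignore_index=255` convention for unlabeled pixels are illustrative choices, not taken from this page or from any specific evaluation toolkit. Benchmark code typically accumulates the counts over the whole validation set before averaging, and leaderboards report the result as a percentage.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in pred or target.

    pred, target: flat sequences of integer class ids, one per pixel.
    ignore_index: label for unlabeled pixels (assumed convention).
    """
    intersection = [0] * num_classes  # true positives per class
    union = [0] * num_classes         # TP + FP + FN per class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward any class
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    # Classes absent from both prediction and ground truth are skipped.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with 3 classes and one ignored pixel.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.1f}")
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18. For ADE20K the average would be taken over its 150 semantic classes.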

Evaluation Results

Performance of each model on this benchmark

| Model | mIoU | Paper Title | Repository |
|---|---|---|---|
| BEiT-3 | 62.8 | Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks | |
| ViT-CoMer | 62.1 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | - |
| EVA | 61.5 | EVA: Exploring the Limits of Masked Visual Representation Learning at Scale | |
| FD-SwinV2-G | 61.4 | Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation | |
| OneFormer (InternImage-H, emb_dim=256, multi-scale, 896x896) | 60.8 | OneFormer: One Transformer to Rule Universal Image Segmentation | |
| MaskDINO-SwinL | 60.8 | Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | |
| ViT-Adapter-L (Mask2Former, BEiT pretrain) | 60.5 | Vision Transformer Adapter for Dense Predictions | |
| SERNet-Former_v2 | 59.35 | SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks | |
| OneFormer (DiNAT-L, multi-scale, 896x896) | 58.6 | OneFormer: One Transformer to Rule Universal Image Segmentation | |
| ViT-Adapter-L (UperNet, BEiT pretrain) | 58.4 | Vision Transformer Adapter for Dense Predictions | |
| RSSeg-ViT-L (BEiT pretrain) | 58.4 | Representation Separation for Semantic Segmentation with Vision Transformers | - |
| OneFormer (DiNAT-L, multi-scale, 640x640) | 58.4 | OneFormer: One Transformer to Rule Universal Image Segmentation | |
| OneFormer (Swin-L, multi-scale, 896x896) | 58.3 | OneFormer: One Transformer to Rule Universal Image Segmentation | |
| SeMask (SeMask Swin-L FaPN-Mask2Former) | 58.2 | SeMask: Semantically Masked Transformers for Semantic Segmentation | |
| SeMask (SeMask Swin-L MSFaPN-Mask2Former) | 58.2 | SeMask: Semantically Masked Transformers for Semantic Segmentation | |
| DiNAT-L (Mask2Former) | 58.1 | Dilated Neighborhood Attention Transformer | |
| Mask2Former (Swin-L-FaPN, multiscale) | 57.7 | Masked-attention Mask Transformer for Universal Image Segmentation | |
| OneFormer (Swin-L, multi-scale, 640x640) | 57.7 | OneFormer: One Transformer to Rule Universal Image Segmentation | |
| SeMask (SeMask Swin-L Mask2Former) | 57.5 | SeMask: Semantically Masked Transformers for Semantic Segmentation | |
| SenFormer (BEiT-L) | 57.1 | Efficient Self-Ensemble for Semantic Segmentation | |