3 months ago

SegViT: Semantic Segmentation with Plain Vision Transformers

Bowen Zhang Zhi Tian Quan Tang Xiangxiang Chu Xiaolin Wei Chunhua Shen Yifan Liu

Abstract

We explore the capability of plain Vision Transformers (ViTs) for semantic segmentation and propose SegViT. Previous ViT-based segmentation networks usually learn a pixel-level representation from the output of the ViT. In contrast, we use the ViT's fundamental component, the attention mechanism, to generate masks for semantic segmentation. Specifically, we propose the Attention-to-Mask (ATM) module, in which the similarity maps between a set of learnable class tokens and the spatial feature maps are transferred to the segmentation masks. Experiments show that SegViT with the ATM module outperforms its counterparts using the plain ViT backbone on the ADE20K dataset and achieves new state-of-the-art performance on the COCO-Stuff-10K and PASCAL-Context datasets. Furthermore, to reduce the computational cost of the ViT backbone, we propose query-based down-sampling (QD) and query-based up-sampling (QU) to build a Shrunk structure. With the proposed Shrunk structure, the model can save up to 40% of the computation while maintaining competitive performance.
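
The Attention-to-Mask idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scaled dot-product similarity, the sigmoid, the per-pixel argmax, and all function names (`similarity_maps`, `to_masks`, `predict_labels`) are assumptions made for illustration, and the class tokens here are fixed vectors rather than learned parameters.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def similarity_maps(class_tokens, patch_features):
    """Scaled dot-product similarity between each class token and each patch.

    class_tokens:   N vectors of length C (learnable in a real model)
    patch_features: L vectors of length C (the flattened H*W feature map)
    returns:        N x L nested list of similarity scores
    """
    scale = 1.0 / math.sqrt(len(class_tokens[0]))  # assumed scaling factor
    return [
        [scale * sum(q * k for q, k in zip(token, patch)) for patch in patch_features]
        for token in class_tokens
    ]


def to_masks(similarity, height, width):
    """Turn each row of similarities into an H x W mask with values in (0, 1)."""
    masks = []
    for row in similarity:
        probs = [sigmoid(s) for s in row]
        masks.append([probs[r * width:(r + 1) * width] for r in range(height)])
    return masks


def predict_labels(masks):
    """Assign each pixel the class whose mask value is largest (a simplification)."""
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [max(range(len(masks)), key=lambda c: masks[c][r][col]) for col in range(width)]
        for r in range(height)
    ]


# Toy input: 2 class tokens, a 2 x 2 feature map, feature dimension 2.
class_tokens = [[1.0, 0.0], [0.0, 1.0]]
patch_features = [[2.0, 0.0], [0.0, 2.0], [3.0, 1.0], [1.0, 3.0]]

masks = to_masks(similarity_maps(class_tokens, patch_features), height=2, width=2)
labels = predict_labels(masks)
print(labels)
```

In this toy input, each patch is assigned to the class token it aligns with most strongly. A real model would obtain the patch features from the ViT backbone and train the class tokens end to end.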

Code Repositories

zbwxp/SegVit (official, PyTorch)

Benchmarks

Benchmark                                   Methodology         Metrics
semantic-segmentation-on-ade20k-val         SegViT ViT-Large    mIoU: 55.2
semantic-segmentation-on-coco-stuff-test    SegViT (ours)       mIoU: 50.3
semantic-segmentation-on-pascal-context     SegViT (ours)       mIoU: 65.3
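
For readers unfamiliar with the metric, mean intersection-over-union (mIoU) can be sketched as below. This is a simplified per-sample version under stated assumptions: the function name `mean_iou` is hypothetical, labels are given as flat lists of class indices, and classes absent from both prediction and ground truth are skipped. Benchmark evaluations typically accumulate intersections and unions over the whole dataset and may ignore certain labels, which this sketch does not do.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes present in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        # Pixels where both agree on class c.
        intersection = sum(p == c and t == c for p, t in zip(pred, target))
        # Pixels where either one says class c.
        union = sum(p == c or t == c for p, t in zip(pred, target))
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
score = mean_iou(pred, target, num_classes=2)
print(f"mIoU: {100 * score:.1f}")
```

Here class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. The values in the listing above are on the same 0 to 100 scale used by the final `print`.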
