WeakTr: Exploring Plain Vision Transformer for Weakly-supervised Semantic Segmentation

Lianghui Zhu, Yingyue Li, Jiemin Fang, Yan Liu, Hao Xin, Wenyu Liu, Xinggang Wang

Abstract

This paper explores the properties of the plain Vision Transformer (ViT) for Weakly-supervised Semantic Segmentation (WSSS). The class activation map (CAM) is of critical importance for understanding a classification network and launching WSSS. We observe that different attention heads of ViT focus on different image areas. Thus a novel weight-based method is proposed to end-to-end estimate the importance of attention heads, while the self-attention maps are adaptively fused for high-quality CAM results that tend to have more complete objects. Besides, we propose a ViT-based gradient clipping decoder for online retraining with the CAM results to complete the WSSS task. We name this plain Transformer-based Weakly-supervised learning framework WeakTr. It achieves the state-of-the-art WSSS performance on standard benchmarks, i.e., 78.4% mIoU on the val set of PASCAL VOC 2012 and 50.3% mIoU on the val set of COCO 2014. Code is available at https://github.com/hustvl/WeakTr.
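The head-weighting idea in the abstract can be illustrated with a small sketch. This is not the WeakTr implementation: the function names, the softmax normalization, and the fixed scores below are all assumptions made for illustration, whereas the abstract states that head importance is estimated end-to-end during training. The sketch only shows how per-head attention maps might be combined into one map once some importance score per head is available.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_attention_heads(head_maps, head_scores):
    """Weighted sum of per-head attention maps (hypothetical helper).

    head_maps:   list of H maps, each a list of N floats (one value per patch).
    head_scores: list of H raw importance scores. They are fixed inputs here;
                 a real model would learn them.
    """
    if len(head_maps) != len(head_scores):
        raise ValueError("need exactly one score per head")
    weights = softmax(head_scores)  # assumption: weights normalized to sum to 1
    num_patches = len(head_maps[0])
    fused = [0.0] * num_patches
    for weight, head_map in zip(weights, head_maps):
        for i in range(num_patches):
            fused[i] += weight * head_map[i]
    return fused


# Two toy heads over four patches, each attending to a different image area.
heads = [
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.4, 0.6],
]
fused = fuse_attention_heads(heads, [0.0, 0.0])  # equal scores give a plain average
```

With equal scores the fused map covers both regions, which mirrors the abstract's observation that different heads focus on different areas; unequal scores would shift the map toward the heads judged more important.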

Code Repositories

hustvl/weaktr (official, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| weakly-supervised-semantic-segmentation-on | WeakTr (DeiT-S, multi-stage) | Mean IoU: 74.0 |
| weakly-supervised-semantic-segmentation-on | WeakTr (ViT-S, multi-stage) | Mean IoU: 78.4 |
| weakly-supervised-semantic-segmentation-on-1 | WeakTr (DeiT-S, multi-stage) | Mean IoU: 74.1 |
| weakly-supervised-semantic-segmentation-on-1 | WeakTr (ViT-S, multi-stage) | Mean IoU: 79.0 |
| weakly-supervised-semantic-segmentation-on-14 | WeakTr (DeiT-S, single-stage) | Mean IoU: 76.5 |
| weakly-supervised-semantic-segmentation-on-4 | WeakTr (ViT-S, multi-stage) | mIoU: 50.3 |
| weakly-supervised-semantic-segmentation-on-4 | WeakTr (DeiT-S, multi-stage) | mIoU: 46.9 |
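For readers unfamiliar with the metric reported above, mean IoU averages the per-class intersection-over-union between predicted and ground-truth label maps. The following is a minimal sketch, assuming flat lists of integer class indices and no "ignore" label; the function name `mean_iou` and the toy data are illustrative and are not taken from the WeakTr code or any benchmark toolkit. Benchmark evaluations typically accumulate counts over the whole dataset rather than per image.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes that appear in pred or target.

    pred, target: equal-length flat lists of class indices, one per pixel.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel counts once toward both intersection and union of its class.
            intersection[p] += 1
            union[p] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 4 pixels, 2 classes.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
score = mean_iou(pred, target, num_classes=2)  # IoU(0) = 1/2, IoU(1) = 2/3
```

In this toy case the score is the average of 1/2 and 2/3, that is 7/12; the percentages in the table correspond to this quantity multiplied by 100.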
