3 months ago

MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping

Amirreza Fateh, Mohammad Reza Mohammadi, Mohammad Reza Jahed Motlagh

Abstract

Few-shot Semantic Segmentation addresses the challenge of segmenting objects in query images with only a handful of annotated examples. However, many previous state-of-the-art methods either have to discard intricate local semantic features or suffer from high computational complexity. To address these challenges, we propose a new Few-shot Semantic Segmentation framework based on the Transformer architecture. Our approach introduces a spatial transformer decoder and a contextual mask generation module to improve the relational understanding between support and query images. Moreover, we introduce a multi-scale decoder that refines the segmentation mask by hierarchically incorporating features from different resolutions. Additionally, our approach integrates global features from intermediate encoder stages to improve contextual understanding, while maintaining a lightweight structure to reduce complexity. This balance between performance and efficiency enables our method to achieve competitive results on benchmark datasets such as PASCAL-5^i and COCO-20^i in both 1-shot and 5-shot settings. Notably, our model achieves this with only 1.5 million parameters while overcoming limitations of existing methodologies. Code is available at https://github.com/amirrezafateh/MSDNet

Code Repositories

amirrezafateh/msdnet (official implementation, PyTorch; mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Mean IoU | FB-IoU | Learnable parameters (million) |
|---|---|---|---|---|
| few-shot-semantic-segmentation-on-coco-20i | MSDNet (ResNet-101) | 73.9 | - | - |
| few-shot-semantic-segmentation-on-coco-20i | MSDNet (ResNet-50) | 72.1 | - | - |
| few-shot-semantic-segmentation-on-coco-20i-1 | MSDNet (ResNet-101) | 48.5 | 71.3 | 1.5 |
| few-shot-semantic-segmentation-on-coco-20i-1 | MSDNet (ResNet-50) | 46.5 | 70.4 | 1.5 |
| few-shot-semantic-segmentation-on-coco-20i-2 | MSDNet (ResNet-101) | 76.4 | - | - |
| few-shot-semantic-segmentation-on-coco-20i-2 | MSDNet (ResNet-50) | 74.2 | - | - |
| few-shot-semantic-segmentation-on-coco-20i-5 | MSDNet (ResNet-50) | 54.5 | 74.5 | 1.5 |
| few-shot-semantic-segmentation-on-coco-20i-5 | MSDNet (ResNet-101) | 55.3 | 75.1 | 1.5 |
| few-shot-semantic-segmentation-on-pascal-5i-1 | MSDNet (ResNet-101) | 64.7 | 77.3 | 1.5 |
| few-shot-semantic-segmentation-on-pascal-5i-1 | MSDNet (ResNet-50) | 64.3 | 77.1 | 1.5 |
| few-shot-semantic-segmentation-on-pascal-5i-5 | MSDNet (ResNet-50) | 68.7 | 82.1 | 1.5 |
| few-shot-semantic-segmentation-on-pascal-5i-5 | MSDNet (ResNet-101) | 70.8 | 85.0 | 1.5 |
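The benchmark entries report two metrics, Mean IoU and FB-IoU. The sketch below shows how these are commonly computed in few-shot segmentation evaluation; it is an illustration under stated assumptions, not the evaluation code from the MSDNet repository. It assumes binary masks given as nested lists (1 = target object, 0 = background) and an episode tuple of `(class_id, pred_mask, target_mask)`; the function names are hypothetical.

```python
from collections import defaultdict


def binary_counts(pred, target, label):
    """Return (intersection, union) pixel counts for one label value."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p_hit, t_hit = (p == label), (t == label)
            inter += int(p_hit and t_hit)
            union += int(p_hit or t_hit)
    return inter, union


def mean_iou(episodes):
    """Foreground IoU accumulated per class, then averaged over classes."""
    inter, union = defaultdict(int), defaultdict(int)
    for class_id, pred, target in episodes:
        i, u = binary_counts(pred, target, 1)
        inter[class_id] += i
        union[class_id] += u
    ious = [inter[c] / union[c] for c in union if union[c] > 0]
    return sum(ious) / len(ious)


def fb_iou(episodes):
    """Average of foreground IoU and background IoU, ignoring class identity."""
    totals = {0: [0, 0], 1: [0, 0]}  # label -> [intersection, union]
    for _, pred, target in episodes:
        for label in (0, 1):
            i, u = binary_counts(pred, target, label)
            totals[label][0] += i
            totals[label][1] += u
    ious = [i / u for i, u in totals.values() if u > 0]
    return sum(ious) / len(ious)
```

Because FB-IoU also rewards correctly labeled background pixels, which usually dominate an image, it tends to be higher than Mean IoU for the same predictions, as in the rows above that report both.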
