5 months ago

Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach

Hossain Mir Rayat Imtiaz ; Siam Mennatullah ; Sigal Leonid ; Little James J.

Abstract

The emergence of attention-based transformer models has led to their extensive use in various tasks, due to their superior generalization and transfer properties. Recent research has demonstrated that such models, when prompted appropriately, are excellent for few-shot inference. However, such techniques are under-explored for dense prediction tasks like semantic segmentation. In this work, we examine the effectiveness of prompting a transformer-decoder with learned visual prompts for the generalized few-shot segmentation (GFSS) task. Our goal is to achieve strong performance not only on novel categories with limited examples, but also to retain performance on base categories. We propose an approach to learn visual prompts with limited examples. These learned visual prompts are used to prompt a multiscale transformer decoder to facilitate accurate dense predictions. Additionally, we introduce a unidirectional causal attention mechanism between the novel prompts, learned with limited examples, and the base prompts, learned with abundant data. This mechanism enriches the novel prompts without deteriorating the base class performance. Overall, this form of prompting helps us achieve state-of-the-art performance for GFSS on two different benchmark datasets: COCO-$20^i$ and Pascal-$5^i$, without the need for test-time optimization (or transduction). Furthermore, test-time optimization leveraging unlabelled test data can be used to improve the prompts, which we refer to as transductive prompt tuning.
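
The unidirectional attention between novel and base prompts can be illustrated with a small sketch. This is not the authors' implementation: it assumes plain scaled dot-product attention with a residual update, omits the learned query/key/value projections a real model would have, and uses hypothetical names (`enrich_novel_prompts`, `novel`, `base`). The only property it is meant to show is the one-way flow of information: novel prompts read from base prompts, while base prompts are left untouched.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def enrich_novel_prompts(novel, base):
    """One-way cross-attention (illustrative assumption, not the paper's code).

    novel: list of d-dimensional prompt vectors learned from few examples
    base:  list of d-dimensional prompt vectors learned from abundant data
    Returns (updated_novel, base). Base prompts act only as keys/values,
    so they are returned unchanged.
    """
    d = len(base[0])
    updated = []
    for q in novel:
        # Attention weights of this novel prompt over all base prompts
        weights = softmax([dot(q, k) / math.sqrt(d) for k in base])
        # Weighted sum of base prompts
        context = [sum(w * k[i] for w, k in zip(weights, base)) for i in range(d)]
        # Residual update: original novel prompt plus base-derived context
        updated.append([qi + ci for qi, ci in zip(q, context)])
    return updated, base


# Toy usage with hypothetical 2-dimensional prompts
base_prompts = [[1.0, 0.0], [0.0, 1.0]]
novel_prompts = [[0.5, 0.5]]
updated_novel, base_after = enrich_novel_prompts(novel_prompts, base_prompts)
```

Because the base prompts never appear as queries, nothing learned from the few-shot examples can flow back into them, which is one plausible reading of how base-class performance is preserved.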

Benchmarks

Benchmark | Methodology | Metrics
generalized-few-shot-semantic-segmentation-on | VisualPromptGFSS | Mean Base and Novel: 58.11
generalized-few-shot-semantic-segmentation-on-1 | VisualPromptGFSS | Mean Base and Novel: 66.27
generalized-few-shot-semantic-segmentation-on-2 | VisualPromptGFSS | Mean Base and Novel: 36.05
generalized-few-shot-semantic-segmentation-on-3 | VisualPromptGFSS | Mean Base and Novel: 42.48
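
The "Mean Base and Novel" metric can be sketched as follows, assuming it is the arithmetic mean of base-class mIoU and novel-class mIoU, as is common in GFSS evaluation. The function name and the input values are hypothetical and are not taken from the paper or from this page.

```python
def mean_base_novel(base_miou, novel_miou):
    """Arithmetic mean of base-class and novel-class mIoU (assumed definition)."""
    return (base_miou + novel_miou) / 2.0


# Hypothetical inputs for illustration only, not reported results
example_score = mean_base_novel(70.0, 30.0)
```

Under this definition the two class groups are weighted equally regardless of how many classes each contains, so a method cannot score well by favoring base classes alone.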
