Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation

Yi Zhang, Meng-Hao Guo, Miao Wang, Shi-Min Hu

Abstract

CLIP has demonstrated marked progress in visual recognition due to its powerful pre-training on large-scale image-text pairs. However, a critical challenge remains: how to transfer image-level knowledge to pixel-level understanding tasks such as semantic segmentation. In this paper, to address this challenge, we analyze the gap between the capabilities of the CLIP model and the requirements of the zero-shot semantic segmentation task. Based on our analysis and observations, we propose a novel method for zero-shot semantic segmentation, dubbed CLIP-RC (CLIP with Regional Clues), which brings two main insights. On the one hand, a region-level bridge is necessary to provide fine-grained semantics. On the other hand, overfitting should be mitigated during the training stage. Benefiting from these discoveries, CLIP-RC achieves state-of-the-art performance on several zero-shot semantic segmentation benchmarks, including PASCAL VOC, PASCAL Context, and COCO-Stuff 164K. Code will be available at https://github.com/Jittor/JSeg.
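
The abstract only names the two insights, so a rough illustration may help. Below is a minimal sketch, in generic PyTorch rather than the authors' Jittor-based JSeg code, of what a region-level bridge can look like: CLIP patch embeddings are pooled into coarse regions, broadcast back over their patches, blended with the original patch features, and scored against class text embeddings. The shapes, the average-pooling scheme, and the 50/50 blending weight are illustrative assumptions, not CLIP-RC's actual design.

```python
import torch
import torch.nn.functional as F

def region_level_scores(patch_feats, text_feats, num_regions=4):
    """Score each CLIP patch against class text embeddings with regional context.

    patch_feats: (H, W, D) image patch embeddings, e.g. 14 x 14 x 512 for ViT-B/16.
    text_feats:  (C, D) text embeddings, one per class prompt.
    num_regions: side length of the coarse region grid (an assumed hyperparameter).
    Returns (H, W, C) class logits per patch.
    """
    H, W, D = patch_feats.shape
    grid = patch_feats.permute(2, 0, 1).unsqueeze(0)              # (1, D, H, W)
    # Pool patches into a coarse grid of regions to capture regional context.
    regions = F.adaptive_avg_pool2d(grid, (num_regions, num_regions))
    # Broadcast each region's feature back over the patches it covers.
    regions = F.interpolate(regions, size=(H, W), mode="nearest")
    # Blend patch-level and region-level cues (the 50/50 weight is an assumption).
    feats = 0.5 * grid + 0.5 * regions
    feats = F.normalize(feats.squeeze(0).permute(1, 2, 0), dim=-1)  # (H, W, D)
    text = F.normalize(text_feats, dim=-1)                          # (C, D)
    return feats @ text.t()                                         # (H, W, C)

# Toy usage with random stand-ins for real CLIP features.
patches = torch.randn(14, 14, 512)    # ViT-B/16 patch tokens for a 224x224 image
texts = torch.randn(20, 512)          # e.g. 20 class-name prompts
labels = region_level_scores(patches, texts).argmax(-1)  # (14, 14) label map
```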

Benchmarks

Benchmark                                       Method    Setting        hIoU
zero-shot-semantic-segmentation-on-coco-stuff   CLIP-RC   Inductive      41.2
zero-shot-semantic-segmentation-on-coco-stuff   CLIP-RC   Transductive   49.7
zero-shot-semantic-segmentation-on-pascal-voc   CLIP-RC   Inductive      88.4
zero-shot-semantic-segmentation-on-pascal-voc   CLIP-RC   Transductive   93.0
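
In these results, the inductive setting trains without any access to unseen-class names, while the transductive setting may use unseen-class names (but not their labels) during training. hIoU denotes the harmonic mean of the mIoU over seen and unseen classes; a minimal sketch of the metric, with illustrative numbers that are not from the paper:

```python
def hiou(miou_seen: float, miou_unseen: float) -> float:
    """Harmonic-mean IoU over seen and unseen classes (values in percent)."""
    return 2 * miou_seen * miou_unseen / (miou_seen + miou_unseen)

# Illustrative values only, not CLIP-RC's reported seen/unseen mIoU.
print(round(hiou(80.0, 60.0), 1))  # -> 68.6
```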
