Generative Prompt Model for Weakly Supervised Object Localization

Yuzhong Zhao; Qixiang Ye; Weijia Wu; Chunhua Shen; Fang Wan

Abstract

Weakly supervised object localization (WSOL) remains challenging when learning object localization models from image category labels. Conventional methods that discriminatively train activation models ignore representative yet less discriminative object parts. In this study, we propose a generative prompt model (GenPromp), defining the first generative pipeline to localize less discriminative object parts by formulating WSOL as a conditional image denoising procedure. During training, GenPromp converts image category labels to learnable prompt embeddings, which are fed to a generative model to conditionally recover the input image from its noised version and learn representative embeddings. During inference, GenPromp combines the representative embeddings with discriminative embeddings (queried from an off-the-shelf vision-language model) for both representative and discriminative capacity. The combined embeddings are finally used to generate multi-scale high-quality attention maps, which facilitate localizing full object extent. Experiments on CUB-200-2011 and ILSVRC show that GenPromp outperforms the best discriminative models by 5.2% and 5.6% (Top-1 Loc), respectively, setting a solid baseline for WSOL with the generative model. Code is available at https://github.com/callsys/GenPromp.
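The inference steps described in the abstract (combining two embeddings, merging multi-scale attention maps, and localizing the object) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names `combine_embeddings`, `average_maps`, and `map_to_box` are hypothetical, the convex weighting of the two embeddings is one plausible way to combine them, and thresholding the attention map at a fraction of its peak is a common WSOL post-processing step, not something the abstract specifies.

```python
from typing import List, Optional, Tuple

Vector = List[float]
Grid = List[List[float]]


def combine_embeddings(representative: Vector, discriminative: Vector,
                       weight: float = 0.5) -> Vector:
    """Convex combination of two prompt embeddings (assumed weighting scheme)."""
    if len(representative) != len(discriminative):
        raise ValueError("embeddings must have the same dimension")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * r + (1.0 - weight) * d
            for r, d in zip(representative, discriminative)]


def average_maps(maps: List[Grid]) -> Grid:
    """Average attention maps that were already resized to one resolution."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[y][x] for m in maps) / len(maps) for x in range(cols)]
            for y in range(rows)]


def map_to_box(attention: Grid,
               threshold: float = 0.5) -> Optional[Tuple[int, int, int, int]]:
    """Tight box (x_min, y_min, x_max, y_max) around cells >= threshold * peak."""
    peak = max(max(row) for row in attention)
    if peak <= 0:
        return None  # no activation anywhere, so no box can be drawn
    cutoff = threshold * peak
    hits = [(x, y)
            for y, row in enumerate(attention)
            for x, value in enumerate(row)
            if value >= cutoff]
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return min(xs), min(ys), max(xs), max(ys)
```

In a real pipeline the embeddings would come from the trained prompt tokens and a vision-language text encoder, and the attention maps from the generative model's cross-attention layers; the sketch only shows how the pieces fit together once those values exist.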

Code Repositories

callsys/genpromp (official implementation, PyTorch)

Benchmarks

| Benchmark | Methodology | GT-known localization accuracy | Top-1 localization accuracy |
| --- | --- | --- | --- |
| weakly-supervised-object-localization-on-2 | Stable diffusion | 75.0 | 65.2 |
| weakly-supervised-object-localization-on-cub | GenPromp | not listed | 87.0 |
| weakly-supervised-object-localization-on-cub-2 | Stable diffusion | 98.0 | 87.0 |
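
The two metrics reported above can be computed as in the following sketch. It assumes the conventions commonly used in WSOL evaluation: a predicted box counts as correct when its intersection over union (IoU) with the ground-truth box is at least 0.5, GT-known localization accuracy checks only the box, and Top-1 localization accuracy additionally requires the top-1 class prediction to be correct. The function names and the simplification of one predicted box and one ground-truth box per image are assumptions for illustration, not taken from the paper's evaluation code.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def localization_accuracy(pred_boxes: Sequence[Box],
                          gt_boxes: Sequence[Box],
                          pred_labels: Sequence[int],
                          gt_labels: Sequence[int],
                          iou_threshold: float = 0.5) -> Tuple[float, float]:
    """Return (top1_loc, gt_known_loc) as percentages over all images."""
    total = len(gt_boxes)
    # Box is correct when it overlaps the ground truth enough (assumed threshold).
    box_ok = [iou(p, g) >= iou_threshold for p, g in zip(pred_boxes, gt_boxes)]
    # Classification is correct when the top-1 label matches the ground truth.
    cls_ok = [p == g for p, g in zip(pred_labels, gt_labels)]
    gt_known = 100.0 * sum(box_ok) / total
    top1 = 100.0 * sum(b and c for b, c in zip(box_ok, cls_ok)) / total
    return top1, gt_known
```

Because Top-1 localization requires both a correct class and a correct box, it can never exceed GT-known localization on the same set of predictions, which matches the ordering of the numbers listed above.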
