5 months ago

SUM: Saliency Unification through Mamba for Visual Attention Modeling

Alireza Hosseini; Amirhossein Kazerouni; Saeed Akhavan; Michael Brudno; Babak Taati

Abstract

Visual attention modeling, important for interpreting and prioritizing visual stimuli, plays a significant role in applications such as marketing, multimedia, and robotics. Traditional saliency prediction models, especially those based on Convolutional Neural Networks (CNNs) or Transformers, achieve notable success by leveraging large-scale annotated datasets. However, the current state-of-the-art (SOTA) models that use Transformers are computationally expensive. Additionally, separate models are often required for each image type, lacking a unified approach. In this paper, we propose Saliency Unification through Mamba (SUM), a novel approach that integrates the efficient long-range dependency modeling of Mamba with U-Net to provide a unified model for diverse image types. Using a novel Conditional Visual State Space (C-VSS) block, SUM dynamically adapts to various image types, including natural scenes, web pages, and commercial imagery, ensuring universal applicability across different data types. Our comprehensive evaluations across five benchmarks demonstrate that SUM seamlessly adapts to different visual characteristics and consistently outperforms existing models. These results position SUM as a versatile and powerful tool for advancing visual attention modeling, offering a robust solution universally applicable across different types of visual content.

Code Repositories

Arhosseini77/SUM (official implementation, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| saliency-detection-on-cat2000 | SUM | AUC: 0.888; NSS: 2.423 |
| saliency-prediction-on-cat2000 | SUM | KL: 0.27 |
| saliency-prediction-on-mit300 | SUM | AUC-Judd: 0.913; CC: 0.768; KLD: 0.563; NSS: 2.839; SIM: 0.63 |
| saliency-prediction-on-saleci | SUM | KL: 0.473 |
| saliency-prediction-on-salicon | SUM | AUC: 0.876; CC: 0.909; KLD: 0.192; NSS: 1.981; SIM: 0.804 |
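
For readers unfamiliar with the metrics reported above, the following is a minimal NumPy sketch of how CC, SIM, KLD, and NSS are commonly defined for saliency maps. It is an illustration under stated assumptions, not the evaluation code used to produce these benchmark numbers: the function names, the `EPS` constant, and the sum-to-one normalization are choices made here, and individual benchmarks may preprocess maps differently. AUC variants are omitted because they depend on benchmark-specific sampling details.

```python
import numpy as np

EPS = 1e-12  # assumed small constant to avoid division by zero and log(0)


def _as_distribution(saliency_map):
    """Scale a non-negative map so its values sum to 1 (assumed convention)."""
    m = np.asarray(saliency_map, dtype=np.float64)
    return m / (m.sum() + EPS)


def _standardize(saliency_map):
    """Shift and scale a map to zero mean and unit standard deviation."""
    m = np.asarray(saliency_map, dtype=np.float64)
    return (m - m.mean()) / (m.std() + EPS)


def cc(pred, gt):
    """Pearson linear correlation coefficient between two maps (higher is better)."""
    return float((_standardize(pred) * _standardize(gt)).mean())


def sim(pred, gt):
    """Histogram intersection of the two maps as distributions (higher is better)."""
    return float(np.minimum(_as_distribution(pred), _as_distribution(gt)).sum())


def kld(pred, gt):
    """KL divergence of the prediction from the ground truth (lower is better)."""
    p, g = _as_distribution(pred), _as_distribution(gt)
    return float((g * np.log(EPS + g / (p + EPS))).sum())


def nss(pred, fixations):
    """Mean standardized saliency at fixated pixels (higher is better).

    `fixations` is a binary map with nonzero entries at fixation locations.
    """
    mask = np.asarray(fixations, dtype=bool)
    return float(_standardize(pred)[mask].mean())
```

Here `pred` and `gt` are 2-D arrays of the same shape holding non-negative saliency values, and `fixations` is a binary fixation map of that shape. For CC, SIM, and NSS a higher score indicates a better prediction, whereas for KLD (listed as KL for some benchmarks above) a lower score is better.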
