Siamese Network for RGB-D Salient Object Detection and Beyond

Keren Fu; Deng-Ping Fan; Ge-Peng Ji; Qijun Zhao; Jianbing Shen; Ce Zhu

Abstract

Existing RGB-D salient object detection (SOD) models usually treat RGB and depth as independent information and design separate networks to extract features from each. Such schemes are easily constrained by a limited amount of training data or by over-reliance on an elaborately designed training process. Inspired by the observation that the RGB and depth modalities share certain commonality in distinguishing salient objects, a novel joint learning and densely cooperative fusion (JL-DCF) architecture is designed to learn from both RGB and depth inputs through a shared network backbone, known as the Siamese architecture. In this paper, we propose two effective components: joint learning (JL) and densely cooperative fusion (DCF). The JL module provides robust saliency feature learning by exploiting cross-modal commonality via a Siamese network, while the DCF module is introduced for complementary feature discovery. Comprehensive experiments using five popular metrics show that the designed framework yields a robust RGB-D saliency detector with good generalization. As a result, JL-DCF outperforms state-of-the-art models by an average of ~2.0% (max F-measure) across seven challenging datasets. In addition, we show that JL-DCF is readily applicable to other related multi-modal detection tasks, including RGB-T (thermal infrared) SOD and video SOD, where it achieves comparable or even better performance than state-of-the-art methods. We also link JL-DCF to the RGB-D semantic segmentation field, showing that it can outperform several semantic segmentation models on the task of RGB-D SOD. These facts further confirm that the proposed framework could offer a potential solution for various applications and provide more insight into the cross-modal complementarity task.
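
The shared-backbone idea described in the abstract can be illustrated with a minimal sketch, written in plain Python so that it runs without any dependencies. It is not the authors' implementation: `SharedBackbone`, `joint_forward`, and `fuse` are hypothetical names, the one-layer "backbone" merely stands in for a real CNN, stacking RGB and depth along the batch dimension is one assumed way of sharing weights, and the sum-plus-product fusion is only a placeholder for the DCF module.

```python
import random


class SharedBackbone:
    """Toy one-layer feature extractor (hypothetical stand-in for a CNN)."""

    def __init__(self, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        # A single weight matrix; both modalities will be processed with it.
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
            for _ in range(out_dim)
        ]

    def __call__(self, batch):
        # ReLU(W x) for every sample in the batch.
        return [
            [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.weights]
            for x in batch
        ]


def joint_forward(backbone, rgb_batch, depth_batch):
    """Run RGB and depth through the SAME weights (Siamese-style).

    Assumption: depth samples already have the same shape as RGB samples
    (for example, a single depth channel replicated to match RGB).
    """
    # Stack the two modalities along the batch dimension.
    joint_batch = rgb_batch + depth_batch
    features = backbone(joint_batch)
    n = len(rgb_batch)
    # Split the result back into per-modality features.
    return features[:n], features[n:]


def fuse(rgb_feats, depth_feats):
    """Placeholder fusion (assumption): element-wise sum plus product."""
    return [
        [r + d + r * d for r, d in zip(rf, df)]
        for rf, df in zip(rgb_feats, depth_feats)
    ]


# Usage: two RGB samples and two depth samples, each a 3-value vector.
backbone = SharedBackbone(in_dim=3, out_dim=4)
rgb = [[0.2, 0.5, 0.1], [0.9, 0.3, 0.4]]
depth = [[0.7, 0.7, 0.7], [0.1, 0.1, 0.1]]
rgb_feats, depth_feats = joint_forward(backbone, rgb, depth)
fused = fuse(rgb_feats, depth_feats)
```

Because a single weight matrix serves both inputs, any update to it would be driven by RGB and depth samples alike, which is the sense in which a Siamese design exploits cross-modal commonality.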

Code Repositories

kerenfu/JLDCF (official; PyTorch; mentioned in GitHub)
taozh2017/RGBD-SODsurvey (mentioned in GitHub)

Benchmarks

Benchmark                                Methodology  Average MAE  S-Measure  max E-Measure  max F-Measure
rgb-d-salient-object-detection-on-des    JL-DCF*      0.021        93.6       97.5           92.9
rgb-d-salient-object-detection-on-nju2k  JL-DCF*      0.040        91.1       94.8           91.3
rgb-d-salient-object-detection-on-nlpr   JL-DCF*      0.023        92.6       96.4           91.7
rgb-d-salient-object-detection-on-sip    JL-DCF*      0.046        89.2       94.9           90.0
rgb-d-salient-object-detection-on-stere  JL-DCF*      0.039        91.1       94.9           90.7
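
For readers unfamiliar with the MAE and max F-measure columns, the sketch below shows how these two metrics are commonly computed in the SOD literature. The page itself does not define them, so the details are assumptions: predictions are taken to be saliency maps with values in [0, 1], ground truth is a binary mask, the F-measure uses the conventional weighting of beta squared = 0.3, and the maximum is taken over 256 evenly spaced thresholds. The function names `mae` and `max_f_measure` are illustrative and do not come from the official repository.

```python
def _flatten(image):
    """Turn a 2-D list of pixel values into a flat list."""
    return [value for row in image for value in row]


def mae(pred, gt):
    """Mean absolute error between a saliency map and a binary mask."""
    p, g = _flatten(pred), _flatten(gt)
    return sum(abs(a - b) for a, b in zip(p, g)) / len(p)


def max_f_measure(pred, gt, beta_sq=0.3, steps=256):
    """Largest F-measure obtained over a sweep of binarization thresholds."""
    p, g = _flatten(pred), _flatten(gt)
    positives = sum(1 for b in g if b == 1)
    best = 0.0
    for i in range(steps):
        threshold = i / (steps - 1)
        predicted = sum(1 for a in p if a >= threshold)
        true_pos = sum(1 for a, b in zip(p, g) if a >= threshold and b == 1)
        if true_pos == 0:
            # Precision or recall is zero (or undefined); F-measure is 0.
            continue
        precision = true_pos / predicted
        recall = true_pos / positives
        f = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
        best = max(best, f)
    return best


# Usage on a tiny 2x2 example.
prediction = [[0.9, 0.1], [0.8, 0.2]]
ground_truth = [[1, 0], [1, 0]]
error = mae(prediction, ground_truth)
score = max_f_measure(prediction, ground_truth)
```

Lower MAE is better, while higher F-measure is better. The table reports MAE on a 0 to 1 scale and the other metrics as percentages.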
