Understanding Dark Scenes by Contrasting Multi-Modal Observations

Xiaoyu Dong, Naoto Yokoya

Abstract

Understanding dark scenes based on multi-modal image data is challenging, as both the visible and auxiliary modalities provide limited semantic information for the task. Previous methods focus on fusing the two modalities but neglect the correlations among semantic classes when minimizing losses to align pixels with labels, resulting in inaccurate class predictions. To address these issues, we introduce a supervised multi-modal contrastive learning approach to increase the semantic discriminability of the learned multi-modal feature spaces by jointly performing cross-modal and intra-modal contrast under the supervision of the class correlations. The cross-modal contrast pulls same-class embeddings from the two modalities closer together and pushes different-class ones apart. The intra-modal contrast does the same within each modality, pulling same-class embeddings together and pushing different-class embeddings apart. We validate our approach on a variety of tasks that cover diverse light conditions and image modalities. Experiments show that our approach can effectively enhance dark scene understanding based on multi-modal images with limited semantics by shaping semantic-discriminative feature spaces. Comparisons with previous methods demonstrate our state-of-the-art performance. Code and pretrained models are available at https://github.com/palmdong/SMMCL.
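
The cross-modal and intra-modal contrast described in the abstract can be illustrated with a generic supervised contrastive (InfoNCE-style) loss. The following is only a minimal, dependency-free sketch and not the authors' formulation: the function names (`l2_normalize`, `supervised_contrast`, `multimodal_contrast`), the temperature value, and the equal weighting of the four terms are all assumptions made for illustration, and the class-correlation supervision mentioned in the abstract is not modeled here. The official PyTorch implementation is in the repository linked above.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def supervised_contrast(anchors, anchor_labels, candidates, candidate_labels,
                        temperature=0.1, skip_self=False):
    """Generic supervised InfoNCE-style loss (illustrative, hypothetical helper).

    For each anchor, candidates with the same label are positives and all
    other candidates are negatives. Set skip_self=True when anchors and
    candidates are the same set, so an embedding is not contrasted with itself.
    """
    a = [l2_normalize(v) for v in anchors]
    c = [l2_normalize(v) for v in candidates]
    total, counted = 0.0, 0
    for i, (a_i, y_i) in enumerate(zip(a, anchor_labels)):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = {
            j: sum(p * q for p, q in zip(a_i, c_j)) / temperature
            for j, c_j in enumerate(c)
            if not (skip_self and i == j)
        }
        positives = [j for j in logits if candidate_labels[j] == y_i]
        if not positives:
            continue  # no same-class candidate for this anchor
        # Numerically stable log-sum-exp over all remaining candidates.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in logits.values()))
        # Average negative log-probability assigned to the positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


def multimodal_contrast(visible, auxiliary, labels, temperature=0.1):
    """Sum of cross-modal and intra-modal terms (equal weights are an assumption)."""
    cross = (supervised_contrast(visible, labels, auxiliary, labels, temperature)
             + supervised_contrast(auxiliary, labels, visible, labels, temperature))
    intra = (supervised_contrast(visible, labels, visible, labels, temperature, skip_self=True)
             + supervised_contrast(auxiliary, labels, auxiliary, labels, temperature, skip_self=True))
    return cross + intra


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    visible = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    auxiliary = [[1.0, 0.05], [0.95, 0.0], [0.05, 1.0], [0.0, 0.95]]
    print(multimodal_contrast(visible, auxiliary, labels))
```

In this sketch, embeddings whose same-class members cluster together across and within modalities receive a lower loss than embeddings whose classes are mixed, which is the behavior the abstract attributes to the two kinds of contrast.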

Code Repositories

palmdong/smmcl (official, PyTorch; mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| semantic-segmentation-on-llrgbd-synthetic | SMMCL (SegFormer-B2) | mIoU: 67.77 |
| semantic-segmentation-on-llrgbd-synthetic | SMMCL (SegNeXt-B) | mIoU: 68.76 |
| semantic-segmentation-on-llrgbd-synthetic | SMMCL (ResNet-101) | mIoU: 64.40 |
| semantic-segmentation-on-nyu-depth-v2 | SMMCL (ResNet-101) | Mean IoU: 52.5% |
| semantic-segmentation-on-nyu-depth-v2 | SMMCL (SegNeXt-B) | Mean IoU: 55.8% |
| semantic-segmentation-on-nyu-depth-v2 | SMMCL (SegFormer-B2) | Mean IoU: 53.7% |
