HyperAIHyperAI

Command Palette

Search for a command to run...

3 months ago

Depth Matters: Exploring Deep Interactions of RGB-D for Semantic Segmentation in Traffic Scenes

Siyu Chen Ting Han Changshe Zhang Weiquan Liu Jinhe Su Zongyue Wang Guorong Cai

Depth Matters: Exploring Deep Interactions of RGB-D for Semantic Segmentation in Traffic Scenes

Abstract

RGB-D has gradually become a crucial data source for understanding complex scenes in assisted driving. However, existing studies have paid insufficient attention to the intrinsic spatial properties of depth maps. This oversight significantly impacts the attention representation, leading to prediction errors caused by attention shift issues. To this end, we propose a novel learnable Depth interaction Pyramid Transformer (DiPFormer) to explore the effectiveness of depth. Firstly, we introduce Depth Spatial-Aware Optimization (Depth SAO) as offset to represent real-world spatial relationships. Secondly, the similarity in the feature space of RGB-D is learned by Depth Linear Cross-Attention (Depth LCA) to clarify spatial differences at the pixel level. Finally, an MLP Decoder is utilized to effectively fuse multi-scale features for meeting real-time requirements. Comprehensive experiments demonstrate that the proposed DiPFormer significantly addresses the issue of attention misalignment in both road detection (+7.5%) and semantic segmentation (+4.9% / +1.5%) tasks. DiPFormer achieves state-of-the-art performance on the KITTI (97.57% F-score on KITTI road and 68.74% mIoU on KITTI-360) and Cityscapes (83.4% mIoU) datasets.

Benchmarks

BenchmarkMethodologyMetrics
semantic-segmentation-on-kitti-360DiPFormer
mIoU: 68.74

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp
Depth Matters: Exploring Deep Interactions of RGB-D for Semantic Segmentation in Traffic Scenes | Papers | HyperAI