HyperAIHyperAI

Command Palette

Search for a command to run...

3 months ago

Joint 2D-3D Multi-Task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation

Hanrong Ye Dan Xu

Joint 2D-3D Multi-Task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation

Abstract

This report serves as a supplementary document for TaskPrompter, detailing its implementation on a new joint 2D-3D multi-task learning benchmark based on Cityscapes-3D. TaskPrompter presents an innovative multi-task prompting framework that unifies the learning of (i) task-generic representations, (ii) task-specific representations, and (iii) cross-task interactions, as opposed to previous approaches that separate these learning objectives into different network modules. This unified approach not only reduces the need for meticulous empirical structure design but also significantly enhances the multi-task network's representation learning capability, as the entire model capacity is devoted to optimizing the three objectives simultaneously. TaskPrompter introduces a new multi-task benchmark based on Cityscapes-3D dataset, which requires the multi-task model to concurrently generate predictions for monocular 3D vehicle detection, semantic segmentation, and monocular depth estimation. These tasks are essential for achieving a joint 2D-3D understanding of visual scenes, particularly in the development of autonomous driving systems. On this challenging benchmark, our multi-task model demonstrates strong performance compared to single-task state-of-the-art methods and establishes new state-of-the-art results on the challenging 3D detection and depth estimation tasks.

Code Repositories

Benchmarks

BenchmarkMethodologyMetrics
3d-object-detection-on-cityscapes-3dTaskPrompter
mDS: 32.94
monocular-depth-estimation-on-cityscapes-3dTaskPrompter
RMSE: 6.78
semantic-segmentation-on-cityscapes-3dTaskPrompter
mIoU: 77.72

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp
Joint 2D-3D Multi-Task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation | Papers | HyperAI