V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer

Runsheng Xu, Hao Xiang, Zhengzhong Tu, Xin Xia, Ming-Hsuan Yang, Jiaqi Ma

Abstract

In this paper, we investigate the application of Vehicle-to-Everything (V2X) communication to improve the perception performance of autonomous vehicles. We present a robust cooperative perception framework with V2X communication using a novel vision Transformer. Specifically, we build a holistic attention model, namely V2X-ViT, to effectively fuse information across on-road agents (i.e., vehicles and infrastructure). V2X-ViT consists of alternating layers of heterogeneous multi-agent self-attention and multi-scale window self-attention, which captures inter-agent interaction and per-agent spatial relationships. These key modules are designed in a unified Transformer architecture to handle common V2X challenges, including asynchronous information sharing, pose errors, and heterogeneity of V2X components. To validate our approach, we create a large-scale V2X perception dataset using CARLA and OpenCDA. Extensive experimental results demonstrate that V2X-ViT sets new state-of-the-art performance for 3D object detection and achieves robust performance even under harsh, noisy environments. The code is available at https://github.com/DerrickXuNu/v2x-vit.
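
The alternating structure described in the abstract can be illustrated with a heavily simplified NumPy sketch. This is not the authors' implementation (see the official repository for that). It assumes each agent contributes a bird's-eye-view feature map of shape (H, W, C) already aligned to the ego frame, uses single-head attention with identity projections, a single window size, and no learned weights, and it omits the agent-type-specific (heterogeneous) projections, the multi-scale windows, and any handling of delay or pose error. The function names `agent_attention`, `window_attention`, and `fuse` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections.

    tokens: (..., N, C) -> (..., N, C)
    """
    c = tokens.shape[-1]
    scores = tokens @ np.swapaxes(tokens, -1, -2) / np.sqrt(c)
    return softmax(scores, axis=-1) @ tokens


def agent_attention(feats):
    """Inter-agent step: at each spatial location, attend across agents.

    feats: (A, H, W, C)
    """
    x = np.transpose(feats, (1, 2, 0, 3))        # (H, W, A, C)
    out = self_attention(x)                       # tokens = agents
    return np.transpose(out, (2, 0, 1, 3))       # (A, H, W, C)


def window_attention(feats, window):
    """Per-agent spatial step: attend within non-overlapping windows."""
    a, h, w, c = feats.shape
    assert h % window == 0 and w % window == 0
    nh, nw = h // window, w // window
    x = feats.reshape(a, nh, window, nw, window, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)             # (A, nh, nw, win, win, C)
    x = x.reshape(a, nh, nw, window * window, c)  # tokens = cells in a window
    out = self_attention(x)
    out = out.reshape(a, nh, nw, window, window, c)
    return out.transpose(0, 1, 3, 2, 4, 5).reshape(a, h, w, c)


def fuse(feats, num_blocks=2, window=4):
    """Alternate the two attention steps with residual connections.

    Returns the fused map of agent 0, assumed here to be the ego vehicle.
    """
    x = feats
    for _ in range(num_blocks):
        x = x + agent_attention(x)
        x = x + window_attention(x, window)
    return x[0]


rng = np.random.default_rng(0)
feats = rng.standard_normal((3, 8, 8, 16))  # 3 agents, 8x8 grid, 16 channels
fused = fuse(feats)
print(fused.shape)
```

The first step mixes information between agents at the same location, while the second mixes information between locations within one agent's map, which mirrors the "inter-agent interaction" and "per-agent spatial relationships" split in the abstract.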

Code Repositories

DerrickXuNu/v2x-vit (official, PyTorch; mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Metrics |
| 3d-object-detection-on-v2x-sim | V2X-ViT | mAOE: 0.383, mAP: 22.4, mASE: 0.250, mATE: 0.848 |
| 3d-object-detection-on-v2xset | V2X-ViT | AP0.5 (Noisy): 0.836, AP0.5 (Perfect): 0.882, AP0.7 (Noisy): 0.614, AP0.7 (Perfect): 0.712 |
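
As a small worked example, the V2XSet numbers above can be used to quantify how much detection accuracy drops between the perfect and noisy settings. The sketch below only does arithmetic on the four listed values. The dictionary layout and the `degradation` helper are illustrative assumptions, not part of the benchmark tooling.

```python
# AP values for V2X-ViT on V2XSet, copied from the listing above.
v2xset = {
    "AP@0.5": {"perfect": 0.882, "noisy": 0.836},
    "AP@0.7": {"perfect": 0.712, "noisy": 0.614},
}


def degradation(perfect, noisy):
    """Return (absolute drop, drop relative to the perfect-setting score)."""
    absolute = perfect - noisy
    return absolute, absolute / perfect


for metric, scores in v2xset.items():
    absolute, relative = degradation(scores["perfect"], scores["noisy"])
    print(f"{metric}: -{absolute:.3f} absolute, -{relative:.1%} relative")
# AP@0.5: -0.046 absolute, -5.2% relative
# AP@0.7: -0.098 absolute, -13.8% relative
```

The drop is larger at the stricter 0.7 IoU threshold, as one would expect, since noise that shifts a predicted box slightly is more likely to push it below a tighter overlap requirement.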
