8 months ago

Object Tracking

Semantic Segmentation

Multi-Task Learning

Method/Architecture

Computer Vision

Lei Ke, Xia Li, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

Abstract

Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes. Most approaches only exploit the temporal dimension to address the association problem, while relying on single frame predictions for the segmentation mask itself. We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation. PCAN first distills a space-time memory into a set of prototypes and then employs cross-attention to retrieve rich information from the past frames. To segment each object, PCAN adopts a prototypical appearance module to learn a set of contrastive foreground and background prototypes, which are then propagated over time. Extensive experiments demonstrate that PCAN outperforms current video instance tracking and segmentation competition winners on both Youtube-VIS and BDD100K datasets, and shows efficacy to both one-stage and two-stage segmentation frameworks. Code and video resources are available at http://vis.xyz/pub/pcan.
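
The two-step idea in the abstract (distill a space-time memory into a few prototypes, then read from them with cross-attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the prototypes are computed, so the soft k-means / EM-style loop below is an assumption, and the names `distill_prototypes` and `prototypical_cross_attention`, the temperature, the iteration count, and the feature sizes are all hypothetical. Learned projections, the foreground/background prototypes, and temporal propagation are left out.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def distill_prototypes(memory, num_prototypes, num_iters=5, temperature=1.0, seed=0):
    """Compress memory features (N, C) into prototypes (K, C) by soft clustering.

    Assumption: a soft k-means / EM-style loop stands in for the
    distillation step, which the abstract does not specify.
    """
    rng = np.random.default_rng(seed)
    # Initialise prototypes from randomly chosen memory features.
    idx = rng.choice(memory.shape[0], size=num_prototypes, replace=False)
    prototypes = memory[idx].copy()
    for _ in range(num_iters):
        # E-step: soft assignment of each memory feature to each prototype.
        resp = softmax(memory @ prototypes.T / temperature, axis=1)  # (N, K)
        # M-step: each prototype becomes a responsibility-weighted mean.
        weights = resp / (resp.sum(axis=0, keepdims=True) + 1e-8)  # (N, K)
        prototypes = weights.T @ memory  # (K, C)
    return prototypes


def prototypical_cross_attention(query, prototypes):
    """Current-frame features (M, C) attend over the prototypes (K, C)."""
    scale = np.sqrt(query.shape[1])
    attn = softmax(query @ prototypes.T / scale, axis=1)  # (M, K)
    return attn @ prototypes, attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    memory = rng.normal(size=(3 * 16 * 16, 32))  # three past frames, 16x16 grid
    query = rng.normal(size=(16 * 16, 32))       # current frame
    protos = distill_prototypes(memory, num_prototypes=8)
    out, attn = prototypical_cross_attention(query, protos)
    print(protos.shape, out.shape, attn.shape)
```

In this sketch each query position attends over K = 8 prototypes rather than all N = 768 memory positions, so the attention map has shape (M, K) instead of (M, N). That reduction is the practical motivation for distilling the memory before attending to it.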

