MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos

Minghan Li, Shuai Li, Wangmeng Xiang, Lei Zhang

Abstract

While impressive progress has been achieved, video instance segmentation (VIS) methods with per-clip input often fail on challenging videos with occluded objects and crowded scenes. This is mainly because the instance queries in these methods fail to encode discriminative embeddings of instances, making it difficult for the query-based segmenter to distinguish those 'hard' instances. To address these issues, we propose to mine discriminative query embeddings (MDQE) to segment occluded instances on challenging videos. First, we initialize the positional embeddings and content features of object queries by considering their spatial contextual information and the inter-frame object motion. Second, we propose an inter-instance mask repulsion loss to distance each instance from its nearby non-target instances. The proposed MDQE is the first VIS method with per-clip input that achieves state-of-the-art results on challenging videos and competitive performance on simple videos. Specifically, MDQE with ResNet50 achieves 33.0% and 44.5% mask AP on OVIS and YouTube-VIS 2021, respectively. Code of MDQE can be found at https://github.com/MinghanLi/MDQE_CVPR2023.
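
To make the idea of an inter-instance mask repulsion term more concrete, here is a minimal illustrative sketch in plain Python. It is not the formulation used in MDQE; the abstract does not give the loss, so everything here is an assumption made for illustration: the names `bounding_box`, `boxes_are_near`, and `repulsion_loss`, the `margin` parameter, the use of ground-truth bounding boxes to decide which instances count as "nearby", and the choice to measure the share of an instance's predicted mask that falls on nearby non-target instances.

```python
from typing import List, Optional, Tuple

Mask = List[List[float]]          # H x W grid with values in [0, 1]
Box = Tuple[int, int, int, int]   # (y0, x0, y1, x1), inclusive


def bounding_box(mask: Mask) -> Optional[Box]:
    """Tight box around the nonzero pixels of a mask, or None if it is empty."""
    coords = [(y, x) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v > 0]
    if not coords:
        return None
    ys, xs = zip(*coords)
    return min(ys), min(xs), max(ys), max(xs)


def boxes_are_near(a: Box, b: Box, margin: int) -> bool:
    """True if the gap between the boxes is at most `margin` on both axes."""
    return (a[0] - margin <= b[2] and b[0] - margin <= a[2]
            and a[1] - margin <= b[3] and b[1] - margin <= a[3])


def repulsion_loss(pred: List[Mask], gt: List[Mask],
                   margin: int = 1, eps: float = 1e-6) -> float:
    """Hypothetical repulsion term (an assumption, not the paper's loss).

    For each instance, compute the fraction of its predicted mask mass that
    lies on ground-truth pixels of nearby non-target instances, then average
    over instances with a non-empty ground-truth mask.
    """
    boxes = [bounding_box(g) for g in gt]
    losses = []
    for i, p in enumerate(pred):
        if boxes[i] is None:
            continue
        # Nearby non-target instances, chosen by box proximity (assumption).
        neighbours = [j for j in range(len(gt))
                      if j != i and boxes[j] is not None
                      and boxes_are_near(boxes[i], boxes[j], margin)]
        leaked, total = 0.0, 0.0
        for y, row in enumerate(p):
            for x, v in enumerate(row):
                total += v
                if any(gt[j][y][x] > 0 for j in neighbours):
                    leaked += v
        losses.append(leaked / (total + eps))
    return sum(losses) / len(losses) if losses else 0.0
```

Under these assumptions, a prediction that matches its own ground truth contributes zero, while a prediction that spills onto an adjacent instance is penalized in proportion to the spilled mass. In a real training setup such a term would be computed on differentiable tensors and combined with the usual mask losses.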

Code Repositories

minghanli/mdqe_cvpr2023 (official, PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
video-instance-segmentation-on-ovis-1 | MDQE (Swin-L) | AP50: 67.8, AP75: 44.3, APho: 21.6, APmo: 49.3, APso: 65.1, AR1: 18.3, AR10: 46.5, mask AP: 42.6
video-instance-segmentation-on-youtube-vis-1 | MDQE (Swin-L) | AP50: 84.9, AP75: 67.3, AR1: 53.5, AR10: 65.0, mask AP: 59.9
video-instance-segmentation-on-youtube-vis-2 | MDQE (Swin-L) | AP50: 80.7, AP75: 61.7, AR1: 45.4, AR10: 60.6, mask AP: 55.5
