Segment Every Reference Object in Spatial and Temporal Spaces

Ping Luo, Zehuan Yuan, Huchuan Lu, Bin Yan, Yi Jiang, Jiannan Wu

Abstract

Reference-based object segmentation tasks, namely referring image segmentation (RIS), referring video object segmentation (RVOS), and video object segmentation (VOS), aim to segment a specific object using either language or annotated masks as the reference. Despite significant progress in each field, current methods are designed for a single task and have developed in different directions, which hinders multi-task capability across these tasks. In this work, we address this fragmentation and propose UniRef, which unifies the three reference-based object segmentation tasks in a single architecture. At the heart of our approach is multiway-fusion, which handles the different tasks according to their specified references. A unified Transformer architecture is then adopted to perform instance-level segmentation. With these unified designs, UniRef can be jointly trained on a broad range of benchmarks and can flexibly perform multiple tasks at runtime by specifying the corresponding references. We evaluate the jointly trained network on various benchmarks. Extensive experimental results indicate that UniRef achieves state-of-the-art performance on RIS and RVOS, and performs competitively on VOS, all with a single network.
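The abstract does not spell out how the fusion step works, so the following is only a minimal sketch of one plausible reading: visual tokens attend to reference tokens through single-head cross-attention with a residual connection, and the same function is reused whether the reference comes from language or from an annotated mask. The function names, the toy feature vectors, and the choice of cross-attention are all assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(visual: List[Vector], reference: List[Vector]) -> List[Vector]:
    """Hypothetical fusion step: each visual token queries the reference
    tokens (scaled dot-product attention) and adds the attended reference
    feature back onto itself (residual connection)."""
    dim = len(visual[0])
    fused = []
    for q in visual:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in reference
        ]
        weights = softmax(scores)
        attended = [
            sum(w * k[i] for w, k in zip(weights, reference))
            for i in range(dim)
        ]
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused


# Toy features (assumed values, two dimensions per token).
visual_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
language_tokens = [[0.5, 0.5], [0.5, 0.5]]  # stands in for encoded text (RIS, RVOS)
mask_tokens = [[1.0, 0.0]]                  # stands in for an encoded annotated mask (VOS)

# The same fusion function serves both reference types.
fused_lang = fuse(visual_tokens, language_tokens)
fused_mask = fuse(visual_tokens, mask_tokens)
```

The point of the sketch is the shared interface: the task is selected only by which reference tokens are passed in, which mirrors the abstract's claim that tasks are chosen at runtime by specifying the corresponding reference.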

Benchmarks

Benchmark: referring-expression-segmentation-on-refer-1
Methodology: UniRef-L (Swin-L)
Metrics: F: 69.2, J: 65.5, J&F: 67.4
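For readers unfamiliar with these metrics: in video object segmentation benchmarks, J is region similarity (intersection-over-union between predicted and ground-truth masks), F is contour accuracy, and J&F is their mean. The sketch below computes J on toy binary masks and checks that the mean of the reported J and F is consistent with the reported J&F. The helper names and the toy masks are assumptions for illustration, and treating two empty masks as a perfect match is a convention chosen here. Because the listed scores are rounded, the mean of the rounded values only approximates the reported figure.

```python
from typing import List

Mask = List[List[int]]  # binary mask as nested lists of 0/1


def region_similarity(pred: Mask, gt: Mask) -> float:
    """J: intersection-over-union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += int(bool(p) and bool(g))
            union += int(bool(p) or bool(g))
    # Convention assumed here: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def j_and_f(j: float, f: float) -> float:
    """J&F: arithmetic mean of region similarity and contour accuracy."""
    return (j + f) / 2.0


# Toy masks (assumed values): one overlapping pixel, two pixels in the union.
pred_mask = [[1, 1], [0, 0]]
gt_mask = [[1, 0], [0, 0]]
toy_j = region_similarity(pred_mask, gt_mask)

# Mean of the J and F scores listed for UniRef-L (Swin-L).
mean_score = j_and_f(65.5, 69.2)
```

Here `mean_score` is about 67.35, which agrees with the listed J&F of 67.4 to within rounding.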
