5 months ago

Universal Instance Perception as Object Discovery and Retrieval

Yan Bin; Jiang Yi; Wu Jiannan; Wang Dong; Luo Ping; Yuan Zehuan; Lu Huchuan

Abstract

All instance perception tasks aim at finding certain objects specified by some queries such as category names, language expressions, and target annotations, but this complete field has been split into multiple independent subtasks. In this work, we present a universal instance perception model of the next generation, termed UNINEXT. UNINEXT reformulates diverse instance perception tasks into a unified object discovery and retrieval paradigm and can flexibly perceive different types of objects by simply changing the input prompts. This unified formulation brings the following benefits: (1) enormous data from different tasks and label vocabularies can be exploited for jointly training general instance-level representations, which is especially beneficial for tasks lacking in training data; (2) the unified model is parameter-efficient and can save redundant computation when handling multiple tasks simultaneously. UNINEXT shows superior performance on 20 challenging benchmarks from 10 instance-level tasks, including classical image-level tasks (object detection and instance segmentation), vision-and-language tasks (referring expression comprehension and segmentation), and six video-level object tracking tasks. Code is available at https://github.com/MasterBin-IIAU/UNINEXT.
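
The "discovery and retrieval" idea in the abstract can be illustrated with a toy sketch: a discovery stage proposes candidate instances, and a retrieval stage keeps the ones that match the prompt. The code below is not the UNINEXT implementation. The `Instance` class, the `retrieve` function, the cosine-similarity scoring, the `top_k` and `threshold` parameters, and the hand-written two-dimensional embeddings are all hypothetical choices made for illustration; the abstract only states that changing the prompt changes which objects are perceived.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional, Sequence, Tuple


@dataclass
class Instance:
    """A candidate object from a hypothetical discovery stage."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2), illustrative only
    embedding: Sequence[float]              # instance-level feature vector


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    instances: List[Instance],
    prompt_embedding: Sequence[float],
    top_k: Optional[int] = None,
    threshold: float = 0.5,
) -> List[Instance]:
    """Rank discovered instances by similarity to the prompt.

    Assumption: with top_k set, keep the k best matches (e.g. one target
    for a referring expression); otherwise keep every instance whose
    score reaches the threshold (e.g. all objects of a category).
    """
    scored = [(cosine(inst.embedding, prompt_embedding), inst) for inst in instances]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    if top_k is not None:
        return [inst for _, inst in scored[:top_k]]
    return [inst for score, inst in scored if score >= threshold]


# Toy candidates with made-up embeddings.
candidates = [
    Instance((0, 0, 10, 10), [1.0, 0.0]),
    Instance((5, 5, 20, 20), [0.0, 1.0]),
    Instance((8, 8, 30, 30), [0.9, 0.1]),
]

# Same candidates, different prompts: only the prompt embedding changes.
category_hits = retrieve(candidates, [1.0, 0.0], threshold=0.8)
referred = retrieve(candidates, [0.0, 1.0], top_k=1)
print([inst.box for inst in category_hits])
print([inst.box for inst in referred])
```

In this sketch the candidate set is shared across both calls, which mirrors the abstract's claim that one model can serve several tasks by swapping the prompt rather than the network.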

Code Repositories

MasterBin-IIAU/UNINEXT (Official, PyTorch, mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| described-object-detection-on-description | UNINEXT-large | Intra-scenario ABS mAP: 15.9; Intra-scenario FULL mAP: 17.9; Intra-scenario PRES mAP: 18.6 |
| generalized-referring-expression | UNINEXT | N-acc.: 50.6; Precision@(F1=1, IoU≥0.5): 58.2 |
| instance-segmentation-on-coco | UNINEXT-H | AP50: 76.2; AP75: 56.7; APL: 67.5; APM: 55.9; APS: 33.3; mask AP: 51.8 |
| multi-object-tracking-and-segmentation-on-3 | UNINEXT-H | mMOTSA: 35.7 |
| multiple-object-tracking-on-bdd100k-val | UNINEXT-H | AssocA: -; TETA: -; mIDF1: 56.7; mMOTA: 44.2 |
| object-detection-on-coco-minival | UNINEXT-H | AP50: 77.5; AP75: 66.7; APL: 75.3; APM: 64.8; APS: 45.1; box AP: 60.6 |
| referring-expression-segmentation-on-davis | UNINEXT-H | J&F 1st frame: 72.5 |
| referring-expression-segmentation-on-refcoco | UNINEXT-H | Overall IoU: 82.19 |
| referring-expression-segmentation-on-refcoco-3 | UNINEXT-H | Overall IoU: 72.47 |
| referring-expression-segmentation-on-refcoco-4 | UNINEXT-H | Overall IoU: 76.42 |
| referring-expression-segmentation-on-refcoco-5 | UNINEXT-H | Overall IoU: 66.22 |
| referring-expression-segmentation-on-refer-1 | UNINEXT-H | F: 72.7; J: 67.6; J&F: 70.1 |
| video-instance-segmentation-on-ovis-1 | UNINEXT (ViT-H, Online) | AP50: 72.5; AP75: 52.2; mask AP: 49.0 |
| video-instance-segmentation-on-ovis-1 | UNINEXT (ResNet-50, Online) | AP50: 55.5; AP75: 35.6; mask AP: 34.0 |
| visual-object-tracking-on-lasot | UNINEXT-L | AUC: 72.4; Normalized Precision: 80.7; Precision: 78.9 |
| visual-object-tracking-on-lasot | UNINEXT-H | AUC: 72.2; Normalized Precision: 80.8; Precision: 79.4 |
| visual-object-tracking-on-lasot-ext | UNINEXT-H | AUC: 56.2; Normalized Precision: 63.8; Precision: 63.8 |
| visual-object-tracking-on-trackingnet | UNINEXT-H | Accuracy: 85.4; Normalized Precision: 89.0; Precision: 86.4 |
| visual-tracking-on-tnl2k | UNINEXT-H | AUC: 59.3; precision: 62.8 |
| zero-shot-segmentation-on-segmentation-in-the | UNINEXT | Mean AP: 42.1 |
