See More, Know More: Unsupervised Video Object Segmentation with Co-Attention Siamese Networks

Xiankai Lu; Wenguan Wang; Chao Ma; Jianbing Shen; Ling Shao; Fatih Porikli

Abstract

We introduce a novel network, called CO-attention Siamese Network (COSNet), to address the unsupervised video object segmentation task from a holistic view. We emphasize the importance of the inherent correlation among video frames and incorporate a global co-attention mechanism to further improve state-of-the-art deep-learning-based solutions, which primarily focus on learning discriminative foreground representations over appearance and motion in short-term temporal segments. The co-attention layers in our network provide efficient and competent stages for capturing global correlations and scene context by jointly computing co-attention responses and appending them to a joint feature space. We train COSNet with pairs of video frames, which naturally augments the training data and allows increased learning capacity. During the segmentation stage, the co-attention model encodes useful information by processing multiple reference frames together, which is leveraged to better infer the frequently reappearing and salient foreground objects. We propose a unified and end-to-end trainable framework in which different co-attention variants can be derived for mining the rich context within videos. Our extensive experiments over three large benchmarks demonstrate that COSNet outperforms the current alternatives by a large margin.
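The abstract describes co-attention only at a high level, so the following is a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes a bilinear affinity between the feature maps of two frames, softmax normalization of that affinity, and concatenation of each frame's features with the features attended from the other frame. The function names, the `weight` matrix, and the tensor shapes are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def co_attention(feat_a, feat_b, weight):
    """Illustrative cross-frame co-attention (an assumption, not the paper's code).

    feat_a, feat_b: arrays of shape (C, H, W), features of two video frames.
    weight: array of shape (C, C), a hypothetical learned bilinear weight.
    Returns two arrays of shape (2C, H, W): each frame's own features
    concatenated with a summary of the other frame's features.
    """
    c, h, w = feat_a.shape
    va = feat_a.reshape(c, h * w)  # (C, N)
    vb = feat_b.reshape(c, h * w)  # (C, N)

    # Affinity between every location of frame b (rows) and frame a (columns).
    affinity = vb.T @ weight @ va  # (N, N)

    # For each location of a: a distribution over the locations of b.
    attn_over_b = softmax(affinity, axis=0)  # columns sum to 1
    # For each location of b: a distribution over the locations of a.
    attn_over_a = softmax(affinity, axis=1)  # rows sum to 1

    # Attention-weighted summaries of the other frame's features.
    za = vb @ attn_over_b    # (C, N), one summary of b per location of a
    zb = va @ attn_over_a.T  # (C, N), one summary of a per location of b

    # Append the co-attention responses to the original features.
    out_a = np.concatenate([va, za], axis=0).reshape(2 * c, h, w)
    out_b = np.concatenate([vb, zb], axis=0).reshape(2 * c, h, w)
    return out_a, out_b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame_a = rng.normal(size=(4, 3, 5))
    frame_b = rng.normal(size=(4, 3, 5))
    w_bilinear = rng.normal(size=(4, 4))
    out_a, out_b = co_attention(frame_a, frame_b, w_bilinear)
    print(out_a.shape, out_b.shape)
```

Because the two frames pass through the same function symmetrically, any pair of frames from a video can serve as a training sample, which is in line with the abstract's remark that training on frame pairs augments the data.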

Code Repositories

carrierlxk/COSNet (official, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
unsupervised-video-object-segmentation-on-10 | COSNet | F: 79.4; G: 80.0; J: 80.5
unsupervised-video-object-segmentation-on-11 | COSNet | J: 75.6
unsupervised-video-object-segmentation-on-12 | COSNet | J: 70.5
video-polyp-segmentation-on-sun-seg-easy | COSNet | Dice: 0.596; S-measure: 0.654; Sensitivity: 0.359; mean E-measure: 0.600; mean F-measure: 0.496; weighted F-measure: 0.431
video-polyp-segmentation-on-sun-seg-hard | COSNet | Dice: 0.606; S-measure: 0.670; Sensitivity: 0.380; mean E-measure: 0.627; mean F-measure: 0.506; weighted F-measure: 0.443
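
The page does not define the reported metrics. As a rough guide, the sketch below computes two of them for binary masks, assuming that J denotes the Jaccard index (intersection over union) and that Dice is the standard Dice coefficient. The function names and the toy masks are illustrative only and are not taken from any benchmark toolkit.

```python
import numpy as np


def jaccard(pred, gt):
    """Jaccard index (IoU) of two binary masks; 1.0 if both are empty."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


def dice(pred, gt):
    """Dice coefficient of two binary masks; 1.0 if both are empty."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    total = pred.sum() + gt.sum()
    if total == 0:
        return 1.0
    return float(2 * np.logical_and(pred, gt).sum() / total)


if __name__ == "__main__":
    # Toy masks: 2 overlapping pixels, 3 foreground pixels each, union of 4.
    predicted = [[1, 1, 0], [0, 1, 0]]
    ground_truth = [[1, 0, 0], [0, 1, 1]]
    print(jaccard(predicted, ground_truth))  # 0.5
    print(dice(predicted, ground_truth))
```

In the first benchmark row, the G value of 80.0 is consistent with the mean of J (80.5) and F (79.4), which is 79.95. This suggests that G is the average of the two, although the page does not state it.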
