Gowda Shreyank N ; Eustratiadis Panagiotis ; Hospedales Timothy ; Sevilla-Lara Laura

Abstract
We consider the challenging problem of zero-shot video object segmentation (VOS): segmenting and tracking multiple moving objects within a video fully automatically, without any manual initialization. We treat this as a grouping problem by exploiting object proposals and making a joint inference about grouping over both space and time. We propose a network architecture for tractably performing proposal selection and joint grouping. Crucially, we then show how to train this network with reinforcement learning so that it learns to perform the optimal non-myopic sequence of grouping decisions to segment the whole video. Unlike standard supervised techniques, this also enables us to directly optimize for the non-differentiable overlap-based metrics used to evaluate VOS. We show that the proposed method, which we call ALBA, outperforms the previous state-of-the-art on three benchmarks: DAVIS 2017 [2], FBMS [20] and YouTube-VOS [27].
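To make the phrase "overlap-based metrics" concrete, here is a minimal sketch of the Jaccard index (intersection over union) between a predicted and a ground-truth binary mask. The function name `jaccard`, the list-of-lists mask representation, and the empty-mask convention are assumptions made for illustration; this is not the reward used to train ALBA, only the kind of score that is computed on hard 0/1 masks by counting pixels and so offers no useful gradient to standard supervised training.

```python
def jaccard(pred, truth):
    """Intersection over union of two binary masks given as 2D lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


# Toy 2x3 masks (hypothetical data): 2 pixels overlap, 4 pixels in the union.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = jaccard(pred, truth)
print(score)
```

Because the score only changes when a pixel flips between 0 and 1, it can serve as a scalar reward for a sequence of grouping decisions even though it cannot be backpropagated through directly.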
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| unsupervised-video-object-segmentation-on-4 | ALBA | F-measure (Mean): 60.2; F-measure (Recall): 63.1; J&F: 58.4; Jaccard (Mean): 56.6; Jaccard (Recall): 63.4 |
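
As a quick consistency check on the row above, the combined J and F score (J&F) in the DAVIS evaluation is the average of the Jaccard mean and the F-measure mean. The short sketch below recomputes it from the two reported means; the variable names are illustrative only.

```python
# Values copied from the benchmark row above.
j_mean = 56.6  # Jaccard (Mean)
f_mean = 60.2  # F-measure (Mean)

# J&F is the arithmetic mean of the two, reported to one decimal place.
j_and_f = round((j_mean + f_mean) / 2, 1)
print(j_and_f)
```

The result matches the reported combined score of 58.4.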