Hierarchical Memory Matching Network for Video Object Segmentation
Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

Abstract
We present Hierarchical Memory Matching Network (HMMN) for semi-supervised video object segmentation. Based on a recent memory-based method [33], we propose two advanced memory read modules that enable us to perform memory reading in multiple scales while exploiting temporal smoothness. We first propose a kernel guided memory matching module that replaces the non-local dense memory read, commonly adopted in previous memory-based methods. The module imposes the temporal smoothness constraint in the memory read, leading to accurate memory retrieval. More importantly, we introduce a hierarchical memory matching scheme and propose a top-k guided memory matching module in which memory read on a fine-scale is guided by that on a coarse-scale. With the module, we perform memory read in multiple scales efficiently and leverage both high-level semantic and low-level fine-grained memory features to predict detailed object masks. Our network achieves state-of-the-art performance on the validation sets of DAVIS 2016/2017 (90.8% and 84.7%) and YouTube-VOS 2018/2019 (82.6% and 82.5%), and test-dev set of DAVIS 2017 (78.6%). The source code and model are available online: https://github.com/Hongje/HMMN.
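The top-k idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (that is in the linked repository); the function name `topk_memory_read`, the flattened `(positions, channels)` layout, and the dot-product affinity are all assumptions made for illustration. Each query position keeps only its `k` best-matching memory positions, applies a softmax over those, and reads a weighted sum of the corresponding memory values.

```python
import numpy as np

def topk_memory_read(query_key, memory_key, memory_value, k):
    """Hypothetical top-k memory read (illustrative, not the HMMN code).

    query_key:    (Nq, C) keys for the query frame positions
    memory_key:   (Nm, C) keys for the memory positions
    memory_value: (Nm, D) values stored at the memory positions
    k:            number of memory positions kept per query position (k <= Nm)
    """
    # Dense affinity between every query and memory position (assumed dot product).
    affinity = query_key @ memory_key.T                          # (Nq, Nm)

    # Indices of the k largest affinities in each row (unordered within the k).
    topk_idx = np.argpartition(affinity, -k, axis=1)[:, -k:]     # (Nq, k)
    topk_aff = np.take_along_axis(affinity, topk_idx, axis=1)    # (Nq, k)

    # Numerically stable softmax restricted to the selected positions.
    topk_aff = topk_aff - topk_aff.max(axis=1, keepdims=True)
    weights = np.exp(topk_aff)
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted sum of the selected memory values.
    read = np.einsum("qk,qkd->qd", weights, memory_value[topk_idx])  # (Nq, D)
    return read, topk_idx

rng = np.random.default_rng(0)
q = rng.normal(size=(5, 4))
mk = rng.normal(size=(7, 4))
mv = rng.normal(size=(7, 3))
read, idx = topk_memory_read(q, mk, mv, k=2)
print(read.shape, idx.shape)
```

In a hierarchical scheme such as the one the abstract describes, the returned indices from a coarse scale could plausibly be used to restrict which memory positions are compared on a finer scale, which is what keeps multi-scale reading cheap. How HMMN maps coarse indices to fine positions is not specified here, so that step is left out of the sketch.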
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| semi-supervised-video-object-segmentation-on-1 | HMMN | F-measure (Mean): 82.5; J&F: 78.6; Jaccard (Mean): 74.7 |
| semi-supervised-video-object-segmentation-on-20 | HMMN | D16 val (F): 90.6; D16 val (G): 89.4; D16 val (J): 88.2; D17 val (F): 83.1; D17 val (G): 80.4; D17 val (J): 77.7; FPS: 10.0 |
| video-object-segmentation-on-youtube-vos | HMMN | F-Measure (Seen): 87.0; F-Measure (Unseen): 84.6; Jaccard (Seen): 82.1; Jaccard (Unseen): 76.8; Overall: 82.6 |
| visual-object-tracking-on-davis-2016 | HMMN | F-measure (Mean): 92.0; J&F: 90.8; Jaccard (Mean): 89.6 |
| visual-object-tracking-on-davis-2017 | HMMN | F-measure (Mean): 87.5; J&F: 84.7; Jaccard (Mean): 81.9 |
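The combined scores in the table are consistent with a plain arithmetic mean of the component metrics: the DAVIS J&F score (also reported as G) averages the Jaccard and F-measure means, and the YouTube-VOS overall score appears to average the four seen/unseen values. The short check below reproduces the reported numbers under that assumption; the helper name `mean_score` is hypothetical.

```python
def mean_score(*scores):
    """Arithmetic mean of the given metric values."""
    return sum(scores) / len(scores)

# (label, component scores, reported combined score), all taken from the table.
rows = [
    ("DAVIS 2017 test-dev J&F", (82.5, 74.7), 78.6),
    ("DAVIS 2016 val J&F", (92.0, 89.6), 90.8),
    ("DAVIS 2017 val J&F", (87.5, 81.9), 84.7),
    ("YouTube-VOS overall", (87.0, 84.6, 82.1, 76.8), 82.6),
]

for label, parts, reported in rows:
    computed = mean_score(*parts)
    # Reported values are rounded to one decimal, so allow a small tolerance.
    status = "matches" if abs(computed - reported) < 0.05 else "differs"
    print(f"{label}: computed {computed:.3f}, reported {reported} ({status})")
```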