
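Abstract
This paper addresses semi-supervised video object segmentation: separating an object from the background in a video, given the mask of the first frame. We present One-Shot Video Object Segmentation (OSVOS), based on a fully convolutional neural network architecture that successively transfers generic semantic information learned on ImageNet to the task of foreground segmentation, and finally to learning the appearance of a single annotated object in the test sequence (hence "one-shot"). Although all frames are processed independently, the results are temporally coherent and stable. Experiments on two annotated video segmentation databases show that OSVOS is fast and improves the state of the art by a significant margin (79.8% vs 68.0%).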
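The one-shot idea in the abstract (start from generic parameters, fine-tune on the single annotated first frame, then segment every later frame independently) can be illustrated with a deliberately tiny sketch. This is not the OSVOS network: a per-pixel logistic regression on intensity stands in for the fully convolutional model, and the names `fine_tune`, `segment`, and `pretrained`, as well as the toy frames, are assumptions made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune(params, frame, mask, steps=1000, lr=1.0):
    """Adapt (w, b) to the one annotated frame with plain gradient descent."""
    w, b = params
    pixels = [p for row in frame for p in row]
    labels = [m for row in mask for m in row]
    n = len(pixels)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(pixels, labels):
            err = sigmoid(w * x + b) - y  # d(cross-entropy)/d(logit)
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def segment(params, frame, threshold=0.5):
    """Segment one frame on its own; no information from other frames is used."""
    w, b = params
    return [[1 if sigmoid(w * p + b) > threshold else 0 for p in row]
            for row in frame]


# Hypothetical stand-in for the generic, pretrained parameters.
pretrained = (0.0, 0.0)

# Toy data: bright pixels belong to the object, dark pixels to the background.
first_frame = [[0.9, 0.8, 0.1], [0.85, 0.2, 0.15]]
first_mask = [[1, 1, 0], [1, 0, 0]]
later_frames = [
    [[0.1, 0.9, 0.8], [0.2, 0.85, 0.1]],
    [[0.15, 0.1, 0.9], [0.1, 0.2, 0.8]],
]

tuned = fine_tune(pretrained, first_frame, first_mask)
masks = [segment(tuned, frame) for frame in later_frames]
for mask in masks:
    print(mask)
```

Because `segment` only sees one frame at a time, any temporal consistency in the output comes from the fine-tuned appearance model rather than from explicit frame-to-frame propagation, which mirrors the claim in the abstract.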
Code repositories
Mind23-2/MindCode-5/tree/main/OSVOS (MindSpore)
kmaninis/OSVOS-PyTorch (PyTorch)
MS-Mind/MS-Code-06/tree/main/OSVOS (MindSpore)
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| one-shot-visual-object-segmentation-on | OSVOS | F-Measure (Seen): 60.5 |
| semi-supervised-video-object-segmentation-on-1 | OSVOS | F-measure (Decay): 19.8; F-measure (Recall): 59.7; J&F: 50.9; Jaccard (Decay): 19.2; Jaccard (Mean): 47.0; Jaccard (Recall): 52.1 |
| video-object-segmentation-on-youtube | OSVOS | mIoU: 0.783 |
| video-object-segmentation-on-youtube-vos | OSVOS | F-Measure (Seen): 60.5; F-Measure (Unseen): 60.7; Jaccard (Seen): 59.8; Jaccard (Unseen): 54.2; Overall: 58.8; Speed (FPS): 0.10 |
| visual-object-tracking-on-davis-2016 | OSVOS | F-measure (Decay): 15.0; F-measure (Mean): 80.6; F-measure (Recall): 92.6; J&F: 80.2; Jaccard (Decay): 14.9; Jaccard (Mean): 79.8; Jaccard (Recall): 93.6 |
| visual-object-tracking-on-davis-2017 | OSVOS | F-measure (Decay): 27.0; F-measure (Mean): 63.9; F-measure (Recall): 73.8; J&F: 60.25; Jaccard (Decay): 26.1; Jaccard (Mean): 56.6; Jaccard (Recall): 63.8 |
| visual-object-tracking-on-youtube-vos | OSVOS | F-Measure (Seen): 60.5; F-Measure (Unseen): 60.7; O (Average of Measures): 58.8 |
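
For readers unfamiliar with the metrics in the table, the following minimal sketch shows how a Jaccard score (region intersection over union) is computed for one binary mask, and how the combined J&F value relates to the Jaccard and F-measure means. The function names are hypothetical, and treating two empty masks as a perfect match is an assumed convention rather than something stated on this page. The J&F arithmetic reproduces the DAVIS 2016 and DAVIS 2017 rows above.

```python
def jaccard(pred, gt):
    """Intersection over union of two binary masks given as nested lists of 0/1."""
    inter = sum(p & g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    union = sum(p | g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_and_f(j_mean, f_mean):
    """Combined score: arithmetic mean of the Jaccard mean and the F-measure mean."""
    return (j_mean + f_mean) / 2


pred = [[1, 1, 0], [0, 1, 0]]
gt = [[1, 0, 0], [0, 1, 1]]
print(jaccard(pred, gt))                 # 2 shared pixels out of 4 in the union

print(round(j_and_f(79.8, 80.6), 2))     # DAVIS 2016 row
print(round(j_and_f(56.6, 63.9), 2))     # DAVIS 2017 row
```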