Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection

Kin-Man Lam, Jianbing Shen, Wenguan Wang, Sanyuan Zhao, Hongmei Song

Abstract

This paper proposes a fast video salient object detection model based on a novel recurrent network architecture named Pyramid Dilated Bidirectional ConvLSTM (PDB-ConvLSTM). A Pyramid Dilated Convolution (PDC) module is first designed to extract spatial features at multiple scales simultaneously. These spatial features are then concatenated and fed into an extended Deeper Bidirectional ConvLSTM (DB-ConvLSTM) to learn spatiotemporal information. Forward and backward ConvLSTM units are placed in two layers and connected in a cascaded way, which encourages information flow between the two directional streams and leads to deeper feature extraction. We further augment DB-ConvLSTM with a PDC-like structure by adopting several dilated DB-ConvLSTMs to extract multi-scale spatiotemporal information. Extensive experimental results show that our method outperforms previous video saliency models by a large margin, with a real-time speed of 20 fps on a single GPU. With unsupervised video object segmentation as an example application, the proposed model (with a CRF-based post-processing step) achieves state-of-the-art results on two popular benchmarks, demonstrating its superior performance and high applicability.
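
The two ideas in the abstract, multi-scale dilated convolution and a cascaded forward/backward recurrence, can be illustrated with a toy sketch in plain Python. This is not the authors' implementation: it works on 1D lists rather than image feature maps, the dilation rates are arbitrary, and `recurrent_pass` is a hypothetical stand-in (an exponential moving average) for a ConvLSTM unit. The cascade shown, where the backward pass consumes the forward pass's outputs, is one reading of "connected in a cascaded way".

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel whose taps are `dilation` positions apart."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D cross-correlation with a dilated, odd-length kernel."""
    half = (len(kernel) // 2) * dilation
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(w * padded[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def pyramid_features(signal, kernel, dilations):
    """Run the same kernel at several dilation rates and concatenate per position.

    Assumption: the rates are illustrative, not taken from the paper.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [tuple(branch[i] for branch in branches) for i in range(len(signal))]


def recurrent_pass(inputs, decay=0.5):
    """Hypothetical stand-in for a ConvLSTM: h_t = decay * h_(t-1) + (1 - decay) * x_t."""
    state, outputs = 0.0, []
    for x in inputs:
        state = decay * state + (1 - decay) * x
        outputs.append(state)
    return outputs


def cascaded_bidirectional(frames, decay=0.5):
    """Forward pass over the frames, then a backward pass over the forward outputs."""
    forward = recurrent_pass(frames, decay)
    # The backward stream reads the forward outputs in reverse temporal order,
    # so every output position can depend on both earlier and later frames.
    backward = recurrent_pass(forward[::-1], decay)[::-1]
    return backward


if __name__ == "__main__":
    print([effective_kernel_size(3, d) for d in (1, 2, 4)])
    print(pyramid_features([1, 2, 3, 4, 5], [1, 1, 1], dilations=(1, 2)))
    print(cascaded_bidirectional([0.0, 0.0, 1.0]))
```

In the last call only the final frame is non-zero, yet every output is non-zero: the backward stream carries information from later frames to earlier ones, which a forward-only recurrence cannot do.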

Benchmarks

Benchmark | Methodology | Metrics
unsupervised-video-object-segmentation-on-10 | PDB | F: 74.5; G: 75.9; J: 77.2
unsupervised-video-object-segmentation-on-11 | PDB | J: 74.0
unsupervised-video-object-segmentation-on-12 | PDB | J: 65.5
unsupervised-video-object-segmentation-on-4 | PDB | F-measure (Mean): 57.0; F-measure (Recall): 60.2; J&F: 55.1; Jaccard (Mean): 53.2; Jaccard (Recall): 58.9
unsupervised-video-object-segmentation-on-5 | PDB | F-measure (Decay): 3.7; F-measure (Mean): 43.0; F-measure (Recall): 44.6; J&F: 40.4; Jaccard (Decay): 4.0; Jaccard (Mean): 37.7; Jaccard (Recall): 42.6
video-salient-object-detection-on-davis-2016 | PDB | Average MAE: 0.028; max E-measure: 0.951; S-Measure: 0.882
video-salient-object-detection-on-davsod | PDB | Average MAE: 0.114; S-Measure: 0.706; max E-measure: 0.749; max F-measure: 0.591
video-salient-object-detection-on-davsod-1 | PDB | Average MAE: 0.132; S-Measure: 0.649; max E-measure: 0.698
video-salient-object-detection-on-davsod-2 | PDB | Average MAE: 0.107; S-Measure: 0.608; max E-measure: 0.678
video-salient-object-detection-on-fbms-59 | PDB | Average MAE: 0.064; max F-measure: 0.821; S-Measure: 0.851
video-salient-object-detection-on-mcl | PDB | Average MAE: 0.021; max E-measure: 0.911; S-Measure: 0.856
video-salient-object-detection-on-segtrack-v2 | PDB | Average MAE: 0.024; S-Measure: 0.864; max E-measure: 0.935
video-salient-object-detection-on-uvsd | PDB | Average MAE: 0.018; S-Measure: 0.901; max E-measure: 0.975
video-salient-object-detection-on-visal | PDB | Average MAE: 0.032; S-Measure: 0.907; max E-measure: 0.846
video-salient-object-detection-on-vos-t | PDB | Average MAE: 0.078; S-Measure: 0.818; max E-measure: 0.837
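
For readers unfamiliar with the metrics listed above, the sketch below shows how two of them are commonly computed on a single frame: mean absolute error (MAE) between a predicted saliency map and a binary ground-truth mask, and the Jaccard index J (intersection over union) between two binary masks. It is a minimal illustration, not the benchmarks' evaluation code. The function names, the 0.5 binarization threshold, and the convention that two empty masks score 1.0 are all assumptions. The combined J&F score is taken here as the plain mean of the J mean and the F mean, which agrees with the values listed above.

```python
def mean_absolute_error(saliency, truth):
    """Average per-pixel |prediction - ground truth| over one frame (2D lists)."""
    total = sum(
        abs(p - t)
        for pred_row, truth_row in zip(saliency, truth)
        for p, t in zip(pred_row, truth_row)
    )
    count = sum(len(row) for row in truth)
    return total / count


def binarize(saliency, threshold=0.5):
    """Turn a saliency map into a 0/1 mask (the threshold is an assumed value)."""
    return [[1 if p >= threshold else 0 for p in row] for row in saliency]


def jaccard(pred_mask, truth_mask):
    """Intersection over union of two binary masks."""
    pairs = [
        (p, t)
        for pred_row, truth_row in zip(pred_mask, truth_mask)
        for p, t in zip(pred_row, truth_row)
    ]
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_j_and_f(j_mean, f_mean):
    """Combined J&F score as the average of the two means."""
    return (j_mean + f_mean) / 2


if __name__ == "__main__":
    saliency = [[0.9, 0.6], [0.2, 0.1]]
    truth = [[1, 0], [0, 0]]
    print(mean_absolute_error(saliency, truth))
    print(jaccard(binarize(saliency), truth))
    print(round(mean_j_and_f(53.2, 57.0), 1))
```

Note the differing scales in the listing: MAE, S-Measure, E-measure and max F-measure values lie in [0, 1], while the J and F scores for video object segmentation appear to be reported as percentages.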