
Abstract
Image saliency detection has recently made substantial progress driven by deep convolutional neural networks. However, extending state-of-the-art image saliency detectors to videos remains challenging: the motion of objects or the camera, together with drastic changes in appearance contrast, severely degrades salient object detection performance. This paper proposes an accurate, end-to-end learning framework named the Flow Guided Recurrent Neural Encoder (FGRNE) for video salient object detection. It improves the temporal coherence of per-frame features by fusing two complementary cues: motion information derived from optical flow, and the sequential evolution of features encoded by LSTM networks. FGRNE can be regarded as a general framework that seamlessly extends any fully convolutional network (FCN)-based static image saliency detector to video salient object detection. Extensive experiments verify the effectiveness of each component of FGRNE and confirm that the proposed method significantly outperforms state-of-the-art approaches on public benchmark datasets including DAVIS and FBMS.
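To make the flow-guided part of the idea concrete, the following is a minimal NumPy sketch of warping a neighboring frame's feature map toward the reference frame along optical flow via bilinear sampling. The function name, array shapes, and flow convention are illustrative assumptions for this sketch; the actual FGRNE framework uses learned deep features and a flow network, which are not reproduced here.

```python
import numpy as np

def warp_features(feat, flow):
    """Warp a neighboring frame's feature map toward the reference frame.

    feat: (H, W, C) feature map of the neighboring frame.
    flow: (H, W, 2) flow from the reference frame to that frame, as (dx, dy).
    Returns the (H, W, C) features bilinearly sampled at the flow-displaced
    positions, so they spatially align with the reference frame.
    """
    H, W, _ = feat.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    # Positions to sample in the neighboring frame, clipped to the image.
    sx = np.clip(xs + flow[..., 0], 0, W - 1)
    sy = np.clip(ys + flow[..., 1], 0, H - 1)
    # Integer corners and fractional weights for bilinear interpolation.
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = (sx - x0)[..., None], (sy - y0)[..., None]
    top = feat[y0, x0] * (1 - wx) + feat[y0, x1] * wx
    bot = feat[y1, x0] * (1 - wx) + feat[y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

Warped features from several neighboring frames can then be aggregated with the reference frame's own features (in FGRNE, this temporal aggregation is handled by the LSTM-based encoder).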
Benchmarks
| Benchmark | Method | Average MAE | S-Measure | max E-Measure | max F-Measure |
|---|---|---|---|---|---|
| video-salient-object-detection-on-davis-2016 | FGRN | 0.043 | 0.838 | 0.917 | 0.783 |
| video-salient-object-detection-on-davsod | FGRN | 0.095 | 0.701 | 0.765 | 0.589 |
| video-salient-object-detection-on-davsod-1 | FGRN | 0.126 | 0.638 | 0.700 | — |
| video-salient-object-detection-on-davsod-2 | FGRN | 0.131 | 0.608 | 0.698 | — |
| video-salient-object-detection-on-fbms-59 | FGRN | 0.088 | 0.809 | 0.863 | 0.767 |
| video-salient-object-detection-on-mcl | FGRN | 0.044 | 0.709 | 0.817 | 0.625 |
| video-salient-object-detection-on-uvsd | FGRN | 0.042 | 0.745 | 0.887 | — |
| video-salient-object-detection-on-visal | FGRN | 0.045 | 0.861 | 0.945 | — |
| video-salient-object-detection-on-vos-t | FGRN | 0.097 | 0.715 | 0.797 | — |