
Abstract
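Most existing salient object detection models have made substantial progress by aggregating multi-level features extracted from convolutional neural networks. However, because different convolutional layers have different receptive fields, the features they produce differ considerably. Common feature fusion strategies (addition or concatenation) ignore these differences and may lead to suboptimal solutions. In this paper, we propose F3Net to address these problems. It consists mainly of a Cross Feature Module (CFM) and a Cascaded Feedback Decoder (CFD), and is trained by minimizing a new Pixel Position Aware (PPA) loss. Specifically, CFM aims to aggregate multi-level features selectively. Unlike addition and concatenation, CFM adaptively selects complementary components from the input features before fusion, which effectively avoids introducing excessive redundant information that may corrupt the original features. In addition, CFD adopts a multi-stage feedback mechanism in which features close to the supervision signal are fed into the outputs of earlier layers to supplement them and to eliminate the differences between features. These refined features go through several similar iterations before the final saliency maps are generated. Furthermore, unlike binary cross-entropy, the proposed PPA loss does not treat all pixels equally: it takes the local structure around each pixel into account to guide the network to focus more on local details. Hard pixels from boundaries or error-prone regions receive more attention to emphasize their importance. F3Net is able to segment salient object regions accurately and to provide clear local details. Comprehensive experiments on five benchmark datasets show that F3Net outperforms state-of-the-art approaches on six evaluation metrics.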
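The idea of a position-aware loss can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes one plausible weighting, `1 + gamma * |local mean of the label - label|`, and applies it only to binary cross-entropy on plain Python lists. The function names `local_mean`, `pixel_weights`, and `weighted_bce`, and the parameters `radius` and `gamma`, are hypothetical and chosen for illustration; the official repository (weijun88/F3Net) remains the reference for the actual loss.

```python
import math


def local_mean(mask, radius):
    """Mean label in a (2*radius+1)^2 window, clipped at the image border.

    Assumption: border windows are clipped rather than zero-padded.
    """
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            rows = range(max(0, i - radius), min(h, i + radius + 1))
            cols = range(max(0, j - radius), min(w, j + radius + 1))
            vals = [mask[r][c] for r in rows for c in cols]
            out[i][j] = sum(vals) / len(vals)
    return out


def pixel_weights(mask, radius=1, gamma=5.0):
    """Illustrative weight: 1 in flat regions, larger where the label
    differs from its neighbourhood (i.e. near object boundaries)."""
    avg = local_mean(mask, radius)
    return [
        [1.0 + gamma * abs(a - m) for a, m in zip(avg_row, mask_row)]
        for avg_row, mask_row in zip(avg, mask)
    ]


def weighted_bce(prob, mask, weights, eps=1e-7):
    """Binary cross-entropy in which each pixel contributes in proportion
    to its weight, normalised by the sum of the weights."""
    num, den = 0.0, 0.0
    for prob_row, mask_row, weight_row in zip(prob, mask, weights):
        for p, m, w in zip(prob_row, mask_row, weight_row):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            num += -w * (m * math.log(p) + (1 - m) * math.log(1 - p))
            den += w
    return num / den


if __name__ == "__main__":
    # Toy 4x4 label map: left half background, right half salient object.
    mask = [[0, 0, 1, 1] for _ in range(4)]
    weights = pixel_weights(mask)
    prob = [[0.9 if m else 0.1 for m in row] for row in mask]
    print(weighted_bce(prob, mask, weights))
```

Under this weighting, a mistake on a boundary pixel costs more than the same mistake inside a uniform region, which is the behaviour the abstract attributes to the PPA loss.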
Code repositories
weijun88/F3Net
Official
pytorch
Mentioned in GitHub
StamatisKourkoutas/Brain_Tumor_Segmentation_US_Images
pytorch
Mentioned in GitHub
PanoAsh/SHD360
pytorch
Mentioned in GitHub
PanoAsh/ASOD60K
pytorch
Mentioned in GitHub
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| camouflaged-object-segmentation-on-pcod-1200 | F3Net | S-Measure: 0.885 |
| dichotomous-image-segmentation-on-dis-te1 | F3Net | E-measure: 0.783 HCE: 244 MAE: 0.095 S-Measure: 0.721 max F-Measure: 0.640 weighted F-measure: 0.549 |
| dichotomous-image-segmentation-on-dis-te2 | F3Net | E-measure: 0.820 HCE: 542 MAE: 0.097 S-Measure: 0.755 max F-Measure: 0.712 weighted F-measure: 0.620 |
| dichotomous-image-segmentation-on-dis-te3 | F3Net | E-measure: 0.848 HCE: 1059 MAE: 0.092 S-Measure: 0.773 max F-Measure: 0.743 weighted F-measure: 0.656 |
| dichotomous-image-segmentation-on-dis-te4 | F3Net | E-measure: 0.825 HCE: 3760 MAE: 0.107 S-Measure: 0.752 max F-Measure: 0.721 weighted F-measure: 0.633 |
| dichotomous-image-segmentation-on-dis-vd | F3Net | E-measure: 0.800 HCE: 1567 MAE: 0.107 S-Measure: 0.733 max F-Measure: 0.685 weighted F-measure: 0.595 |
| salient-object-detection-on-dut-omron-2 | F3Net | E-measure: 0.869 MAE: 0.052 S-measure: 0.838 max_F1: 0.813 |
| salient-object-detection-on-duts-te-1 | F3Net | E-measure: 0.901 MAE: 0.035 S-measure: 0.888 max_F1: 0.891 |
| salient-object-detection-on-ecssd-1 | F3Net | E-measure: 0.927 MAE: 0.033 S-measure: 0.924 max_F1: 0.945 |
| salient-object-detection-on-hku-is-1 | F3Net | E-measure: 0.952 MAE: 0.028 S-measure: 0.917 max_F1: 0.936 |
| salient-object-detection-on-pascal-s-1 | F3Net | E-measure: 0.858 MAE: 0.061 S-measure: 0.854 max_F1: 0.871 |