
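Abstract
Recent image manipulation localization and detection techniques typically leverage forensic artifacts and traces produced by a noise-sensitive filter, such as SRM or Bayar convolution. In this paper, we show that the different filters commonly used in such approaches excel at revealing different types of manipulation and provide complementary forensic traces. We therefore explore ways of combining the outputs of these filters, leveraging the complementary nature of the artifacts they produce, for image manipulation localization and detection (IMLD). We assess two distinct combination methods: one that produces independent features from each forensic filter and then fuses them (referred to as late fusion), and one that mixes the outputs of the different modalities early and produces combined features (referred to as early fusion). We use the latter as a feature encoding mechanism and pair it with a new decoding mechanism that includes feature re-weighting, forming the proposed MMFusion architecture. We demonstrate that MMFusion achieves competitive performance in both image manipulation localization and detection, outperforming state-of-the-art models on several image and video datasets. We also further investigate the contribution of each forensic filter within MMFusion to addressing different types of manipulation, building on recent AI explainability measures.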
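The difference between the two combination strategies can be illustrated with a minimal NumPy sketch. It is not the MMFusion implementation: `toy_encoder` is a hypothetical stand-in for a learned encoder (a random 1x1 convolution followed by ReLU), and `noise_a` and `noise_b` are random arrays standing in for the outputs of two forensic filters. Channel counts and image size are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_encoder(x, out_channels, rng):
    """Hypothetical stand-in for a learned encoder: random 1x1 conv + ReLU."""
    in_channels = x.shape[0]
    weights = rng.standard_normal((out_channels, in_channels))
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    features = np.einsum("oc,chw->ohw", weights, x)
    return np.maximum(features, 0.0)

def late_fusion(modalities, feat_channels, rng):
    # Encode each modality independently, then fuse the resulting features
    # (here by channel-wise concatenation, which is an assumption).
    feats = [toy_encoder(m, feat_channels, rng) for m in modalities]
    return np.concatenate(feats, axis=0)

def early_fusion(modalities, feat_channels, rng):
    # Mix the modality outputs first, then encode them jointly.
    mixed = np.concatenate(modalities, axis=0)
    return toy_encoder(mixed, feat_channels, rng)

# Toy inputs: two hypothetical filter outputs with 3 channels each.
h, w = 16, 16
noise_a = rng.standard_normal((3, h, w))
noise_b = rng.standard_normal((3, h, w))
modalities = [noise_a, noise_b]

late = late_fusion(modalities, feat_channels=4, rng=rng)
early = early_fusion(modalities, feat_channels=4, rng=rng)
print(late.shape, early.shape)
```

In this sketch, late fusion yields one feature group per filter (8 channels in total), whereas early fusion yields a single joint feature map (4 channels) whose every channel can depend on both filter outputs.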
Code Repositories
idt-iti/mmfusion-iml
Official
pytorch
Mentioned in GitHub
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| image-manipulation-detection-on-casia-v1 | Early Fusion | AUC: .929 Balanced Accuracy: .845 |
| image-manipulation-detection-on-casia-v1 | Late Fusion | AUC: .930 Balanced Accuracy: .860 |
| image-manipulation-detection-on-cocoglide | Early Fusion | AUC: .755 Balanced Accuracy: .660 |
| image-manipulation-detection-on-cocoglide | Late Fusion | AUC: .760 Balanced Accuracy: .677 |
| image-manipulation-detection-on-columbia | Late Fusion | AUC: .977 Balanced Accuracy: .822 |
| image-manipulation-detection-on-columbia | Early Fusion | AUC: .996 Balanced Accuracy: .962 |
| image-manipulation-detection-on-coverage | Early Fusion | AUC: .839 Balanced Accuracy: .770 |
| image-manipulation-detection-on-coverage | Late Fusion | AUC: .792 Balanced Accuracy: .720 |
| image-manipulation-detection-on-dso-1 | Late Fusion | AUC: .958 Balanced Accuracy: .830 |
| image-manipulation-detection-on-dso-1 | Early Fusion | AUC: .966 Balanced Accuracy: .935 |
| image-manipulation-localization-on-casia-v1 | Early Fusion | Average Pixel F1(Fixed threshold): .784 |
| image-manipulation-localization-on-casia-v1 | Late Fusion | Average Pixel F1(Fixed threshold): .775 |
| image-manipulation-localization-on-cocoglide | Late Fusion | Average Pixel F1(Fixed threshold): .574 |
| image-manipulation-localization-on-cocoglide | Early Fusion | Average Pixel F1(Fixed threshold): .553 |
| image-manipulation-localization-on-columbia | Early Fusion | Average Pixel F1(Fixed threshold): .888 |
| image-manipulation-localization-on-columbia | Late Fusion | Average Pixel F1(Fixed threshold): .864 |
| image-manipulation-localization-on-coverage | Late Fusion | Average Pixel F1(Fixed threshold): .641 |
| image-manipulation-localization-on-coverage | Early Fusion | Average Pixel F1(Fixed threshold): .663 |
| image-manipulation-localization-on-dso-1 | Late Fusion | Average Pixel F1(Fixed threshold): .899 |
| image-manipulation-localization-on-dso-1 | Early Fusion | Average Pixel F1(Fixed threshold): .869 |
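For readers unfamiliar with the metrics in the table, the following sketch shows how balanced accuracy (image-level detection) and pixel F1 at a fixed threshold (localization) are commonly computed. It uses the standard definitions and is not taken from the benchmark code. The threshold of 0.5, the `>=` comparison, and the convention of returning 1.0 when both the mask and the prediction are empty are assumptions, and all function names are illustrative.

```python
def balanced_accuracy(labels, predictions):
    """Mean of sensitivity and specificity for binary labels (1 = manipulated)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 0.5 * (sensitivity + specificity)

def pixel_f1(mask, scores, threshold=0.5):
    """Pixel-level F1 for one image after binarizing scores at a fixed threshold."""
    truth = [v for row in mask for v in row]
    pred = [1 if s >= threshold else 0 for row in scores for s in row]
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Assumed convention: an empty mask with an empty prediction scores 1.0.
    return 1.0 if denom == 0 else 2 * tp / denom

def average_pixel_f1(masks, score_maps, threshold=0.5):
    """Mean of the per-image pixel F1 over a dataset."""
    values = [pixel_f1(m, s, threshold) for m, s in zip(masks, score_maps)]
    return sum(values) / len(values)

# Toy detection example: 4 manipulated and 2 authentic images.
ba = balanced_accuracy([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0])

# Toy localization example: one 2x3 ground-truth mask and score map.
mask = [[0, 1, 1], [0, 1, 0]]
scores = [[0.1, 0.9, 0.4], [0.7, 0.8, 0.2]]
f1 = pixel_f1(mask, scores)
print(ba, round(f1, 3))
```

In the toy detection example, sensitivity is 3/4 and specificity is 1/2, so the balanced accuracy is 0.625. In the localization example there are 2 true positives, 1 false positive, and 1 false negative, so F1 is 4/6, or about 0.667.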