4 months ago

MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining

Abstract

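Foundation models have reshaped the landscape of Remote Sensing (RS) by improving a wide range of image interpretation tasks. Pretraining, which covers both supervised and self-supervised learning methods, is an active research topic aimed at initializing model weights effectively. However, transferring pretrained models to downstream tasks can suffer from task discrepancy, because pretraining is usually formulated as an image classification or object discrimination task. In this study, we explore the Multi-Task Pretraining (MTP) paradigm to address this issue. Using a shared encoder with task-specific decoders, we conduct multi-task supervised pretraining on the SAMRS dataset, covering semantic segmentation, instance segmentation, and rotated object detection. MTP supports both convolutional neural network and vision transformer foundation models with over 300 million parameters. The pretrained models are fine-tuned on a variety of RS downstream tasks, including scene classification, horizontal and rotated object detection, semantic segmentation, and change detection. Extensive experiments on 14 datasets show that our models outperform existing models of similar size and are competitive with larger state-of-the-art models, which validates the effectiveness of MTP.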
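The shared-encoder, task-specific-decoder idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the encoder and heads below are hypothetical scalar stand-ins for a real backbone and real segmentation/detection decoders, and the unweighted sum of task losses is an assumption, since the abstract does not say how the losses are combined.

```python
from typing import Callable, Dict, List

Vector = List[float]


def shared_encoder(image: Vector, weights: Vector) -> Vector:
    """Toy stand-in for a shared backbone: elementwise scaling followed by ReLU."""
    return [max(0.0, w * x) for w, x in zip(weights, image)]


def make_linear_head(head_weights: Vector) -> Callable[[Vector], float]:
    """Build a toy task-specific head: a dot product over the shared features."""
    def head(features: Vector) -> float:
        return sum(w * f for w, f in zip(head_weights, features))
    return head


def squared_error(pred: float, target: float) -> float:
    return (pred - target) ** 2


def multitask_loss(
    image: Vector,
    encoder_weights: Vector,
    heads: Dict[str, Callable[[Vector], float]],
    targets: Dict[str, float],
) -> Dict[str, float]:
    """Encode once, then evaluate every task head on the same features."""
    features = shared_encoder(image, encoder_weights)  # shared across all tasks
    losses = {
        task: squared_error(head(features), targets[task])
        for task, head in heads.items()
    }
    # Assumption: tasks are combined with an unweighted sum.
    losses["total"] = sum(losses.values())
    return losses


# Hypothetical toy inputs, chosen only to make the arithmetic easy to follow.
heads = {
    "semantic_segmentation": make_linear_head([1.0, 1.0, 1.0]),
    "instance_segmentation": make_linear_head([2.0, 0.0, 0.0]),
    "rotated_detection": make_linear_head([0.0, 0.0, 1.0]),
}
targets = {
    "semantic_segmentation": 3.0,
    "instance_segmentation": 1.0,
    "rotated_detection": 2.0,
}
losses = multitask_loss([1.0, -2.0, 3.0], [0.5, 1.0, 1.0], heads, targets)
print(losses["total"])  # 1.25
```

Because every head reads the same features, a gradient step on the total loss would update the encoder with signal from all three tasks at once, which is the intuition behind reducing the gap between pretraining and downstream tasks.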

Code repositories

vitae-transformer/mtp (official; PyTorch; mentioned in GitHub)
cuzyoung/crossearth (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Method | Metrics
building-change-detection-for-remote-sensing | MAE+MTP(ViT-L+RVSA) | F1: 92.67; Params(M): 305
building-change-detection-for-remote-sensing | MAE+MTP(ViT-B+RVSA) | F1: 92.22; Params(M): 86
building-change-detection-for-remote-sensing | IMP+MTP(InternImage-XL) | F1: 92.54; Params(M): 335
change-detection-for-remote-sensing-images-on | MAE+MTP(ViT-L+RVSA) | F1-Score: 0.9798
change-detection-for-remote-sensing-images-on | MAE+MTP(ViT-B+RVSA) | F1-Score: 0.9787
change-detection-for-remote-sensing-images-on | IMP+MTP(InternImage-XL) | F1-Score: 0.9833
change-detection-on-cdd-dataset-season-1 | MAE+MTP(ViT-B+RVSA) | F1-Score: 97.87
change-detection-on-cdd-dataset-season-1 | MAE+MTP(ViT-L+RVSA) | F1-Score: 97.98
change-detection-on-cdd-dataset-season-1 | IMP+MTP(InternImage-XL) | F1-Score: 98.33
change-detection-on-levir-cd | MAE+MTP(ViT-L+RVSA) | F1: 92.67
change-detection-on-levir-cd | IMP+MTP(InternImage-XL) | F1: 92.54
change-detection-on-levir-cd | MAE+MTP(ViT-B+RVSA) | F1: 92.22
change-detection-on-oscd-3ch | MAE+MTP(ViT-B+RVSA) | F1: 53.36
change-detection-on-oscd-3ch | MAE+MTP(ViT-L+RVSA) | F1: 55.92
change-detection-on-oscd-3ch | IMP+MTP(InternImage-XL) | F1: 55.61
change-detection-on-whu-building-dataset | MAE+MTP(ViT-L+RVSA) | F1-score: 0.9475
change-detection-on-whu-building-dataset | MAE+MTP(ViT-B+RVSA) | F1-score: 0.9432
change-detection-on-whu-building-dataset | IMP+MTP(InternImage-XL) | F1-score: 0.9559
image-classification-on-eurosat | IMP+MTP(InternImage-XL) | Accuracy (%): 99.24
image-classification-on-eurosat | MAE+MTP(ViT-L+RVSA) | Accuracy (%): 98.78
image-classification-on-eurosat | MAE+MTP(ViT-B+RVSA) | Accuracy (%): 98.76
object-detection-in-aerial-images-on-dior | MAE+MTP(ViT-L+RVSA) | AP50: 81.1
object-detection-in-aerial-images-on-dior | IMP+MTP(InternImage-XL) | AP50: 78.0
object-detection-in-aerial-images-on-dior | MAE+MTP(ViT-B+RVSA) | AP50: 79.4
object-detection-in-aerial-images-on-dior-r | MAE+MTP(ViT-L+RVSA) | mAP: 74.54
object-detection-in-aerial-images-on-dior-r | MAE+MTP(ViT-B+RVSA) | mAP: 71.29
object-detection-in-aerial-images-on-dior-r | IMP+MTP(InternImage-XL) | mAP: 72.17
object-detection-in-aerial-images-on-dota-1 | IMP+MTP(InternImage-XL) | mAP: 80.77%
object-detection-in-aerial-images-on-dota-1 | MAE+MTP(ViT-B+RVSA) | mAP: 80.67%
object-detection-in-aerial-images-on-dota-1 | MAE+MTP(ViT-L+RVSA) | mAP: 81.66%
object-detection-in-aerial-images-on-fair1m-2 | IMP+MTP(InternImage-XL) | mAP: 50.93
object-detection-in-aerial-images-on-fair1m-2 | MAE+MTP(ViT-B+RVSA) | mAP: 51.92
object-detection-in-aerial-images-on-fair1m-2 | MAE+MTP(ViT-L+RVSA) | mAP: 53.00
object-detection-in-aerial-images-on-xview | IMP+MTP(InternImage-XL) | AP50: 18.2
object-detection-in-aerial-images-on-xview | MAE+MTP(ViT-L+RVSA) | AP50: 19.4
object-detection-in-aerial-images-on-xview | MAE+MTP(ViT-B+RVSA) | AP50: 16.4
semantic-segmentation-on-loveda | MAE+MTP(ViT-L+RVSA) | Category mIoU: 54.17
semantic-segmentation-on-loveda | IMP+MTP(InternImage-XL) | Category mIoU: 54.17
semantic-segmentation-on-loveda | MAE+MTP(ViT-B+RVSA) | Category mIoU: 52.39
semantic-segmentation-on-spacenet-1 | MAE+MTP(ViT-L+RVSA) | Mean IoU: 79.54
semantic-segmentation-on-spacenet-1 | MAE+MTP(ViT-L) | Mean IoU: 79.69
semantic-segmentation-on-spacenet-1 | IMP+MTP(InternImage-XL) | Mean IoU: 79.16
semantic-segmentation-on-spacenet-1 | MAE+MTP(ViT-B+RVSA) | Mean IoU: 79.63
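
The F1 values listed above appear on two scales: some as fractions (for example 0.9798) and some as percentages (for example 97.98). A minimal sketch of how F1 is computed from pixel or object counts, and how the two scales could be put on a common footing, is given below. The helper names `f1_score` and `to_percent` are hypothetical, and the rule "values at or below 1 are fractions" is an assumption that only holds for metrics bounded by 100%.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 as the harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0  # convention chosen here for the empty case
    return 2 * tp / denominator


def to_percent(value: float) -> float:
    """Map a score to the 0-100 scale, assuming values <= 1 are fractions."""
    scaled = value * 100 if value <= 1.0 else value
    return round(scaled, 2)


print(f1_score(tp=90, fp=10, fn=10))  # 0.9
print(to_percent(0.9798))             # 97.98
print(to_percent(97.87))              # 97.87
```

With both entries on the percentage scale, rows such as 0.9798 and 97.98 can be compared directly.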
