Extending global-local view alignment for self-supervised learning with remote sensing imagery
Xinye Wanyan, Sachith Seneviratne, Shuchang Shen, Michael Kirley

Abstract
Since a large number of high-quality remote sensing images are readily accessible, exploiting this corpus of images with less manual annotation is drawing increasing attention. Self-supervised models acquire general feature representations by formulating a pretext task that generates pseudo-labels for massive unlabeled data to provide supervision for training. While prior studies have explored multiple self-supervised learning techniques in the remote sensing domain, pretext tasks based on global-local view alignment remain underexplored, despite achieving state-of-the-art results on natural imagery. Inspired by DINO, which employs an effective representation learning structure with knowledge distillation based on global-local view alignment, we formulate two pretext tasks for self-supervised learning on remote sensing imagery (SSLRS). Using these tasks, we explore the effectiveness of positive temporal contrast as well as multi-sized views on SSLRS. We extend DINO and propose DINO-MC, which uses local views of variously sized crops instead of a single fixed size in order to alleviate the limited variation in object size observed in remote sensing imagery. Our experiments demonstrate that even when pre-trained on only 10% of the dataset, DINO-MC performs on par with or better than existing state-of-the-art SSLRS methods on multiple remote sensing tasks, while using fewer computational resources. All code, models, and results are released at https://github.com/WennyXY/DINO-MC.
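As a rough illustration of the multi-sized local views described in the abstract, the sketch below plans two global crops and several local crops whose output sizes cycle through more than one value, then lists the teacher/student view pairs that a DINO-style global-local alignment would compare. It is a minimal sketch, not the released implementation: the crop sizes, scale ranges, and the helper names `sample_crop`, `make_views`, and `alignment_pairs` are all assumptions chosen for illustration, since the abstract does not state them.

```python
import random

# Illustrative values only: the abstract does not give crop sizes or scale ranges.
GLOBAL_SIZE = 224
LOCAL_SIZES = (96, 112, 128)   # hypothetical multi-sized local views
GLOBAL_SCALE = (0.4, 1.0)      # assumed fraction of image area for global crops
LOCAL_SCALE = (0.05, 0.4)      # assumed fraction of image area for local crops


def sample_crop(img_w, img_h, scale, out_size, rng):
    """Pick a random square region covering roughly `scale` of the image area."""
    area_frac = rng.uniform(*scale)
    side = int(round((area_frac * img_w * img_h) ** 0.5))
    side = max(1, min(side, img_w, img_h))  # keep the crop inside the image
    left = rng.randint(0, img_w - side)
    top = rng.randint(0, img_h - side)
    # `out_size` is the resolution the crop would be resized to before encoding.
    return {"box": (left, top, left + side, top + side), "out_size": out_size}


def make_views(img_w, img_h, n_local=6, seed=None):
    """Plan two global views plus `n_local` local views of varying output size."""
    rng = random.Random(seed)
    global_views = [
        sample_crop(img_w, img_h, GLOBAL_SCALE, GLOBAL_SIZE, rng) for _ in range(2)
    ]
    local_views = [
        sample_crop(img_w, img_h, LOCAL_SCALE, LOCAL_SIZES[i % len(LOCAL_SIZES)], rng)
        for i in range(n_local)
    ]
    return global_views, local_views


def alignment_pairs(n_global, n_local):
    """(teacher_view, student_view) index pairs for global-local alignment.

    In DINO-style training the teacher encodes only global views, the student
    encodes every view, and a view is never compared with itself.
    """
    n_all = n_global + n_local
    return [(t, s) for t in range(n_global) for s in range(n_all) if s != t]
```

For example, `make_views(264, 264, n_local=6, seed=0)` returns two global crop plans and six local crop plans whose `out_size` values cycle through the three assumed sizes, and `alignment_pairs(2, 6)` yields the 14 teacher/student pairs over which a distillation loss would be averaged.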
Code Repositories
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| change-detection-on-oscd-13ch | DINO-MC (WRN-50) | F1: 52.7 Precision: 49.99 |
| image-classification-on-eurosat | DINO-MC (WRN linear eval) | Accuracy (%): 95.7 |
| image-classification-on-eurosat | DINO-MC (Wide ResNet) | Accuracy (%): 98.78 |
| multi-label-image-classification-on | DINO-MC | mAP (micro): 88.75 official split: No |
| multi-label-image-classification-on-1 | DINO-MC | mean average precision: 84.20 |