4 months ago

MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation

Minhyun Lee Seungho Lee Song Park Dongyoon Han Byeongho Heo Hyunjung Shim

Abstract

Referring Image Segmentation (RIS) is an advanced vision-language task that involves identifying and segmenting objects within an image as described by free-form text descriptions. While previous studies focused on aligning visual and language features, training techniques such as data augmentation remain underexplored. In this work, we explore effective data augmentation for RIS and propose a novel training framework called Masked Referring Image Segmentation (MaskRIS). We observe that conventional image augmentations fall short for RIS, leading to performance degradation, while simple random masking significantly enhances the performance of RIS. MaskRIS uses both image and text masking, followed by Distortion-aware Contextual Learning (DCL) to fully exploit the benefits of the masking strategy. This approach can improve the model's robustness to occlusions, incomplete information, and various linguistic complexities, resulting in a significant performance improvement. Experiments demonstrate that MaskRIS can easily be applied to various RIS models, outperforming existing methods in both fully supervised and weakly supervised settings. Finally, MaskRIS achieves new state-of-the-art performance on RefCOCO, RefCOCO+, and RefCOCOg datasets. Code is available at https://github.com/naver-ai/maskris.
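
The abstract describes random masking of both the image and the referring text. The following is a minimal sketch of that general idea only, not the authors' implementation: the function names, the grid-patch scheme, the `patch_size` and `mask_ratio` values, and the `[MASK]` placeholder are all assumptions made for illustration, and the Distortion-aware Contextual Learning (DCL) step is not shown because the abstract does not specify it.

```python
import random


def mask_image_patches(image, patch_size=2, mask_ratio=0.5, fill=0, rng=None):
    """Return a copy of `image` (list of rows of pixel values) with random
    grid patches overwritten by `fill`. All parameters are illustrative."""
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    # Top-left corner of every patch on a regular grid.
    patches = [(r, c)
               for r in range(0, height, patch_size)
               for c in range(0, width, patch_size)]
    num_masked = int(len(patches) * mask_ratio)
    masked = [row[:] for row in image]  # leave the input untouched
    for r, c in rng.sample(patches, num_masked):
        for y in range(r, min(r + patch_size, height)):
            for x in range(c, min(c + patch_size, width)):
                masked[y][x] = fill
    return masked


def mask_text_tokens(tokens, mask_ratio=0.25, mask_token="[MASK]", rng=None):
    """Return a copy of `tokens` with a random subset replaced by `mask_token`
    (a hypothetical placeholder; real tokenizers define their own)."""
    rng = rng or random.Random()
    num_masked = int(len(tokens) * mask_ratio)
    chosen = set(rng.sample(range(len(tokens)), num_masked))
    return [mask_token if i in chosen else tok for i, tok in enumerate(tokens)]


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[1] * 4 for _ in range(4)]
    for row in mask_image_patches(image, rng=rng):
        print(row)
    print(mask_text_tokens("the man in the red shirt on left".split(), rng=rng))
```

In a training loop, the masked image and masked expression would be fed to the RIS model in place of (or alongside) the originals; how the two views are combined is left open here.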

Code Repositories

naver-ai/maskris (official, PyTorch)

Benchmarks

Benchmark                                         Methodology                     Mean IoU  Overall IoU
referring-expression-segmentation-on-refcoco      MaskRIS (Swin-B, combined DB)   -         78.71
referring-expression-segmentation-on-refcoco      MaskRIS (Swin-B)                78.35     76.49
referring-expression-segmentation-on-refcoco-3    MaskRIS (Swin-B, combined DB)   -         70.26
referring-expression-segmentation-on-refcoco-3    MaskRIS (Swin-B)                71.68     67.54
referring-expression-segmentation-on-refcoco-4    MaskRIS (Swin-B)                76.73     74.46
referring-expression-segmentation-on-refcoco-4    MaskRIS (Swin-B, combined DB)   -         75.15
referring-expression-segmentation-on-refcoco-5    MaskRIS (Swin-B)                64.5      59.39
referring-expression-segmentation-on-refcoco-5    MaskRIS (Swin-B, combined DB)   -         62.83
referring-expression-segmentation-on-refcoco-8    MaskRIS (Swin-B)                80.24     78.96
referring-expression-segmentation-on-refcoco-8    MaskRIS (Swin-B, combined DB)   -         80.64
referring-expression-segmentation-on-refcoco-9    MaskRIS (Swin-B)                76.06     73.96
referring-expression-segmentation-on-refcoco-9    MaskRIS (Swin-B, combined DB)   -         75.1
referring-expression-segmentation-on-refcocog     MaskRIS (Swin-B)                69.31     65.55
referring-expression-segmentation-on-refcocog     MaskRIS (Swin-B, combined DB)   -         69.12
referring-expression-segmentation-on-refcocog-1   MaskRIS (Swin-B)                69.42     66.5
referring-expression-segmentation-on-refcocog-1   MaskRIS (Swin-B, combined DB)   -         71.09
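
The benchmarks report two metrics, Mean IoU and Overall IoU. This page does not define them, so the sketch below follows the definitions commonly used in the RIS literature (an assumption here): Mean IoU averages the per-sample IoU, while Overall IoU divides the total intersection by the total union accumulated over all samples. The function names are hypothetical, and results are fractions in [0, 1], whereas the numbers above appear to be percentages.

```python
def iou_stats(pred, target):
    """Return (intersection, union) pixel counts for two binary masks,
    each given as a list of rows of 0/1 values."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter, union


def mean_and_overall_iou(pairs):
    """Compute (mean IoU, overall IoU) over (prediction, target) mask pairs."""
    stats = [iou_stats(pred, target) for pred, target in pairs]
    # Mean IoU: average of per-sample ratios, so every sample counts equally.
    per_sample = [i / u if u else 0.0 for i, u in stats]
    mean_iou = sum(per_sample) / len(per_sample)
    # Overall IoU: pooled counts, so samples with large objects weigh more.
    total_inter = sum(i for i, _ in stats)
    total_union = sum(u for _, u in stats)
    overall_iou = total_inter / total_union if total_union else 0.0
    return mean_iou, overall_iou


if __name__ == "__main__":
    pairs = [
        ([[1, 1], [0, 0]], [[1, 0], [0, 0]]),  # IoU = 1/2
        ([[1, 1], [1, 1]], [[1, 1], [1, 1]]),  # IoU = 4/4
    ]
    print(mean_and_overall_iou(pairs))
```

Because Overall IoU pools pixels before dividing, it favors large objects, which is one reason the two columns can differ by several points for the same model.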
