8 months ago

Image Classification

Semantic Segmentation

Method/Architecture

Computer Vision

Alkin Benedikt ; Miklautz Lukas ; Hochreiter Sepp ; Brandstetter Johannes

Abstract

We introduce MIM (Masked Image Modeling)-Refiner, a contrastive learning boost for pre-trained MIM models. MIM-Refiner is motivated by the insight that strong representations within MIM models generally reside in intermediate layers. Accordingly, MIM-Refiner leverages multiple contrastive heads that are connected to different intermediate layers. In each head, a modified nearest neighbor objective constructs semantic clusters that capture semantic information which improves performance on downstream tasks, including off-the-shelf and fine-tuning settings. The refinement process is short and simple - yet highly effective. Within a few epochs, we refine the features of MIM models from subpar to state-of-the-art, off-the-shelf features. Refining a ViT-H, pre-trained with data2vec 2.0 on ImageNet-1K, sets a new state-of-the-art in linear probing (84.7%) and low-shot classification among models that are pre-trained on ImageNet-1K. MIM-Refiner efficiently combines the advantages of MIM and ID objectives and compares favorably against previous state-of-the-art SSL models on a variety of benchmarks such as low-shot classification, long-tailed classification, clustering and semantic segmentation.
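
The idea of attaching several contrastive heads to intermediate layers can be illustrated with a small NumPy sketch. This is not the authors' implementation: it uses a plain nearest-neighbor contrastive loss (in the style of NNCLR) rather than the modified objective mentioned in the abstract, and the linear heads, the per-head queues, the temperature of 0.2, and the averaging over heads are all assumptions made for illustration. All function and parameter names are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (eps guards against zero vectors)."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def nn_contrastive_loss(view_a, view_b, queue, temperature=0.2):
    """Plain nearest-neighbor InfoNCE loss (an assumption, not the paper's
    modified objective): each embedding of view A is swapped for its nearest
    neighbor in `queue`, and that neighbor must match the same sample's
    view-B embedding against the other samples in the batch."""
    a = l2_normalize(view_a)
    b = l2_normalize(view_b)
    q = l2_normalize(queue)
    nn_idx = np.argmax(a @ q.T, axis=1)            # nearest neighbor by cosine similarity
    anchors = q[nn_idx]                            # (batch, dim)
    logits = anchors @ b.T / temperature           # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))      # positives sit on the diagonal


def multi_head_loss(feats_a, feats_b, head_weights, queues, temperature=0.2):
    """Apply one (hypothetical, linear) contrastive head per intermediate
    layer and average the per-head losses (averaging is an assumption)."""
    losses = [
        nn_contrastive_loss(fa @ w, fb @ w, q, temperature)
        for fa, fb, w, q in zip(feats_a, feats_b, head_weights, queues)
    ]
    return sum(losses) / len(losses), losses


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch, feat_dim, head_dim, queue_size, num_heads = 8, 32, 16, 64, 3
    # Stand-ins for features of two augmented views taken at three intermediate layers.
    feats_a = [rng.normal(size=(batch, feat_dim)) for _ in range(num_heads)]
    feats_b = [f + 0.1 * rng.normal(size=f.shape) for f in feats_a]
    heads = [rng.normal(size=(feat_dim, head_dim)) for _ in range(num_heads)]
    queues = [rng.normal(size=(queue_size, head_dim)) for _ in range(num_heads)]
    total, per_head = multi_head_loss(feats_a, feats_b, heads, queues)
    print("total loss:", total)
    print("per-head losses:", per_head)
```

In a real setup the features would come from different transformer blocks of the pre-trained encoder, and each queue would hold embeddings of previously seen samples produced by the corresponding head.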

MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations