5 months ago

Improving Image Clustering with Artifacts Attenuation via Inference-Time Attention Engineering

Nakamura Kazumoto ; Nozawa Yuji ; Lin Yu-Chieh ; Nakata Kengo ; Ng Youyang

Abstract

The goal of this paper is to improve the performance of pretrained Vision Transformer (ViT) models, particularly DINOv2, on the image clustering task without requiring re-training or fine-tuning. As model size increases, a high-norm artifact anomaly appears in the patches of multi-head attention. We observe that this anomaly leads to reduced accuracy in zero-shot image clustering. These artifacts are characterized by disproportionately large values in the attention map compared to other patch tokens. To address these artifacts, we propose an approach called Inference-Time Attention Engineering (ITAE), which manipulates the attention function during inference. Specifically, we identify the artifacts by investigating one of the Query-Key-Value (QKV) patches in the multi-head attention and attenuate their corresponding attention values inside the pretrained models. ITAE shows improved clustering accuracy on multiple datasets by exhibiting more expressive features in latent space. Our findings highlight the potential of ITAE as a practical solution for reducing artifacts in pretrained ViT models and improving model performance in clustering tasks without the need for re-training or fine-tuning.
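The idea described in the abstract (find artifact tokens from one of the Q/K/V projections, then attenuate the attention paid to them at inference time) can be illustrated with a minimal single-head sketch. This is not the authors' implementation: the abstract does not say which projection is inspected, how outliers are detected, or how the attenuation is applied, so the norm-based z-score rule, the `z_thresh` and `penalty` parameters, and the function names below are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def find_artifact_tokens(proj, z_thresh=3.0):
    # ASSUMPTION: a token is an "artifact" if the L2 norm of its projection
    # (one of Q, K or V) lies more than z_thresh standard deviations above
    # the mean norm. The paper's actual criterion may differ.
    norms = np.linalg.norm(proj, axis=-1)
    return norms > norms.mean() + z_thresh * norms.std()

def attenuated_attention(q, k, v, artifact_mask, penalty=1e4):
    # Scaled dot-product attention for a single head.
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)
    # ASSUMPTION: attenuate by subtracting a large constant from the logits
    # of every column that corresponds to a flagged token, before softmax.
    logits = logits - penalty * artifact_mask[None, :]
    weights = softmax(logits, axis=-1)
    return weights @ v, weights

# Toy data: 64 tokens, 8 dimensions, with one synthetic high-norm token.
rng = np.random.default_rng(0)
q = rng.normal(size=(64, 8))
k = rng.normal(size=(64, 8))
v = rng.normal(size=(64, 8))
k[5] *= 50.0  # inject an artificial high-norm "artifact" at token 5

mask = find_artifact_tokens(k)  # inspecting K here is an arbitrary choice
out, weights = attenuated_attention(q, k, v, mask)
```

Because only the attention logits are changed, a modification of this kind leaves the pretrained weights untouched, which is consistent with the abstract's claim that no re-training or fine-tuning is needed.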

Benchmarks

| Benchmark | Methodology | Backbone | Accuracy | NMI | ARI | Train set |
|---|---|---|---|---|---|---|
| image-clustering-on-cifar-10 | ITAE | ViT-B/14 | 0.8449 | 0.8682 | 0.7946 | Test |
| image-clustering-on-cifar-100 | ITAE | ViT-B/14 | 0.6502 | 0.771 | 0.5053 | Test |
| image-clustering-on-stl-10 | ITAE | ViT-B/14 | 0.8276 | 0.8818 | 0.7594 | Test |
| image-clustering-on-tiny-imagenet | ITAE | - | 0.6823 | 0.8178 | 0.5227 | - |
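
For readers unfamiliar with the metrics in the benchmark entries above, the following is a small standard-library sketch of two of them: the Adjusted Rand Index (ARI) and clustering accuracy under the best one-to-one mapping between clusters and classes. The function names and toy labels are illustrative, not taken from the paper. The accuracy function uses a brute-force search over mappings, which is only practical for a handful of clusters and assumes there are no more clusters than classes; for realistic class counts the same assignment problem is normally solved with the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment`). NMI is not covered here.

```python
from collections import Counter
from itertools import permutations
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    # Pair-counting form of ARI, computed from the contingency table.
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

def clustering_accuracy(labels_true, labels_pred):
    # Brute-force search for the best one-to-one cluster-to-class mapping.
    # Assumes len(clusters) <= len(classes); illustration only.
    classes = sorted(set(labels_true))
    clusters = sorted(set(labels_pred))
    cells = Counter(zip(labels_true, labels_pred))
    best = 0
    for perm in permutations(classes, len(clusters)):
        hits = sum(cells[(cls, clu)] for clu, cls in zip(clusters, perm))
        best = max(best, hits)
    return best / len(labels_true)

# Toy example: cluster ids are arbitrary, and one sample is misplaced.
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [1, 1, 1, 0, 0, 2, 2, 2, 2]
ari = adjusted_rand_index(y_true, y_pred)
acc = clustering_accuracy(y_true, y_pred)
```

Both metrics are invariant to how cluster ids are numbered, which is why they suit zero-shot clustering where predicted ids carry no class meaning.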
