Foivos Ntelemis, Yaochu Jin, Spencer A. Thomas

Abstract
Image clustering is a particularly challenging computer vision task that aims to generate annotations without human supervision. Recent advances apply self-supervised learning strategies to image clustering by first learning valuable semantics and then clustering the image representations. These multiple-phase algorithms, however, increase the computational time, and their final performance depends on the first stage. By extending the self-supervised approach, we propose a novel single-phase clustering method that simultaneously learns meaningful representations and assigns the corresponding annotations. This is achieved by integrating a discrete representation into the self-supervised paradigm through a classifier net. Specifically, the proposed clustering objective employs mutual information and maximizes the dependency between the integrated discrete representation and a discrete probability distribution. The discrete probability distribution is derived through the self-supervised process by comparing the learnt latent representation with a set of trainable prototypes. To enhance the learning performance of the classifier, we jointly apply the mutual information across multi-crop views. Our empirical results show that the proposed framework outperforms state-of-the-art techniques, with average accuracies of 89.1% and 49.0% on the CIFAR-10 and CIFAR-100/20 datasets, respectively. Finally, the proposed method is also robust to parameter settings, which makes it readily applicable to other datasets.
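The mutual-information objective described in the abstract can be illustrated with a small sketch. It estimates the mutual information between two batches of discrete probability vectors, for example the classifier output and a prototype-based assignment for the same images. This is a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the joint distribution is estimated by averaging per-sample outer products over the batch, and the function names `joint_distribution` and `mutual_information` are hypothetical.

```python
import math


def joint_distribution(p_batch, q_batch):
    """Estimate the joint P(a, b) by averaging per-sample outer products.

    p_batch and q_batch are lists of probability vectors, one pair per sample.
    (Assumed estimator; the source does not specify how the joint is formed.)
    """
    n = len(p_batch)
    k_a, k_b = len(p_batch[0]), len(q_batch[0])
    joint = [[0.0] * k_b for _ in range(k_a)]
    for p, q in zip(p_batch, q_batch):
        for a in range(k_a):
            for b in range(k_b):
                joint[a][b] += p[a] * q[b] / n
    return joint


def mutual_information(p_batch, q_batch, eps=1e-12):
    """Return I(A; B) in nats for the batch-estimated joint distribution."""
    joint = joint_distribution(p_batch, q_batch)
    marg_a = [sum(row) for row in joint]        # marginal over A
    marg_b = [sum(col) for col in zip(*joint)]  # marginal over B
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p_ab in enumerate(row):
            if p_ab > eps:  # skip empty cells, where the term vanishes
                mi += p_ab * math.log(p_ab / (marg_a[a] * marg_b[b]))
    return mi


# Two balanced clusters with perfectly agreeing one-hot assignments.
agree = [[1.0, 0.0], [0.0, 1.0]]
print(mutual_information(agree, agree))
```

In a training setting, a quantity of this kind would be maximized (its negative used as a loss) so that the two assignments agree while all clusters stay in use; when both inputs are uniform and uninformative, the estimate is zero.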
Benchmarks
| Benchmark | Methodology | Accuracy | NMI | ARI | Backbone | Train split |
|---|---|---|---|---|---|---|
| image-clustering-on-cifar-10 | IMC-SwAV (Best) | 0.897 | 0.818 | 0.8 | ResNet-18 | Train |
| image-clustering-on-cifar-10 | IMC-SwAV (Avg) | 0.891 | 0.811 | 0.79 | ResNet-18 | Train |
| image-clustering-on-cifar-100 | IMC-SwAV (Avg) | 0.49 | 0.503 | 0.337 | - | - |
| image-clustering-on-cifar-100 | IMC-SwAV (Best) | 0.519 | 0.527 | 0.361 | - | Train |
| image-clustering-on-stl-10 | IMC-SwAV (Best) | 0.853 | 0.747 | 0.716 | ResNet-18 | Train |
| image-clustering-on-stl-10 | IMC-SwAV (Avg) | 0.831 | 0.729 | 0.685 | ResNet-18 | Train |
| image-clustering-on-tiny-imagenet | IMC-SwAV (Best) | 0.282 | 0.526 | 0.146 | - | - |
| image-clustering-on-tiny-imagenet | IMC-SwAV (Avg) | 0.279 | 0.485 | 0.143 | - | - |
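
The page does not state how the Accuracy values above were computed. A common convention for clustering benchmarks is to score predictions under the best one-to-one mapping between cluster indices and ground-truth labels, and the sketch below assumes that convention. The mapping is usually found with the Hungarian algorithm; this example swaps in a brute-force search over permutations so that it stays dependency-free, which is only practical for a handful of classes. The function name `clustering_accuracy` is hypothetical.

```python
from itertools import permutations


def clustering_accuracy(y_true, y_pred, num_classes):
    """Accuracy under the best one-to-one cluster-to-label mapping.

    Brute force over all permutations: fine for small num_classes,
    but a Hungarian solver should be used for larger problems.
    """
    # counts[c][t] = number of samples in predicted cluster c with true label t
    counts = [[0] * num_classes for _ in range(num_classes)]
    for t, c in zip(y_true, y_pred):
        counts[c][t] += 1

    best = 0
    for perm in permutations(range(num_classes)):
        # perm[c] is the label assigned to cluster c under this mapping
        matched = sum(counts[c][perm[c]] for c in range(num_classes))
        best = max(best, matched)
    return best / len(y_true)


# Cluster indices are a relabelling of the true classes, so the score is perfect.
print(clustering_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 2, 2, 0, 0], num_classes=3))
```

Because cluster indices are arbitrary, scoring without such a mapping would penalize a clustering that is correct up to relabelling.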