| TURTLE (CLIP + DINOv2) | 0.989 | 0.995 | - | 0.985 | - | Let Go of Your Labels with Unsupervised Transfer | |
| TEMI CLIP ViT-L (openai) | 0.932 | 0.969 | ViT-L | 0.926 | Train | Exploring the Limits of Deep Image Clustering using Pretrained Models | |
| TEMI DINO ViT-B | 0.885 | 0.945 | ViT-B | 0.886 | Train | Exploring the Limits of Deep Image Clustering using Pretrained Models | |
| DPAC | 0.866 | 0.934 | ResNet-34 | 0.87 | - | Deep Online Probability Aggregation Clustering | |
| SPICE-BPA | 0.866 | 0.933 | ResNet-18 | 0.870 | - | The Balanced-Pairwise-Affinities Feature Transform | |
| SeCu | 0.857 | 0.93 | ResNet-18 | 0.861 | Train | Stable Cluster Discrimination for Deep Clustering | |
| SPICE* | 0.836 | 0.918 | ResNet-18 | 0.850 | Train | SPICE: Semantic Pseudo-labeling for Image Clustering | |
| IMC-SwAV (Best) | 0.8 | 0.897 | ResNet-18 | 0.818 | Train | Information Maximization Clustering via Multi-View Self-Labelling | |
| IMC-SwAV (Avg+-) | 0.79 | 0.891 | ResNet-18 | 0.811 | Train | Information Maximization Clustering via Multi-View Self-Labelling | |
| TCL | 0.780 | 0.887 | ResNet-34 | 0.819 | Train | Twin Contrastive Learning for Online Clustering | |
| SCAN | 0.772 | 0.883 | ResNet-18 | 0.797 | Train | SCAN: Learning to Classify Images without Labels | |
| SCAN (Avg) | 0.758 | 0.876 | ResNet-18 | 0.787 | Train | SCAN: Learning to Classify Images without Labels | |