Learning Representation for Clustering via Prototype Scattering and Positive Sampling

Zhizhong Huang Jie Chen Junping Zhang Hongming Shan

Abstract

Existing deep clustering methods rely on either contrastive or non-contrastive representation learning for the downstream clustering task. Contrastive methods use negative pairs to learn uniform representations for clustering, but those negative pairs can push apart instances of the same class (the class collision issue) and thereby compromise clustering performance. Non-contrastive methods avoid class collision, but their non-uniform representations may cause the clustering to collapse. To combine the strengths of both, this paper presents a novel end-to-end deep clustering method with prototype scattering and positive sampling, termed ProPos. First, a prototype scattering loss maximizes the distance between prototypical representations, which improves the uniformity of the representations. Second, positive sampling alignment aligns one augmented view of an instance with sampled neighbors of another view, which are assumed to form truly positive pairs in the embedding space, to improve within-cluster compactness. The strengths of ProPos are the avoidance of class collision, uniform representations, well-separated clusters, and within-cluster compactness. ProPos is optimized in an end-to-end expectation-maximization framework, and extensive experiments show that it achieves competitive performance on moderate-scale clustering benchmark datasets and establishes new state-of-the-art performance on large-scale datasets. Source code is available at https://github.com/Hzzone/ProPos.
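
The two objectives named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not give the formulas, so the prototype scattering loss is written here as an InfoNCE-style loss over cluster prototypes, and positive sampling is written as adding Gaussian noise to the other view's embedding. The function names, `temperature`, and `sigma` are hypothetical, and plain Python is used only to keep the example self-contained (the official repository is PyTorch).

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def prototype_scattering_loss(protos_a, protos_b, temperature=0.5):
    """Assumed InfoNCE-style form of a prototype scattering loss.

    protos_a[k] and protos_b[k] are the prototypes of cluster k computed
    from two augmented views. Each prototype is pulled towards its
    counterpart and pushed away from the other clusters' prototypes.
    """
    a = [l2_normalize(p) for p in protos_a]
    b = [l2_normalize(p) for p in protos_b]
    loss = 0.0
    for k in range(len(a)):
        pos = math.exp(dot(a[k], b[k]) / temperature)
        neg = sum(
            math.exp(dot(a[k], a[j]) / temperature)
            for j in range(len(a))
            if j != k
        )
        loss += -math.log(pos / (pos + neg))
    return loss / len(a)


def positive_sampling_alignment(view_a, view_b, sigma=0.001, rng=None):
    """Assumed form of positive sampling alignment.

    For each instance, a neighbor of the view-b embedding is sampled by
    adding Gaussian noise with standard deviation sigma, and the view-a
    embedding is aligned with it via a squared Euclidean distance.
    """
    if rng is None:
        rng = random.Random(0)
    total = 0.0
    for za, zb in zip(view_a, view_b):
        za = l2_normalize(za)
        zb = l2_normalize(zb)
        neighbor = [x + sigma * rng.gauss(0.0, 1.0) for x in zb]
        total += sum((p - q) ** 2 for p, q in zip(za, neighbor))
    return total / len(view_a)
```

Under these assumptions, mutually orthogonal prototypes yield a lower scattering loss than prototypes that all coincide, which is the sense in which the loss rewards well-separated clusters. The alignment term uses no negative pairs at all, which is how this sketch sidesteps class collision.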

Code Repositories

hzzone/propos (official, PyTorch)

Benchmarks

Benchmark                            Methodology  Backbone   Image Size  Accuracy  NMI    ARI
image-clustering-on-imagenet-10      ProPos*      ResNet-34  224         0.962     0.908  0.918
image-clustering-on-imagenet-10      ProPos       ResNet-34  96          0.956     0.896  0.906
image-clustering-on-imagenet-dog-15  ProPos*      ResNet-34  224         0.775     0.737  0.675
image-clustering-on-imagenet-dog-15  ProPos       ResNet-34  96          0.745     0.692  0.627