Ultra-Scalable Spectral Clustering and Ensemble Clustering

Huang Dong ; Wang Chang-Dong ; Wu Jian-Sheng ; Lai Jian-Huang ; Kwoh Chee-Keong

Abstract

This paper focuses on the scalability and robustness of spectral clustering for extremely large-scale datasets with limited resources. Two novel algorithms are proposed, namely, ultra-scalable spectral clustering (U-SPEC) and ultra-scalable ensemble clustering (U-SENC). In U-SPEC, a hybrid representative selection strategy and a fast approximation method for K-nearest representatives are proposed for the construction of a sparse affinity sub-matrix. By interpreting the sparse sub-matrix as a bipartite graph, the transfer cut is then utilized to efficiently partition the graph and obtain the clustering result. In U-SENC, multiple U-SPEC clusterers are further integrated into an ensemble clustering framework to enhance the robustness of U-SPEC while maintaining high efficiency. Based on the ensemble generation via multiple U-SPECs, a new bipartite graph is constructed between objects and base clusters and then efficiently partitioned to achieve the consensus clustering result. It is noteworthy that both U-SPEC and U-SENC have nearly linear time and space complexity, and are capable of robustly and efficiently partitioning ten-million-level nonlinearly-separable datasets on a PC with 64 GB memory. Experiments on various large-scale datasets have demonstrated the scalability and robustness of our algorithms. The MATLAB code and experimental data are available at https://www.researchgate.net/publication/330760669.
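The U-SPEC pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the authors' MATLAB code. The names `uspec_sketch`, `kmeans`, and `_sq_dists`, and all parameter defaults, are hypothetical. Two simplifications are assumed: the K nearest representatives are found by exact brute-force search rather than the paper's fast approximation, and the affinity sub-matrix is stored densely for readability.

```python
import numpy as np

def _sq_dists(a, b):
    # Squared Euclidean distances between rows of a (m x d) and b (q x d).
    d2 = (a ** 2).sum(1)[:, None] - 2.0 * a @ b.T + (b ** 2).sum(1)[None, :]
    return np.maximum(d2, 0.0)

def kmeans(x, k, rng, n_iter=20):
    # Plain Lloyd's k-means with farthest-first initialisation.
    centers = [x[rng.integers(len(x))]]
    for _ in range(1, k):
        d2 = _sq_dists(x, np.array(centers)).min(axis=1)
        centers.append(x[int(d2.argmax())])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        labels = _sq_dists(x, centers).argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers, _sq_dists(x, centers).argmin(axis=1)

def uspec_sketch(x, n_clusters, n_reps=40, n_candidates=200,
                 n_neighbors=5, seed=0):
    rng = np.random.default_rng(seed)
    n = len(x)
    # 1) Hybrid representative selection: random candidates, then k-means.
    cand = x[rng.choice(n, size=min(n_candidates, n), replace=False)]
    reps, _ = kmeans(cand, n_reps, rng)
    # 2) K nearest representatives per object (exact search; an assumption,
    #    the paper uses a faster approximation).
    d2 = _sq_dists(x, reps)
    nn = np.argsort(d2, axis=1)[:, :n_neighbors]
    nn_d2 = np.take_along_axis(d2, nn, axis=1)
    # 3) Affinity sub-matrix B (n x p) with a Gaussian kernel; only K entries
    #    per row are nonzero. Stored densely here for simplicity.
    sigma2 = max(np.sqrt(nn_d2).mean() ** 2, 1e-12)
    b = np.zeros((n, len(reps)))
    np.put_along_axis(b, nn, np.exp(-nn_d2 / (2.0 * sigma2)), axis=1)
    # 4) Transfer cut: solve the eigenproblem on the small p x p graph
    #    E = B^T D_X^{-1} B instead of the full (n + p)-node bipartite graph.
    d_x = b.sum(axis=1)                     # object degrees
    d_r = np.maximum(b.sum(axis=0), 1e-12)  # representative degrees
    e = b.T @ (b / d_x[:, None])
    s = e / np.sqrt(np.outer(d_r, d_r))     # D_R^{-1/2} E D_R^{-1/2}
    _, vecs = np.linalg.eigh((s + s.T) / 2.0)
    # The largest eigenvalues of s correspond to the smallest eigenvalues
    # of the generalized problem (D_R - E) v = lambda * D_R v.
    v = vecs[:, -n_clusters:] / np.sqrt(d_r)[:, None]
    # 5) Lift the representative embedding to objects, normalise rows,
    #    and discretise with k-means.
    h = (b / d_x[:, None]) @ v
    h /= np.maximum(np.linalg.norm(h, axis=1, keepdims=True), 1e-12)
    _, labels = kmeans(h, n_clusters, rng)
    return labels
```

The design point the sketch tries to show is that the eigendecomposition runs on a p x p matrix, where p is the number of representatives, so its cost does not grow with the number of objects n. A real implementation would keep B in a sparse format to preserve the near-linear space complexity claimed in the abstract.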

Benchmarks

Benchmark | Methodology | Metrics
image-document-clustering-on-pendigits | U-SPEC | NMI: 0.803; runtime (s): 1.01
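For readers unfamiliar with the NMI column, normalized mutual information compares a predicted clustering against ground-truth labels and is invariant to how the clusters are numbered. The sketch below is one common way to compute it. The function name `nmi` is hypothetical, and the normalization by the geometric mean of the two entropies is an assumption, since the listing does not state which variant was used for the reported score.

```python
import numpy as np

def nmi(labels_true, labels_pred):
    # Map arbitrary labels to 0..k-1 indices.
    t = np.unique(np.asarray(labels_true).ravel(), return_inverse=True)[1]
    p = np.unique(np.asarray(labels_pred).ravel(), return_inverse=True)[1]
    # Joint distribution over (true cluster, predicted cluster) pairs.
    joint = np.zeros((t.max() + 1, p.max() + 1))
    np.add.at(joint, (t, p), 1.0)
    joint /= joint.sum()
    pt, pp = joint.sum(axis=1), joint.sum(axis=0)
    # Mutual information, summing only over nonzero joint entries.
    nz = joint > 0
    mi = (joint[nz] * np.log(joint[nz] / np.outer(pt, pp)[nz])).sum()
    # Entropies of the two labelings.
    ht = -(pt * np.log(pt)).sum()
    hp = -(pp * np.log(pp)).sum()
    if ht == 0.0 or hp == 0.0:
        # Degenerate case: at least one labeling has a single cluster.
        return 1.0 if ht == hp else 0.0
    # Assumed convention: normalise by sqrt(H(true) * H(pred)).
    return float(mi / np.sqrt(ht * hp))
```

A score of 1 means the two partitions are identical up to relabeling, and a score near 0 means they are statistically independent.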
