HyperAIHyperAI

Command Palette

Search for a command to run...

5 months ago

Scalable Spectral Clustering Using Random Binning Features

Wu Lingfei ; Chen Pin-Yu ; Yen Ian En-Hsu ; Xu Fangli ; Xia Yinglong ; Aggarwal Charu

Scalable Spectral Clustering Using Random Binning Features

Abstract

Spectral clustering is one of the most effective clustering approaches thatcapture hidden cluster structures in the data. However, it does not scale wellto large-scale problems due to its quadratic complexity in constructingsimilarity graphs and computing subsequent eigendecomposition. Although anumber of methods have been proposed to accelerate spectral clustering, most ofthem compromise considerable information loss in the original data for reducingcomputational bottlenecks. In this paper, we present a novel scalable spectralclustering method using Random Binning features (RB) to simultaneouslyaccelerate both similarity graph construction and the eigendecomposition.Specifically, we implicitly approximate the graph similarity (kernel) matrix bythe inner product of a large sparse feature matrix generated by RB. Then weintroduce a state-of-the-art SVD solver to effectively compute eigenvectors ofthis large matrix for spectral clustering. Using these two building blocks, wereduce the computational cost from quadratic to linear in the number of datapoints while achieving similar accuracy. Our theoretical analysis shows thatspectral clustering via RB converges faster to the exact spectral clusteringthan the standard Random Feature approximation. Extensive experiments on 8benchmarks show that the proposed method either outperforms or matches thestate-of-the-art methods in both accuracy and runtime. Moreover, our methodexhibits linear scalability in both the number of data samples and the numberof RB features.

Code Repositories

IBM/SpectralClustering_RandomBinning
Official
Mentioned in GitHub

Benchmarks

BenchmarkMethodologyMetrics
image-document-clustering-on-pendigitsSC_RB
runtime (s): 1.8

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp
Scalable Spectral Clustering Using Random Binning Features | Papers | HyperAI