Fast modified global k-means algorithm for incremental cluster construction

Adil M. Bagirov, Julien Ugon, Dean Webb • PR, 2011 • Presented by Wen-Chung Liao, 2011/01/05

Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

Motivation • The global k-means algorithm and the modified global k-means algorithm are incremental clustering algorithms. • They allow one to find a global or near-global minimizer of the cluster (or error) function. • However, these algorithms are memory demanding: they require storing the affinity matrix. • Alternatively, this matrix can be recomputed at each iteration, but this increases the computational time significantly.

Objectives • A new version of the modified global k-means algorithm is proposed: • An auxiliary cluster function is applied to generate a set of starting points lying in different parts of the dataset. • The best solution is selected as the starting point for the next cluster center. • Information gathered in previous iterations of the incremental algorithm is used to avoid computing the whole affinity matrix. • The triangle inequality for distances is used to avoid unnecessary computations.

Methodology • Modified global k-means algorithm [1] • Starts with the computation of one cluster center and attempts to optimally add one new cluster center at each iteration. • An auxiliary cluster function, using the k-1 cluster centers from the (k-1)-th iteration, is used to compute the starting point for the k-th center. • The k-means algorithm is applied starting from this point to find the k-partition of the dataset. • Fast modified global k-means algorithm • Uses the auxiliary cluster function to generate a set of starting points. • The best solution is selected. • Avoids computing the whole affinity matrix.
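The incremental scheme described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`kmeans`, `incremental_kmeans`) are hypothetical, candidate starting points are restricted to the data points themselves, and the auxiliary function is assumed to be the sum over all points of the smaller of the current nearest-center squared distance and the squared distance to the candidate.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def centroid(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(pts)
    return tuple(sum(coord) / n for coord in zip(*pts))


def kmeans(points, centers, max_iter=100):
    """Plain k-means (Lloyd iterations) from the given starting centers."""
    centers = list(centers)
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for a in points:
            j = min(range(len(centers)), key=lambda j: sq_dist(a, centers[j]))
            groups[j].append(a)
        # An empty cluster keeps its old center.
        new = [centroid(g) if g else c for g, c in zip(groups, centers)]
        if new == centers:
            break
        centers = new
    return centers


def incremental_kmeans(points, k):
    """Add one center per iteration, starting from the centroid of the data."""
    centers = [centroid(points)]
    while len(centers) < k:
        # Squared distance of every point to its nearest existing center.
        d_prev = [min(sq_dist(a, x) for x in centers) for a in points]
        # Assumed auxiliary function, minimized over data points only.
        start = min(
            points,
            key=lambda y: sum(min(d, sq_dist(y, a)) for a, d in zip(points, d_prev)),
        )
        centers = kmeans(points, centers + [start])
    return centers


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(sorted(incremental_kmeans(points, 2)))  # [(0.0, 1.0), (10.0, 1.0)]
```

Each pass reuses the k-1 centers already found, so only the new center needs a starting point.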

Modified global k-means algorithm cluster function
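The formula on this slide did not survive extraction. As a hedged illustration, the sketch below assumes the usual minimum sum-of-squares form: the mean, over all data points, of the squared Euclidean distance to the nearest center. The name `cluster_function` and the sample data are made up for the example.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def cluster_function(points, centers):
    """Mean squared distance from each point to its nearest center (assumed form)."""
    return sum(min(sq_dist(a, x) for x in centers) for a in points) / len(points)


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(cluster_function(points, [(5.0, 1.0)]))               # 26.0
print(cluster_function(points, [(0.0, 1.0), (10.0, 1.0)]))  # 1.0
```

Adding a well-placed second center drops the value from 26.0 to 1.0, which is the quantity the incremental algorithms try to minimize.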

Modified global k-means algorithm • The solution to the (k-1)-partition problem • Auxiliary cluster function • [Figure: labels y, S(y), x1, x2, x3]
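The auxiliary cluster function itself is not legible in this transcript. A common form in the global k-means literature, assumed here, scores a candidate point y by averaging, over all data points, the smaller of two values: the squared distance to the nearest of the existing k-1 centers, and the squared distance to y. The sketch below evaluates that assumed form and picks the best candidate among the data points. All names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def nearest_sq_dists(points, centers):
    """Squared distance of each point to its nearest existing center."""
    return [min(sq_dist(a, x) for x in centers) for a in points]


def auxiliary_function(y, points, d_prev):
    """Assumed auxiliary function: a point contributes its distance to y
    only when y is closer to it than its current nearest center."""
    return sum(min(d, sq_dist(y, a)) for a, d in zip(points, d_prev)) / len(points)


def best_data_point_start(points, d_prev):
    """Candidate (restricted to data points) with the lowest auxiliary value."""
    return min(points, key=lambda y: auxiliary_function(y, points, d_prev))


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
d_prev = nearest_sq_dists(points, [(1.0, 0.0)])   # one existing center
print(d_prev)                                     # [1.0, 0.0, 1.0, 81.0]
print(best_data_point_start(points, d_prev))      # (10.0, 0.0)
```

The isolated point at (10, 0) wins because placing a center there removes the largest contribution (81) from the sum.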

Modified global k-means algorithm

Fast modified global k-means algorithm • [Figure: two panels, u = 1.0 and u = 0.2, with labels y, x1, x2, x3 and the sets S1.0(y) and S0.2(y)]

Reduction of computational effort • [Figure: labels S(ai), ai, aj, x1]

Reduction of computational effort
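The details of the reduction schemes are not preserved here, so the following is only a generic sketch of how the triangle inequality can skip distance evaluations, not necessarily the scheme used in the paper. Assume the distance of every data point to a common reference point (for example, a cluster center) is already stored. For two data points a_i and a_j, the distance between them is at least the absolute difference of their stored distances. If that lower bound, squared, is already no smaller than a_j's current nearest-center squared distance, a_j cannot be closer to a_i than to its own center, and the pair can be skipped. The function name and arguments are hypothetical.

```python
import math


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def attracted_points(i, points, ref_dists, d_prev):
    """Indices j whose squared distance to points[i] is below d_prev[j],
    together with the number of distances actually evaluated.

    ref_dists[j]: stored distance of points[j] to a common reference point.
    d_prev[j]:    squared distance of points[j] to its nearest current center.
    """
    hits, evaluations = [], 0
    for j, a_j in enumerate(points):
        lower = abs(ref_dists[i] - ref_dists[j])  # triangle-inequality bound
        if lower * lower >= d_prev[j]:
            continue  # a_j cannot be attracted by points[i]; skip the evaluation
        evaluations += 1
        if dist(points[i], a_j) ** 2 < d_prev[j]:
            hits.append(j)
    return hits, evaluations


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
center = (1.0, 0.0)                      # single existing center, used as reference
ref_dists = [dist(a, center) for a in points]
d_prev = [r * r for r in ref_dists]
print(attracted_points(3, points, ref_dists, d_prev))  # ([3], 1)
```

For the candidate at index 3, three of the four pairwise distances are pruned without being computed.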

Computational complexity • The modified global k-means algorithm: O(m k^2 T + k m^2 + k m t) • The fast modified global k-means algorithm: O(p(m k^2 T + k m^2 + k m t)) without complexity reduction schemes; O(p(m k^2 T + k m1^2 + k m1 t)) with complexity reduction schemes • T: the number of iterations by Algorithm 2 • t: the number of iterations by Algorithm 1 • m1: the number of data points in the set P(u)∩A, with m1 << m.

Numerical experiments • k: the number of clusters • fopt: the best known value of the cluster function × m • E: the error in % • α: the number of Euclidean norm evaluations • t: the CPU time

[Figure: panels for fFMGKM/fGKM, fFMGKM/fMGKM, α, and CPU time]

Dunn's validity index and the Davies–Bouldin cluster validity measure • Show a similar pattern. • Generate similar cluster structures.
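For readers unfamiliar with these measures, here is a small sketch of the Davies–Bouldin index in its standard textbook form: for each cluster, take the worst-case ratio of summed within-cluster scatter to between-centroid distance, then average over clusters. Lower values indicate more compact, better separated clusters. The function name and sample clusters are illustrative and are not taken from the slides.

```python
import math


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def centroid(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(pts)
    return tuple(sum(coord) / n for coord in zip(*pts))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points).

    Requires at least two clusters with distinct centroids.
    """
    cents = [centroid(c) for c in clusters]
    # Scatter: mean distance of a cluster's points to its centroid.
    scatter = [
        sum(dist(a, c) for a in pts) / len(pts) for pts, c in zip(clusters, cents)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
    return total / k


clusters = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
print(davies_bouldin(clusters))  # 0.2
```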

Conclusions • Developed a new version of the modified global k-means algorithm. • Uses the k-1 cluster centers from the previous iteration to solve the k-partition problem. • Does not rely on the affinity matrix to compute the starting point. • Uses more than one starting point to minimize the auxiliary function. • Two schemes reduce the amount of computational effort. • There is no guarantee that it will converge to the global solution.

Comments • Advantages • Schemes to reduce computational effort. • Shortcomings • Determining the set U is not easy. • Applications • Clustering
