
Fast modified global k-means algorithm for incremental cluster construction

Fast modified global k-means algorithm for incremental cluster construction. Adil M. Bagirov, Julien Ugon, Dean Webb, Pattern Recognition (PR), 2011. Presented by Wen-Chung Liao, 2011/01/05. Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments.


Presentation Transcript


  1. Fast modified global k-means algorithm for incremental cluster construction • Adil M. Bagirov, Julien Ugon, Dean Webb, Pattern Recognition (PR), 2011 • Presented by Wen-Chung Liao, 2011/01/05

  2. Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments

  3. Motivation • The global k-means algorithm and the modified global k-means algorithm are incremental clustering algorithms. • They allow one to find a global or near-global minimizer of the cluster (or error) function. • However, these algorithms are memory demanding: they require storage of the affinity matrix. • Alternatively, this matrix can be recomputed at each iteration, but this increases the computational time significantly.

  4. Objectives • A new version of the modified global k-means algorithm is proposed: • An auxiliary cluster function is applied to generate a set of starting points lying in different parts of the dataset. • The best solution is selected as the starting point for the next cluster center. • Information gathered in previous iterations of the incremental algorithm is used to avoid computing the whole affinity matrix. • The triangle inequality for distances is used to avoid unnecessary computations.

  5. Methodology • Modified global k-means algorithm [1] • Starts with the computation of one cluster center and attempts to optimally add one new cluster center at each iteration. • An auxiliary cluster function uses the k-1 cluster centers from the (k-1)-th iteration to compute the starting point for the k-th center. • The k-means algorithm is applied starting from this point to find the k-partition of the dataset. • Fast modified global k-means algorithm • Uses the auxiliary cluster function to generate a set of starting points. • The best solution is selected. • Avoids computing the whole affinity matrix.
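The incremental scheme on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the starting point for each new center is simply the data point that minimizes the auxiliary function, which is a simplification of the local search described in the paper.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def kmeans(points, centers, max_iter=100):
    """Plain k-means (Lloyd) iterations from the given starting centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            clusters[j].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def auxiliary(points, d_prev, y):
    """Average of min(d_prev[i], ||y - a_i||^2): error if y were added as a center."""
    return sum(min(d, sq_dist(a, y)) for a, d in zip(points, d_prev)) / len(points)

def incremental_kmeans(points, k):
    # Step 1: the first center is the centroid of the whole dataset.
    centers = [tuple(sum(col) / len(points) for col in zip(*points))]
    while len(centers) < k:
        # Squared distance of each point to its nearest existing center.
        d_prev = [min(sq_dist(a, x) for x in centers) for a in points]
        # Assumption: candidate starting points are the data points themselves.
        start = min(points, key=lambda y: auxiliary(points, d_prev, y))
        # Refine all centers with k-means, starting from the enlarged set.
        centers = kmeans(points, centers + [tuple(start)])
    return centers
```

Calling `incremental_kmeans(points, k)` on a list of coordinate tuples returns k centers, adding one center per outer iteration.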

  6. Modified global k-means algorithm • Cluster function (formula not preserved in the transcript)

  7. Modified global k-means algorithm • The auxiliary cluster function is built from the solution to the (k-1)-partition problem. • [Figure: a point y, its set S(y), and cluster centers x1, x2, x3]
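Since the formula on this slide was lost, the following sketch shows one common way to write the auxiliary cluster function and the set S(y). It assumes the usual definitions from the modified global k-means literature: each data point contributes the smaller of its squared distance to the nearest existing center and its squared distance to the candidate y, and S(y) collects the points that y would attract. The function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def nearest_sq_dists(points, centers):
    """d_prev[i]: squared distance of point i to its nearest existing center."""
    return [min(sq_dist(a, x) for x in centers) for a in points]

def improving_set(points, d_prev, y):
    """S(y): points strictly closer to y than to their nearest existing center."""
    return [a for a, d in zip(points, d_prev) if sq_dist(a, y) < d]

def auxiliary_value(points, d_prev, y):
    """Average over all points of min(d_prev[i], ||y - a_i||^2)."""
    return sum(min(d, sq_dist(a, y)) for a, d in zip(points, d_prev)) / len(points)

# Small example: one existing center between the two left-hand points.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
d_prev = nearest_sq_dists(points, [(0.5, 0.0)])
y = (4.5, 0.0)
members = improving_set(points, d_prev, y)   # the two right-hand points
value = auxiliary_value(points, d_prev, y)   # 0.25
```

Only the points in S(y) change their contribution when y is added, which is why the auxiliary function can be minimized over y without touching the existing k-1 centers.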

  8. Modified global k-means algorithm

  9. Fast modified global k-means algorithm • [Figure: two panels, u = 0.2 and u = 1.0, showing a point y, the sets S0.2(y) and S1.0(y), and cluster centers x1, x2, x3]

  10. Reduction of computational effort • [Figure: data points ai and aj, the set S(ai), and cluster center x1]

  11. Reduction of computational effort
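The details of the reduction schemes are not preserved in the transcript. The sketch below shows one way the triangle inequality can skip distance evaluations when building S(y); it is an illustration under stated assumptions, not necessarily the scheme used in the paper. If x is the nearest center of a point a, with r = ||a - x||, and ||y - x|| >= 2r, then ||y - a|| >= ||y - x|| - r >= r, so a cannot belong to S(y). The function name and the `evaluated` counter are hypothetical.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def improving_set_pruned(points, centers, y):
    """Return S(y) and the number of point-to-y distances actually evaluated."""
    # Nearest center index and squared distance for each point. In an
    # incremental algorithm these would be carried over from the previous
    # iteration rather than recomputed here.
    nearest = []
    for a in points:
        j = min(range(len(centers)), key=lambda c: sq_dist(a, centers[c]))
        nearest.append((j, sq_dist(a, centers[j])))
    # Distances from the candidate y to the (few) existing centers.
    y_to_center = [math.sqrt(sq_dist(y, x)) for x in centers]
    members, evaluated = [], 0
    for a, (j, d) in zip(points, nearest):
        if y_to_center[j] >= 2.0 * math.sqrt(d):
            continue  # pruned: the triangle inequality rules a out of S(y)
        evaluated += 1
        if sq_dist(a, y) < d:
            members.append(a)
    return members, evaluated

points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
members, evaluated = improving_set_pruned(points, [(0.5, 0.0)], (4.5, 0.0))
```

In this example the two points next to the existing center are skipped without computing their distance to y, so only two of the four distances are evaluated.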

  12. Computational complexity • The modified global k-means algorithm: O(mk^2T + km^2 + kmt) • The fast modified global k-means algorithm: O(p(mk^2T + km^2 + kmt)) without complexity reduction schemes; O(p(mk^2T + km1^2 + km1t)) with complexity reduction schemes • T: the number of iterations by Algorithm 2 • t: the number of iterations by Algorithm 1 • m1: the number of data points in the set P(u)∩A, with m1 << m

  13. Numerical experiments • k: the number of clusters • fopt: the best known value of the cluster function × m • E: the error in % • α: the number of Euclidean norm evaluations • t: the CPU time

  14. [Figure panels: fFMGKM/fGKM, fFMGKM/fMGKM, α, CPU time]

  15. Dunn's validity index and the Davies–Bouldin cluster validity measure • Both show a similar pattern. • The algorithms generate similar cluster structures.

  16. Conclusions • A new version of the modified global k-means algorithm was developed. • It uses the k-1 cluster centers from the previous iteration to solve the k-partition problem. • It does not rely on the affinity matrix to compute the starting point. • It uses more than one starting point to minimize the auxiliary function. • Two schemes reduce the amount of computational effort. • There is no guarantee that it will converge to the global solution.

  17. Comments • Advantages • Schemes to avoid computational effort. • Shortcomings • Determining the set U is not easy. • Applications • Clustering
