
Clustering

Clustering is “the process of organizing objects into groups whose members are similar in some way”. A cluster is therefore a collection of objects that are “similar” to one another and “dissimilar” to the objects belonging to other clusters.

Presentation Transcript


  1. Clustering

  2. Definition • Clustering is “the process of organizing objects into groups whose members are similar in some way”. • A cluster is therefore a collection of objects that are “similar” to one another and “dissimilar” to the objects belonging to other clusters.

  3. Clustering is the grouping of records, observations, or cases into classes of similar objects. • Clustering algorithms include EM and Fuzzy C-Means.

  4. Clustering Main Features • Clustering – a data mining technique • Usage: • Statistical Data Analysis • Machine Learning • Data Mining • Pattern Recognition • Image Analysis • Bioinformatics

  5. Notion of a Cluster can be Ambiguous • How many clusters? [Figure panels: Two Clusters; Four Clusters; Six Clusters]

  6. Distance-based method • In this case we can easily identify the 4 clusters into which the data can be divided; the similarity criterion is distance: two or more objects belong to the same cluster if they are “close” according to a given distance. This is called distance-based clustering.
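
The "close according to a given distance" criterion on slide 6 can be illustrated with a minimal sketch. The slide does not say which distance is used; Euclidean distance is assumed here, and the names `euclidean`, `is_close`, and the `threshold` value are hypothetical, chosen only for illustration.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of the same dimension (an assumed choice)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def is_close(p, q, threshold):
    """Treat two objects as 'close' when their distance does not exceed a threshold.
    The threshold is an illustrative assumption, not part of the slides."""
    return euclidean(p, q) <= threshold

a, b, c = (1.0, 1.0), (2.0, 1.5), (9.0, 8.0)
print(is_close(a, b, threshold=2.0))  # True: a and b would fall in the same cluster
print(is_close(a, c, threshold=2.0))  # False: c lies far from a
```

Any other distance function (for example Manhattan distance) could be swapped in for `euclidean` without changing the idea.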

  7. Limitations of K-means: Non-globular Shapes [Figure panels: Original Points; K-means (2 Clusters)]

  8. Limitations of K-means: Differing Sizes [Figure panels: Original Points; K-means (3 Clusters)]

  9. Types of Clustering • Hierarchical: finding new clusters using previously found ones • Partitional: finding all clusters at once

  10. Partitional Clustering [Figure panels: Original Points; A Partitional Clustering]

  11. Hierarchical Clustering [Figure panels: Traditional Hierarchical Clustering; Traditional Dendrogram; Non-traditional Hierarchical Clustering; Non-traditional Dendrogram]

  12. K-Means Clustering Algorithm Steps of the K-Means algorithm: • Decide how many clusters to form, k. • Randomly pick k existing records as the initial cluster centers. • For each record, determine its nearest cluster center. • Update the cluster centers. • Determine the nearest cluster center for each record again, and repeat until the ratio value no longer increases.
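
The steps on slide 12 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the presenter's implementation: Euclidean distance is assumed, the function name `kmeans` and its parameters `max_iter` and `seed` are hypothetical, and the loop stops when the assignments stop changing, which is a simpler stand-in for the slide's "ratio no longer increases" rule.

```python
import math
import random

def kmeans(records, k, max_iter=100, seed=0):
    """Minimal K-Means sketch: records is a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Step 2: randomly pick k existing records as the initial centers.
    centers = rng.sample(records, k)
    assignment = None
    for _ in range(max_iter):
        # Step 3: assign every record to its nearest center (Euclidean, assumed).
        new_assignment = [
            min(range(k), key=lambda j: math.dist(r, centers[j]))
            for r in records
        ]
        # Stand-in stopping rule: stop once no record changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 4: move each center to the mean of the records assigned to it.
        for j in range(k):
            members = [r for r, a in zip(records, assignment) if a == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment

records = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (9.0, 9.0), (9.0, 8.0), (8.0, 9.0)]
centers, labels = kmeans(records, k=2)
print(centers)
print(labels)
```

Which cluster receives label 0 depends on the random initial centers, so only the grouping, not the label values, is meaningful.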

  13. Formula for the distance between two points: • Between Cluster Variation (BCV): BCV = d(m1,m2) + d(m1,m3) + d(m2,m3), where d(mi,mj) denotes the distance from mi to mj. • Within Cluster Variation (WCV): WCV = sum of (distance from each record to its nearest cluster center)^2
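
The BCV and WCV quantities on slide 13 can be computed with a short sketch. Several points are assumptions rather than statements from the slides: the distance d is taken to be Euclidean (the distance formula itself did not survive in the transcript), WCV is read as a sum of squared distances over all records, and the "ratio" mentioned on slide 12 is taken to be BCV / WCV. The function names are hypothetical.

```python
import math
from itertools import combinations

def bcv(centers):
    """Sum of distances over every pair of cluster centers.
    For three centers this is d(m1,m2) + d(m1,m3) + d(m2,m3)."""
    return sum(math.dist(a, b) for a, b in combinations(centers, 2))

def wcv(records, centers):
    """Sum over records of the squared distance to the nearest center (assumed reading)."""
    return sum(min(math.dist(r, c) for c in centers) ** 2 for r in records)

def bcv_wcv_ratio(records, centers):
    """Assumed stopping criterion: larger values mean well-separated, compact clusters."""
    return bcv(centers) / wcv(records, centers)

centers = [(0.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
records = [(0.0, 1.0), (3.0, 4.0), (0.0, 4.0), (1.0, 0.0)]
print(bcv(centers))                     # 5 + 4 + 3
print(wcv(records, centers))            # 1 + 0 + 0 + 1
print(bcv_wcv_ratio(records, centers))
```

Under this reading, an iteration of K-Means that increases the ratio has either pushed the centers apart or pulled records closer to their centers.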
