
A Robust and Efficient Clustering Algorithm based on Cohesion Self-Merging

Advisor: Dr. Hsu. Graduate: Sheng-Hsuan Wang. Authors: Cheng-Ru Lin, Ming-Syan Chen. Outline: Motivation, Objective, Introduction, Preliminaries, Cohesion-Based Self-Merging Algorithm.


Presentation Transcript


  1. A Robust and Efficient Clustering Algorithm based on Cohesion Self-Merging • Advisor: Dr. Hsu • Graduate: Sheng-Hsuan Wang • Authors: Cheng-Ru Lin, Ming-Syan Chen

  2. Outline • Motivation • Objective • Introduction • Preliminaries • Cohesion-Based Self-Merging Algorithm • Performance Studies • Conclusion • Personal Opinion

  3. Motivation • Dissimilarity measures between two clusters are vulnerable to outliers, and removing outliers precisely is itself a difficult task.

  4. Objective • We propose a new similarity measurement, referred to as “cohesion”, to measure the inter-cluster distances.

  5. Introduction • Hierarchical clustering algorithms: good clustering quality. • Partitional clustering algorithms: good execution time and space requirements. • Hybrid clustering algorithms: combine the features of partitional and hierarchical clustering methods.

  6. Introduction

  7. Preliminaries • Hierarchical Clustering Algorithms. • Hierarchical Clustering Algorithm. • Single-link and Complete-link. • Algorithm CURE.
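
The single-link and complete-link measures named on this slide can be sketched as follows. This is a minimal illustration, not code from the paper; the function names and the toy points are assumptions. It also shows the weakness raised in the Motivation slide: one stray point is enough to collapse the single-link distance.

```python
import math
from itertools import product

def single_link(a, b):
    """Smallest pairwise distance between points of clusters a and b."""
    return min(math.dist(p, q) for p, q in product(a, b))

def complete_link(a, b):
    """Largest pairwise distance between points of clusters a and b."""
    return max(math.dist(p, q) for p, q in product(a, b))

# Two well-separated toy clusters (hypothetical data).
left = [(0.0, 0.0), (0.0, 1.0)]
right = [(10.0, 0.0), (10.0, 1.0)]
print(single_link(left, right), complete_link(left, right))

# A single outlier assigned to the left cluster collapses the
# single-link distance, even though the bulk of the points is unchanged.
left_with_outlier = left + [(9.0, 0.0)]
print(single_link(left_with_outlier, right))
```

Because single-link depends on one closest pair, an outlier between two clusters can make them look adjacent; that is the sensitivity the cohesion measure is meant to reduce.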

  8. Preliminaries • Partitional Clustering Algorithms. • The K-means algorithm. • Algorithm CLARA and CLARANS.

  9. Preliminaries • Hybrid Clustering Algorithms. • Phase 1: Partition. • Phase 2: Merge. • Algorithm BIRCH.

  10. Cohesion-Based Self-Merging Algorithm • We propose a new similarity measurement, namely cohesion, based on the joinability of a data point to another cluster.

  11. Cohesion-Based Self-Merging Algorithm

  12. Cohesion-Based Self-Merging Algorithm • Definition 1: • Given a cluster Cl consisting of n data points, p1,p2,…,pn, the radius r of Cl is defined as
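
The formula for Definition 1 is missing from the transcript. A minimal sketch, assuming the radius is the root-mean-square distance from the points to the cluster centroid (the convention used by BIRCH); the helper names `centroid` and `radius` are illustrative, not taken from the paper.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))

def radius(points):
    """Assumed definition: root-mean-square distance to the centroid."""
    c = centroid(points)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in points) / len(points))

# Four corners of a 2 x 2 square (hypothetical data).
cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
print(centroid(cluster), radius(cluster))
```

Under this assumption the radius summarizes how widely a cluster spreads around its center using all of its points, so one distant point shifts it less than it would shift a maximum-distance measure.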

  13. Cohesion-Based Self-Merging Algorithm • Definition 2: • Given a data point of a cluster and another cluster, the joinability of the point to the other cluster is defined as

  14. Cohesion-Based Self-Merging Algorithm • Definition 3: • The cohesion of two clusters is defined as
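
The formulas for joinability and cohesion did not survive in the transcript, so the following is only an illustrative sketch and should not be read as the authors' definitions. It assumes that each cluster is scored by an unnormalized Gaussian centered at its centroid with spread equal to its radius, that joinability is the smaller of a point's scores under its own and the other cluster, and that cohesion is the average joinability over the points of both clusters. All function names are hypothetical.

```python
import math

def centroid(points):
    n, dims = len(points), len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))

def radius(points):
    # Assumed: root-mean-square distance to the centroid (must be > 0 here).
    c = centroid(points)
    return math.sqrt(sum(math.dist(p, c) ** 2 for p in points) / len(points))

def score(p, points):
    """Assumed model: unnormalized Gaussian score of p under a cluster."""
    c, r = centroid(points), radius(points)
    return math.exp(-math.dist(p, c) ** 2 / (2 * r ** 2))

def joinability(p, own, other):
    """Assumed form: how well p fits both its own and the other cluster."""
    return min(score(p, own), score(p, other))

def cohesion(a, b):
    """Assumed form: mean joinability over all points of both clusters."""
    total = sum(joinability(p, a, b) for p in a)
    total += sum(joinability(p, b, a) for p in b)
    return total / (len(a) + len(b))

# Hypothetical data: one base cluster, a nearby copy, and a distant copy.
base = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
near = [(x + 1.5, y + 1.5) for x, y in base]
far = [(x + 20.0, y + 20.0) for x, y in base]
print(cohesion(base, near), cohesion(base, far))
```

Because the result is an average over every point, a single outlier contributes only one small term, which is the intuition behind the outlier resistance claimed for cohesion.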

  15. Cohesion-Based Self-Merging Algorithm

  16. Cohesion-Based Self-Merging Algorithm • Algorithm CSM • Input: • The input data set, n. • The number of subclusters, m. • The desired number of clusters, k. • Output: • The hierarchical structure of the k clusters.
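
The merge phase implied by these inputs and outputs can be sketched as below. This is a sketch under stated assumptions, not the authors' implementation: it assumes phase 1 has already produced the m subclusters (for example with k-means), and it takes the similarity measure as a parameter. The demo plugs in negative centroid distance purely as a stand-in for cohesion; `merge_subclusters` and `neg_centroid_distance` are hypothetical names.

```python
import math
from itertools import combinations

def merge_subclusters(subclusters, k, similarity):
    """Merge the most similar pair repeatedly until k clusters remain.

    Returns the final clusters and the merge history, which records
    the hierarchy in the order the merges happened.
    """
    clusters = [list(c) for c in subclusters]
    history = []
    while len(clusters) > k:
        # Pick the pair of cluster indices with the highest similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: similarity(clusters[ij[0]], clusters[ij[1]]),
        )
        history.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters, history

def centroid(points):
    n, dims = len(points), len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))

def neg_centroid_distance(a, b):
    # Stand-in similarity for the demo only; cohesion would go here.
    return -math.dist(centroid(a), centroid(b))

# m = 4 hypothetical subclusters, merged down to k = 2 clusters.
subclusters = [
    [(0, 0), (0, 1)], [(1, 0), (1, 1)],
    [(10, 0), (10, 1)], [(11, 0), (11, 1)],
]
clusters, history = merge_subclusters(subclusters, 2, neg_centroid_distance)
print(clusters)
```

Working on m subclusters instead of all n points keeps the number of pairwise comparisons small, which is the usual reason hybrid methods run faster than purely hierarchical ones.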

  17. Cohesion-Based Self-Merging Algorithm

  18. Cohesion-Based Self-Merging Algorithm

  19. Performance Studies • Experiment 1: Clustering Quality of Algorithm CSM.

  20. Performance Studies

  21. Performance Studies

  22. Performance Studies • Experiment 2: Efficiency of Algorithm CSM.

  23. Performance Studies

  24. Conclusion • Algorithm CSM not only resists outliers but also produces clustering results similar to those of algorithm CURE, with a much shorter execution time.

  25. Personal Opinion • The paper includes examples that help the reader understand the method. • Cohesion: a good method for resisting outliers. • Weakness: how should the number of subclusters, m, and the desired number of clusters, k, be chosen?
