
Spectral Clustering

Spectral Clustering. Jianping Fan Dept of CS UNC-Charlotte. http://webpages.uncc.edu/jfan/itcs4122.html. Inter-cluster distances are maximized. Intra-cluster distances are minimized. Key issues for Data Clustering. Objective Function. Similarity or distance function


Presentation Transcript


  1. Spectral Clustering Jianping Fan Dept of CS UNC-Charlotte http://webpages.uncc.edu/jfan/itcs4122.html

  2. Key issues for Data Clustering. Objective function: inter-cluster distances are maximized; intra-cluster distances are minimized. • Similarity or distance function • Inter-cluster similarity or distance • Intra-cluster similarity or distance • Number of clusters K • Decision for data clustering

  3. Summary of K-means. Problems of K-means: • Locations of centers • Number of clusters K • Sensitivity to outliers • Data manifolds (shapes of data distributions). Experiences: • Centers: random & density scan • K: start from a small K and separate iteratively, or start from a large K and merge sequentially • Outliers:

  4. Problems of K-means. Inter-cluster distances are maximized; intra-cluster distances are minimized. Distance function: geometric distance. Optimization step: Assignment step:
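
The formulas for the two steps named on this slide did not survive in the transcript. As a minimal sketch, assuming the standard Lloyd iteration with squared Euclidean distance (the function names `squared_distance` and `kmeans` and the `seed` parameter are illustrative, not taken from the slides), the assignment step and the optimization (center update) step could look like this:

```python
import random

def squared_distance(p, q):
    """Squared Euclidean ("geometric") distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=100, seed=0):
    """Plain K-means sketch: alternate assignment and center update."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers (assumption)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Optimization step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers

blob_points = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11)]
blob_labels, blob_centers = kmeans(blob_points, k=2)
```

On two compact, well-separated blobs like these the iteration recovers the groups. Because each point is assigned purely by distance to a center, the same procedure has no way to follow an elongated or curved data manifold.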

  5. Problems of K-means: The similarity function cannot handle special data manifolds effectively! Intra-cluster similarity and inter-cluster similarity are not optimized jointly! Pre-selected locations of cluster centers may not be acceptable!

  6. K-Means Clustering: Why does K-means fail? [Figure: expected vs. achieved clustering]

  7. Why K-Means Clustering Fails? [Figure: expected vs. achieved clustering] Objective function: • Similarity or distance function • Inter-cluster similarity or distance • Intra-cluster similarity or distance • Number of clusters K • Decision for data clustering

  8. Why K-Means Clustering Fails? [Figure: expected vs. achieved clustering] The number of clusters K may not be an issue here. Objective function?

  9. Why K-Means Clustering Fails? [Figure: expected vs. achieved clustering] Data manifold: relationship rather than distance. Distance function & decision for data clustering.

  10. Key issues for Data Clustering • Inter-cluster similarity or distance • Intra-cluster similarity or distance • Number of clusters K • Decision for data clustering • Similarity or distance function

  11. Lecture Outline • Motivation • Graph overview and construction • Spectral Clustering • Cool implementations

  12. Spectral Clustering Example: 2 Spirals (Relationship vs. Geometry Distance) • The dataset exhibits complex cluster shapes. • K-means performs very poorly in this space due to its bias toward dense spherical clusters. • In the embedded space given by the two leading eigenvectors, the clusters are trivial to separate.
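
To make the embedding idea concrete, here is a small sketch under stated assumptions: it uses two concentric rings as a simpler stand-in for the two spirals, a fully connected Gaussian similarity graph, and the unnormalized graph Laplacian. The helper names `ring` and `spectral_bipartition` and the value of `sigma` are illustrative choices, not taken from the slides.

```python
import numpy as np

def ring(radius, n):
    """n evenly spaced points on a circle of the given radius."""
    angles = 2 * np.pi * np.arange(n) / n
    return np.column_stack((radius * np.cos(angles), radius * np.sin(angles)))

def spectral_bipartition(points, sigma=0.5):
    """Split points in two using the sign of the Fiedler vector."""
    # Gaussian similarity between every pair of points (the "relationship").
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    W = np.exp(-sq_dist / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(L)
    fiedler = eigenvectors[:, 1]  # eigenvector of the 2nd smallest eigenvalue
    return (fiedler > 0).astype(int)

# Inner ring (radius 1) followed by outer ring (radius 3), 20 points each.
ring_points = np.vstack((ring(1.0, 20), ring(3.0, 20)))
ring_labels = spectral_bipartition(ring_points)
```

In this toy setting the similarities across the gap between the rings are tiny compared with those along each ring, so the second eigenvector is nearly constant on each ring and its sign separates them. A centroid-based method cannot do this, since both rings share the same center.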

  13. Spectral Clustering Relationship Objective Function • Similarity representation • Inter-cluster similarity • Intra-cluster similarity • Number of clusters K • Decision for clustering

  14. Graph-Based Similarity Representation (considering the data manifold): Geometry Distance vs. Relationship

  15. Spectral Clustering Example: Why does K-means fail? Geometry vs. Manifold

  16. Graph-Based Similarity Representation Distance vs. Relationship

  17. Graph-Based Similarity Representation Distance vs. Relationship

  18. Graph-Based Similarity Representation Distance vs. Relationship

  19. Graph-Based Similarity Representation Number of clusters matters

  20. Lecture Outline • Motivation • Graph overview and construction • Spectral Clustering • Cool implementations

  21. Graph-based Representation of Data Similarity (Relationship)

  22. Graph-based Representation of Data Similarity (Relationship)

  23. Graph-based Representation of Data Relationship

  24. Manifold (Shape of Data Distribution)

  25. Graph-based Representation of Data Relationships Manifold

  26. Graph-based Representation of Data Relationships

  27. Graph-based Representation of Data Relationships How to generate such graph for data relationship representation?

  28. Data Graph Construction
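
The construction details on this slide were in figures that the transcript does not preserve. One common choice, shown here purely as an assumption, is a symmetric k-nearest-neighbor graph whose edge weights come from a Gaussian similarity. The names `gaussian_similarity` and `knn_graph` and the parameters `k` and `sigma` are hypothetical.

```python
import math

def gaussian_similarity(p, q, sigma=1.0):
    """Similarity in (0, 1]: close points score near 1, distant ones near 0."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-sq_dist / (2 * sigma ** 2))

def knn_graph(points, k, sigma=1.0):
    """Weighted adjacency matrix linking each point to its k most similar points."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Rank all other points by similarity to point i, most similar first.
        ranked = sorted(
            ((gaussian_similarity(points[i], points[j], sigma), j)
             for j in range(n) if j != i),
            reverse=True,
        )
        for similarity, j in ranked[:k]:
            # Symmetrize: keep the edge if either endpoint selects the other.
            W[i][j] = similarity
            W[j][i] = similarity
    return W

graph_points = [(0, 0), (0, 1), (5, 5), (5, 6)]
graph_W = knn_graph(graph_points, k=1)
```

With `k=1` the four example points form two disconnected pairs, which is the kind of relationship structure a graph cut can later exploit. Larger `k` or `sigma` gives a denser graph.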

  29. Graph-based Representation of Data Relationships

  30. Graph-based Representation of Data Relationships

  31. Graph-based Representation of Data Relationships

  32. Graph-based Representation of Data Relationships

  33. Graph Cut

  34. Lecture Outline • Motivation • Graph overview and construction • Spectral Clustering: considering intra-cluster similarity and inter-cluster similarity jointly! • Cool implementations

  35. Key issues for Spectral Clustering Objective Function • Relationship function for Graph construction • Inter-cluster similarity or distance • Intra-cluster similarity or distance • Number of clusters K • Decision for data clustering

  36. How to Do Graph Partitioning? Citation Group Identification

  37. How to Do Graph Partitioning? Social Group Identification

  38. How to Do Graph Partitioning? Hot Topic Detection

  39. Graph-based Representation of Data Relationships

  40. Intra-cluster similarity

  41. Spectral Clustering cut Intra-Cluster Similarity: Inter-Cluster Similarity:
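
The formulas after the two labels on this slide are missing from the transcript. The definitions below are the usual graph-theoretic ones and are offered as an assumption about what was intended, not as a reconstruction of the slide. The symbols $w_{ij}$ (edge weight between nodes $i$ and $j$), $A$ and $B$ (the two clusters), and the name $\mathrm{assoc}$ are notation introduced here.

```latex
% Intra-cluster similarity: total weight of edges with both endpoints in A
\mathrm{assoc}(A, A) = \sum_{i \in A} \sum_{j \in A} w_{ij}

% Inter-cluster similarity: total weight of edges crossing from A to B
\mathrm{cut}(A, B) = \sum_{i \in A} \sum_{j \in B} w_{ij}
```

Under these definitions, a good partition has large $\mathrm{assoc}(A, A)$ and $\mathrm{assoc}(B, B)$ and small $\mathrm{cut}(A, B)$.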

  42. Spectral Clustering: Graph Cut. Objective function for spectral clustering: 1. Maximize intra-cluster similarity 2. Minimize inter-cluster similarity

  43. Spectral Clustering: Graph Cut. Objective function for spectral clustering: Min

  44. Spectral Clustering: Graph Cut. Clustering via graph cut at weak connection points: minimize inter-cluster similarity
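
A tiny worked example can show what cutting at weak connection points means. The sketch below assumes a hand-made graph of two tightly connected triangles joined by one weak edge, and it finds the minimum cut by brute force over all two-way partitions. This is only feasible for very small graphs and is not how spectral clustering solves the problem. The names `cut_weight` and `brute_force_min_cut` and the edge weights are illustrative.

```python
from itertools import combinations

def cut_weight(W, group_a):
    """Inter-cluster similarity: total weight of edges leaving group_a."""
    a = set(group_a)
    return sum(W[i][j] for i in a for j in range(len(W)) if j not in a)

def brute_force_min_cut(W):
    """Try every two-way split and return (cut value, one side of the split)."""
    n = len(W)
    best = None
    for size in range(1, n // 2 + 1):
        for group_a in combinations(range(n), size):
            value = cut_weight(W, group_a)
            if best is None or value < best[0]:
                best = (value, set(group_a))
    return best

# Two triangles {0, 1, 2} and {3, 4, 5} with strong internal edges (weight 1.0)
# joined by a single weak edge 2-3 (weight 0.1).
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
         (2, 3, 0.1)]
cut_W = [[0.0] * 6 for _ in range(6)]
for i, j, weight in edges:
    cut_W[i][j] = weight
    cut_W[j][i] = weight

min_cut_value, min_cut_side = brute_force_min_cut(cut_W)
```

Here the cheapest split removes only the weak edge between nodes 2 and 3, leaving each triangle intact. The number of candidate partitions grows exponentially with the number of nodes, which is one reason an eigenvector-based relaxation is attractive.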

  45. Inter-cluster similarity

  46. Inter-cluster similarity
