Recent Trends in Fuzzy Clustering: From Data to Knowledge

Recent Trends in Fuzzy Clustering: From Data to Knowledge. Witold Pedrycz Department of Electrical & Computer Engineering University of Alberta, Edmonton, Canada and Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland. pedrycz@ee.ualberta.ca. Shenyang, August 2009.

Presentation Transcript


  1. Recent Trends in Fuzzy Clustering: From Data to Knowledge. Witold Pedrycz, Department of Electrical & Computer Engineering, University of Alberta, Edmonton, Canada, and Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland. pedrycz@ee.ualberta.ca. Shenyang, August 2009

  2. Agenda: Introduction (clustering, information granulation and paradigm shift); Key challenges in clustering; Fuzzy objective-based clustering; Knowledge-based augmentation of fuzzy clustering; Collaborative fuzzy clustering; Concluding comments

  3. Clustering • Areas of research and applications: • Data analysis • Modeling • Structure determination • Google Scholar: 2,190,000 hits for “clustering” (as of August 6, 2009)

  4. Clustering as a conceptual and algorithmic framework of information granulation Data information granules (clusters) abstraction of data Formalism of: set theory (K-Means) fuzzy sets (FCM) rough sets shadowed sets

  5. Main categories of clustering: Graph-oriented and hierarchical (single linkage, complete linkage, average linkage, ...); Objective function-based clustering; Diversity of formalisms and optimization tools (e.g., methods of Evolutionary Computing)

  6. Key challenges of clustering Data-driven methods Selection of distance function (geometry of clusters) Number of clusters Quality of clustering results

  7. The dichotomy and the shift of paradigm supervised learning unsupervised learning

  8. Fuzzy C-Means (FCM)

  9. Fuzzy Clustering: Fuzzy C-Means (FCM) Given data x1, x2, …, xN, determine its structure by forming a collection of information granules – fuzzy sets Objective function Minimize Q; structure in data (partition matrix and prototypes)
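The objective function shown on this slide did not survive in the transcript. As a hedged reminder, the standard FCM objective (assumed here, not recovered from the slide) for c clusters, N data points and a fuzzification coefficient m > 1 reads:

```latex
Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \lVert x_k - v_i \rVert^{2},
\qquad
\sum_{i=1}^{c} u_{ik} = 1 \ \text{for each } k,
\qquad
0 < \sum_{k=1}^{N} u_{ik} < N \ \text{for each } i .
```

Here u_ik is the membership of x_k in the i-th cluster and v_i is the i-th prototype, matching the partition matrix U and the prototypes named in the transcript.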

  10. Fuzzy Clustering: Fuzzy C-Means (FCM). v_i: prototypes; U: partition matrix

  11. FCM – optimization Minimize subject to (a) prototypes (b) partition matrix

  12. Optimization: details. Partition matrix: the use of Lagrange multipliers; d_ik = ||x_k - v_i||^2; λ: Lagrange multiplier

  13. Optimization – partition matrix (1)

  14. Optimization- prototypes (2) Euclidean distance Gradient of Q with respect to vs
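The formulas labelled (1) and (2) on slides 13 and 14 are missing from the transcript. The following is a sketch of the standard derivation, assuming the usual FCM objective Q = Σ_i Σ_k u_ik^m d_ik with d_ik = ||x_k - v_i||^2 as defined on slide 12; the sign convention for λ is a choice made here and may differ from the slides.

```latex
\begin{align*}
% Partition matrix: for a fixed k, augment the objective with the constraint sum_i u_ik = 1
V_k &= \sum_{i=1}^{c} u_{ik}^{m} d_{ik} + \lambda \Bigl( \sum_{i=1}^{c} u_{ik} - 1 \Bigr), \\
\frac{\partial V_k}{\partial u_{sk}} &= m\, u_{sk}^{m-1} d_{sk} + \lambda = 0
  \;\Longrightarrow\;
  u_{sk} = \Bigl( -\frac{\lambda}{m} \Bigr)^{\frac{1}{m-1}} d_{sk}^{-\frac{1}{m-1}}, \\
\sum_{j=1}^{c} u_{jk} = 1
  &\;\Longrightarrow\;
  u_{sk} = \frac{1}{\sum_{j=1}^{c} \bigl( d_{sk} / d_{jk} \bigr)^{\frac{1}{m-1}}} \tag{1} \\
% Prototypes: Euclidean distance, gradient of Q with respect to v_s
\nabla_{v_s} Q &= -2 \sum_{k=1}^{N} u_{sk}^{m} (x_k - v_s) = 0
  \;\Longrightarrow\;
  v_s = \frac{\sum_{k=1}^{N} u_{sk}^{m} x_k}{\sum_{k=1}^{N} u_{sk}^{m}} \tag{2}
\end{align*}
```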

  15. Fuzzy C-Means (FCM): An overview
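The overview slide carries no text in the transcript. As an illustration only, the iterative scheme (alternate between updating prototypes and updating the partition matrix) can be sketched in NumPy as follows. The function name `fcm`, its parameters, the random initialization and the stopping rule are assumptions made for this sketch, not details taken from the presentation.

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Minimal FCM sketch. X has shape (N, n); returns U (c, N) and V (c, n)."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    # Random initial partition matrix whose columns sum to 1.
    U = rng.random((c, N))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Prototype update: weighted means of the data.
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Squared Euclidean distances d_ik, shape (c, N); floored to avoid division by zero.
        d = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
        d = np.maximum(d, 1e-12)
        # Partition update: u_ik = 1 / sum_j (d_ik / d_jk)^(1/(m-1)).
        inv = d ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V

# Usage on synthetic data: two well-separated groups in the plane.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(5.0, 0.3, (50, 2))])
U, V = fcm(X, c=2)
```

Each column of `U` sums to 1, and on data like this the two rows of `V` settle near the two group centres.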

  16. Geometry of information granules n=1 m =1.2 m =2.0 m =3.5
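The plots for this slide are not in the transcript. To give a feel for how the fuzzification coefficient m shapes the membership functions in the one-dimensional case (n = 1), the sketch below evaluates the standard FCM membership formula for fixed prototypes and the three values of m listed on the slide. The prototype positions and the helper name `memberships` are hypothetical.

```python
import numpy as np

def memberships(x, prototypes, m):
    """FCM memberships of scalar points x for fixed scalar prototypes; shape (c, len(x))."""
    d = (x[None, :] - prototypes[:, None]) ** 2      # squared distances
    d = np.maximum(d, 1e-12)                         # guard against division by zero
    inv = d ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

prototypes = np.array([1.0, 4.0, 6.0])   # hypothetical prototype positions
x = np.array([2.0])                      # a point between the first two prototypes

results = {}
for m in (1.2, 2.0, 3.5):
    # Membership of x = 2.0 in the cluster whose prototype sits at 1.0.
    results[m] = memberships(x, prototypes, m)[0, 0]
    print(m, round(float(results[m]), 3))
```

For m close to 1 the membership is nearly Boolean; as m grows, the same point is shared more evenly among the clusters.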

  17. Domain Knowledge: Category of knowledge-oriented guidance Partially labeled data: some data are provided with labels (classes) Proximity knowledge: some pairs of data are quantified in terms of their proximity (closeness) Viewpoints: some structural information is provided Context-based guidance: clustering realized in a certain context specified with regard to some attribute

  18. Clustering with domain knowledge (Knowledge-based clustering)

  19. Context-based fuzzy clustering

  20. Context-based clustering. To align the agenda of fuzzy clustering with the principles of fuzzy modeling, the following features are considered: Active role of the designer [customization of the model]; The structural backbone of the model is fully reflective of relationships between information granules in the input and output space. Clustering: construct clusters in input space X. Context-based clustering: construct clusters in input space X given some context expressed in output space Y

  21. Context-based clustering: Computing considerations. [Diagram: data → structure, contrasted with data + context → structure] • computationally more efficient, • well-focused, • designer-guided clustering process

  22. Context-based clustering Context-based Clustering : construct clusters in input space X given some context expressed in output space Y Context – hint (piece of domain knowledge) provided by designer who actively impacts the development of the model

  23. Context-based clustering: Context design. Context: hint (piece of domain knowledge) provided by the designer, who actively impacts the development of the model. As such, context is imposed by the designer at the beginning. Realization of context: Designer → focus → information granule (fuzzy set); (a) Designer, and (b) clustering of scalar data in output space. Context: fuzzy set (set) formed in the output space

  24. Context-based clustering: Modeling Determine structure in input space given the output is high Determine structure in input space given the output is medium Determine structure in input space given the output is low Input space (data)

  25. Context-based clustering: examples. Find a structure of customer data [clustering] (no context). Find a structure of customer data considering customers making weekly purchases in the range [$1,000, $3,000] (context). Find a structure of customer data considering customers making weekly purchases at the level of around $2,500 (context). Find a structure of customer data considering customers making significant weekly purchases who are young (compound context)

  26. Context-oriented FCM. Data (x_k, target_k), k=1,2,…,N. Contexts: fuzzy sets W_1, W_2, …, W_p. w_jk = W_j(target_k): membership of the k-th data point in the j-th context. Context-driven partition matrix
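As an illustration of how the context memberships w_jk = W_j(target_k) might be computed, the sketch below uses three triangular fuzzy sets ("low", "medium", "high", echoing slide 24) over a scalar output. The triangular shape, the parameter values and the sample targets are assumptions made for this example.

```python
import numpy as np

def triangular(y, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    y = np.asarray(y, dtype=float)
    left = (y - a) / (b - a)
    right = (c - y) / (c - b)
    return np.clip(np.minimum(left, right), 0.0, 1.0)

# Hypothetical output values target_k and hypothetical context parameters (a, b, c).
targets = np.array([0.0, 2.5, 5.0, 7.5, 10.0])
contexts = {
    "low": (-5.0, 0.0, 5.0),
    "medium": (0.0, 5.0, 10.0),
    "high": (5.0, 10.0, 15.0),
}

# W[j, k] = membership of the k-th target in the j-th context.
W = np.vstack([triangular(targets, *abc) for abc in contexts.values()])
print(W)
```

Each row of `W` then plays the role of the vector w_j that constrains the partition matrix for that context.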

  27. Context-oriented FCM: Optimization flow Objective function Subject to constraint U in U(Wj) Iterative adjustment of partition matrix and prototypes
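A minimal sketch of this optimization flow for a single context follows. It assumes the constraint U in U(W_j) means that the memberships of each data point sum to its context membership w_k rather than to 1, which scales the usual FCM partition update by w_k while leaving the prototype update unchanged. The function name `context_fcm`, the fixed iteration count and the synthetic data are hypothetical.

```python
import numpy as np

def context_fcm(X, w, c, m=2.0, n_iter=100, seed=0):
    """Context-based FCM sketch for one context; w[k] is the context membership of x_k."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    # Initial partition matrix whose k-th column sums to w[k].
    U = rng.random((c, N))
    U = w * U / U.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Prototype update as in FCM; points with w[k] = 0 contribute nothing.
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)
        d = np.maximum(((X[None] - V[:, None]) ** 2).sum(axis=2), 1e-12)
        inv = d ** (-1.0 / (m - 1.0))
        # Partition update scaled by the context: u_ik = w_k / sum_j (d_ik / d_jk)^(1/(m-1)).
        U = w * inv / inv.sum(axis=0, keepdims=True)
    return U, V

# Usage: 40 points inside the context (w = 1) and 10 distant points outside it (w = 0).
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 1.0, (40, 2)), rng.normal(100.0, 1.0, (10, 2))])
w = np.concatenate([np.ones(40), np.zeros(10)])
U, V = context_fcm(X, w, c=2)
```

Because the distant points have zero context membership, they do not attract any prototype; this is the "well-focused" behaviour mentioned on slide 21.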

  28. Fuzzy clustering with viewpoints

  29. Viewpoints: definition Description of entity (concept) which is deemed essential in describing phenomenon (system) and helpful in casting an overall analysis in a required setting “external” , “reinforced” clusters

  30. Viewpoints: definition viewpoint (a,b) viewpoint (a,?)

  31. Viewpoints: definition Description of entity (concept) which is deemed essential in describing phenomenon (system) and helpful in casting an overall analysis in a required setting “external” , “reinforced” clusters

  32. Viewpoints: definition viewpoint (a,b) viewpoint (a,?)

  33. Viewpoints in fuzzy clustering B- Boolean matrix characterizing structure: viewpoints prototypes (induced by data)
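One way to read the role of the Boolean matrix B is as a mask that decides, coordinate by coordinate, whether a prototype entry is fixed by a viewpoint or induced by the data. The sketch below illustrates that reading only; the matrix names `F`, `V` and `G` and all numeric values are hypothetical, and the slide's own formulas are not in the transcript.

```python
import numpy as np

# B[i, j] is True where the j-th coordinate of the i-th prototype is fixed by a viewpoint.
B = np.array([[True, True],     # viewpoint (a, b): both coordinates fixed
              [True, False],    # viewpoint (a, ?): only the first coordinate fixed
              [False, False]])  # ordinary prototype, fully induced by data

# F holds the viewpoint coordinates (entries where B is False are ignored).
F = np.array([[2.0, 3.0],
              [2.0, 0.0],
              [0.0, 0.0]])

# V holds data-induced prototype coordinates, e.g. from an FCM-style update.
V = np.array([[9.0, 9.0],
              [5.0, 6.0],
              [7.0, 8.0]])

# Effective prototypes: viewpoint value where B is True, data-induced value otherwise.
G = np.where(B, F, V)
print(G)
```

In an iterative scheme built on this reading, only the entries of `V` where `B` is False would be updated from the data.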

  34. Viewpoints in fuzzy clustering

  35. Viewpoints in fuzzy clustering B- Boolean matrix characterizing structure: viewpoints prototypes (induced by data)

  36. Viewpoints in fuzzy clustering

  37. Fuzzy clustering with partial supervision

  38. Labelled data and their description. Characterization in terms of membership degrees: F = [f_ik], i=1,2,…,c, k=1,2,…,N, and supervision indicator b = [b_k], k=1,2,…,N

  39. Augmented objective function b > 0
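The augmented objective itself is missing from the transcript. A form often used for partially supervised FCM adds, to the usual term with m = 2, a second term that penalizes disagreement between the memberships u_ik and the labels f_ik for the labelled points (b_k = 1). The sketch below computes that form; it is an assumption, and the positive weight is called `alpha` here purely to avoid clashing with the supervision indicator b.

```python
import numpy as np

def augmented_objective(X, V, U, F, b, alpha):
    """Assumed form: sum u_ik^2 d_ik + alpha * sum (u_ik - f_ik * b_k)^2 d_ik."""
    d2 = ((X[None] - V[:, None]) ** 2).sum(axis=2)   # squared distances, shape (c, N)
    unsupervised = (U ** 2 * d2).sum()
    supervised = ((U - F * b) ** 2 * d2).sum()       # b broadcasts over clusters
    return unsupervised + alpha * supervised

# Tiny hypothetical example: two scalar points, two prototypes, crisp memberships.
X = np.array([[0.0], [1.0]])
V = np.array([[0.0], [1.0]])
U = np.array([[1.0, 0.0], [0.0, 1.0]])
F = np.array([[0.0, 1.0], [1.0, 0.0]])   # labels that disagree with U
b_labelled = np.array([1.0, 0.0])        # only the first point is labelled
b_none = np.array([0.0, 0.0])            # no supervision at all

q_labelled = augmented_objective(X, V, U, F, b_labelled, alpha=2.0)
q_none = augmented_objective(X, V, U, F, b_none, alpha=2.0)
print(q_labelled, q_none)
```

With no labelled points the second term reduces to a rescaled copy of the first, so the labels only steer the result where b_k = 1.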

  40. Fuzzy clustering with proximity hints

  41. Proximity hints Prox(k,l) Prox(s,t) Characterization in terms of proximity degrees: Prox(k, l), k, l=1,2, …., N and supervision indicator matrix B = [bkl], k, l=1,2,…, N

  42. Proximity measure • Properties of proximity: • Prox(k, k) =1 • Prox(k,l) = Prox(l,k) Proximity induced by partition matrix U:
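The formula for the proximity induced by U is missing after the colon. One commonly used definition that satisfies both listed properties, assumed here, is Prox(k, l) = Σ_i min(u_ik, u_il): it is symmetric by construction and equals 1 on the diagonal because each column of U sums to 1. The function name and the example matrix are hypothetical.

```python
import numpy as np

def induced_proximity(U):
    """Prox[k, l] = sum_i min(u_ik, u_il) for a partition matrix U of shape (c, N)."""
    return np.minimum(U[:, :, None], U[:, None, :]).sum(axis=0)

# Hypothetical partition matrix: 2 clusters, 3 data points, columns sum to 1.
U = np.array([[0.9, 0.2, 0.5],
              [0.1, 0.8, 0.5]])
P = induced_proximity(U)
print(P)
```

Points with similar membership profiles receive a proximity close to 1, and points assigned to different clusters receive a value close to 0.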

  43. Augmented objective function b > 0

  44. Fuzzy clustering with collaboration mechanisms

  45. Two general development strategies SELECTION OF A “MEANINGFUL” SUBSET OF INFORMATION GRANULES

  46. Two general development strategies (1) HIERARCHICAL DEVELOPMENT OF INFORMATION GRANULES (INFORMATION GRANULES OF HIGHER TYPE) Information granules Type-2 Information granules Type-1

  47. Two general development strategies (2) HIERARCHICAL DEVELOPMENT OF INFORMATION GRANULES AND THE USE OF VIEWPOINTS viewpoints Information granules Type-2 Information granules Type-1

  48. Two general development strategies (3) HIERARCHICAL DEVELOPMENT OF INFORMATION GRANULES – A MODE OF SUCCESSIVE CONSTRUCTION

  49. Information granules and their representatives. Represent vk[ii] with the use of z1, z2, …, zc F Fii

  50. Representation of fuzzy sets: two performance measures. Entropy measure; Reconstruction criterion (error)
