KI2 - 7

Presentation Transcript


KI2 - 7

Clustering Algorithms

Johan Everts

Kunstmatige Intelligentie / RuG


What is Clustering?

Find K clusters (or a classification that consists of K clusters) so that the objects of one cluster are similar to each other whereas objects of different clusters are dissimilar. (Bacher 1996)


The Goals of Clustering

  • Determine the intrinsic grouping in a set of unlabeled data.

  • What constitutes a good clustering?

  • All clustering algorithms will produce clusters, regardless of whether the data contains them

  • There is no gold standard; what counts as a good clustering depends on the goal:

    • data reduction

    • “natural clusters”

    • “useful” clusters

    • outlier detection


Stages in Clustering


Taxonomy of Clustering Approaches


Hierarchical Clustering

Agglomerative clustering treats each data point as a singleton cluster, and then successively merges clusters until all points have been merged into a single remaining cluster. Divisive clustering works the other way around.


Agglomerative Clustering

Single link

In single-link hierarchical clustering, we merge in each step the two clusters whose two closest members have the smallest distance.


Agglomerative Clustering

Complete link

In complete-link hierarchical clustering, we merge in each step the two clusters whose merger has the smallest diameter.
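The two linkage criteria can be sketched in a few lines of plain Python. This is an illustrative sketch, not code from the slides: the point data, the `dist` helper, and the `agglomerate` function name are all invented for the example.

```python
# Hedged sketch of agglomerative clustering with single- and complete-link
# merge criteria; the data and names are illustrative assumptions.
from itertools import combinations

def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def agglomerate(points, k, linkage="single"):
    """Start from singleton clusters and merge until k clusters remain."""
    clusters = [[p] for p in points]          # each point is its own cluster
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            pair_dists = [dist(a, b) for a in clusters[i] for b in clusters[j]]
            # single link: distance of the two closest members
            # complete link: distance of the two farthest members (diameter)
            d = min(pair_dists) if linkage == "single" else max(pair_dists)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge the winning pair
        del clusters[j]
    return clusters

pts = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(sorted(len(c) for c in agglomerate(pts, 3, "single")))   # cluster sizes
```

On this toy data both criteria merge the two tight pairs first and leave the outlier alone; the criteria diverge on elongated clusters, where single link chains and complete link does not.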


Example – Single Link AC

(A sequence of slides steps through the single-link merges one at a time; the figures are not reproduced in this transcript.)


Taxonomy of Clustering Approaches


Square Error
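The slide's figure is not reproduced here; the square-error criterion that partitional methods such as K-Means minimize is usually written as the sum of squared distances from each pattern to the centroid of its cluster:

```latex
e^2 = \sum_{k=1}^{K} \sum_{x \in C_k} \lVert x - \mu_k \rVert^2
```

where $C_k$ is the $k$-th cluster and $\mu_k$ its centroid.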


K-Means

  • Step 0: Start with a random partition into K clusters

  • Step 1: Generate a new partition by assigning each pattern to its closest cluster center

  • Step 2: Compute new cluster centers as the centroids of the clusters.

  • Step 3: Repeat Steps 1 and 2 until the membership no longer changes (the cluster centers then also remain the same)
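The four steps above can be sketched in plain Python. This is a minimal illustration, not the slides' code: the sample data, the fixed seed, and the empty-cluster guard are all assumptions added for the example.

```python
# Hedged sketch of the K-Means steps above; data and names are illustrative.
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, seed=0):
    rng = random.Random(seed)
    # Step 0: random partition into K clusters (round-robin after a shuffle,
    # so no cluster starts empty)
    order = points[:]
    rng.shuffle(order)
    labels = {p: i % k for i, p in enumerate(order)}
    while True:
        # Step 2: compute cluster centers as the centroids of the clusters
        centers = []
        for c in range(k):
            members = [p for p in points if labels[p] == c]
            if members:
                centers.append(tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                centers.append(rng.choice(points))  # guard: re-seed an empty cluster
        # Step 1: assign each pattern to its closest cluster center
        new_labels = {p: min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points}
        # Step 3: stop when the membership no longer changes
        if new_labels == labels:
            return centers, labels
        labels = new_labels

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(pts, k=2)
```

On this data the two tight groups are recovered regardless of the random initial partition; on harder data K-Means can converge to a poor local optimum, which is why K and the initialization matter.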



K-Means – How many K’s?


Locating the ‘knee’

The knee of a curve is defined as the point of maximum curvature.


Leader - Follower

  • Online

  • Specify threshold distance

  • Find the closest cluster center

    • Distance above threshold? Create a new cluster

    • Otherwise, add the instance to the cluster
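The online rule can be sketched as follows; the 1-D sample stream, the threshold value, and the learning rate `eta` (which controls how far the winning center moves toward each new instance) are invented for the example.

```python
# Hedged sketch of online leader-follower clustering; parameters are
# illustrative assumptions, not values from the slides.
def leader_follower(stream, threshold, eta=0.1):
    centers = []
    for x in stream:
        # find the closest existing cluster center (if any)
        best = min(range(len(centers)),
                   key=lambda i: abs(x - centers[i]), default=None)
        if best is None or abs(x - centers[best]) > threshold:
            centers.append(x)                           # new cluster
        else:
            centers[best] += eta * (x - centers[best])  # update the winner
    return centers

print(leader_follower([0.0, 0.2, 5.0, 5.1, 0.1], threshold=1.0))
```

Because the data arrive one at a time and each point is seen once, the result depends on presentation order as well as on the threshold, which is the source of the instability mentioned later in the performance analysis.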


Leader - Follower

  • Find the closest cluster center

    • Distance above threshold? Create a new cluster

    • Otherwise, add the instance to the cluster and update the cluster center

(The following slides illustrate the two cases, distance < threshold and distance > threshold; the figures are not reproduced in this transcript.)


Kohonen SOM’s

The Self-Organizing Map (SOM) is an unsupervised artificial neural network algorithm. It is a compromise between biological modeling and statistical data processing.


Kohonen SOM’s

  • Each weight is representative of a certain input.

  • Input patterns are shown to all neurons simultaneously.

  • Competitive learning: the neuron with the largest response is chosen.


Kohonen SOM’s

  • Initialize weights

  • Repeat until convergence

    • Select next input pattern

    • Find Best Matching Unit

    • Update weights of winner and neighbours

    • Decrease learning rate & neighbourhood size

Learning rate & neighbourhood size
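The training loop above can be sketched for a toy 1-D map with scalar inputs. The grid size, the decay schedules, and the input data are all illustrative assumptions; a real SOM typically uses a 2-D grid and vector inputs.

```python
# Hedged toy sketch of SOM training on a 1-D grid of units; all
# hyperparameters below are illustrative assumptions.
import math
import random

def train_som(data, n_units=10, epochs=50, seed=0):
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_units)]   # initialize weights
    lr, radius = 0.5, n_units / 2.0
    for _ in range(epochs):
        for x in data:                                 # select next input pattern
            # find the Best Matching Unit (largest response = closest weight)
            bmu = min(range(n_units), key=lambda i: abs(x - weights[i]))
            for i in range(n_units):
                # update winner and neighbours, weighted by grid distance
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                weights[i] += lr * h * (x - weights[i])
        # decrease learning rate and neighbourhood size
        lr *= 0.95
        radius = max(radius * 0.95, 0.5)
    return sorted(weights)

w = train_som([0.05, 0.1, 0.5, 0.9, 0.95])
```

Shrinking the neighbourhood over time is what produces the self-ordering: early updates drag whole regions of the map together, late updates fine-tune individual units.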


Kohonen SOM’s

Distance related learning




Some Nice Illustrations


Kohonen SOM’s

  • Kohonen SOM Demo (from ai-junkie.com):

    mapping a 3D colour space onto a 2D Kohonen map


Performance Analysis

  • K-Means

    • Depends a lot on a priori knowledge (K)

    • Very Stable

  • Leader Follower

    • Depends a lot on a priori knowledge (Threshold)

    • Faster but unstable


Performance Analysis

  • Self Organizing Map

    • Stability and Convergence Assured

      • Principle of self-ordering

    • Slow and many iterations needed for convergence

      • Computationally intensive


Conclusion

  • No Free Lunch theorem

    • Any elevated performance over one class of problems is exactly paid for in performance over another class

  • Ensemble clustering?

    • Use a SOM and the basic Leader-Follower to identify clusters, then use K-Means clustering to refine.


Any Questions?

