
Chapter 4 Clustering








  1. Chapter 4 Clustering

  2. Neural Network Based Approaches • Figure 5.12 Single artificial neuron with three inputs • Table 5.4 All possible input patterns from Figure 5.12

  3. Finding the output value

  4. Finding the Output Value w1 = 2, w2 = -4, w3 = 1. Table 5.5 Input patterns for the neural network
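The output computation on the slide above can be sketched in code. This is a minimal example assuming a simple threshold activation (the neuron fires when the weighted sum is positive); the slide gives only the weights, so the threshold is an assumption:

```python
from itertools import product

# Weights from the slide: w1 = 2, w2 = -4, w3 = 1.
w = [2, -4, 1]

# Enumerate all 2^3 binary input patterns, as in Table 5.4.
for x in product([0, 1], repeat=3):
    s = sum(wi * xi for wi, xi in zip(w, x))  # weighted sum of inputs
    y = 1 if s > 0 else 0                     # assumed threshold activation
    print(x, s, y)
```

For instance, pattern (1, 0, 1) gives a weighted sum of 2 + 0 + 1 = 3 and fires, while (1, 1, 1) gives 2 - 4 + 1 = -1 and does not.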

  5. Kohonen Self-Organizing Map • Invented by Teuvo Kohonen, a professor at the Academy of Finland, in 1979. • Provides a data visualization technique that helps make sense of high-dimensional data by reducing its dimensions to a map. • The SOM also embodies the clustering concept by grouping similar data together.

  6. Kohonen Self-Organizing Map SOM reduces data dimensions and displays similarities among data.

  7. Kohonen SOM • The self-organizing map describes a mapping from a higher-dimensional input space to a lower-dimensional map space. • To place a vector from data space onto the map, first find the node whose weight vector is closest to the vector taken from data space. • Once the closest node is located, it is assigned the values from the vector taken from the data space.
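The closest-node lookup described above (the Best Matching Unit step) can be sketched as follows; the 2x2 lattice and its weight vectors are made-up illustration values, not data from the text:

```python
import math

def best_matching_unit(nodes, x):
    """Return (position, weights) of the node whose weight vector
    is closest to input x by Euclidean distance."""
    return min(nodes.items(), key=lambda kv: math.dist(kv[1], x))

# Hypothetical 2x2 lattice; weight vectors are illustrative RGB values.
nodes = {(0, 0): [0.1, 0.9, 0.2],
         (0, 1): [0.8, 0.2, 0.1],
         (1, 0): [0.5, 0.5, 0.5],
         (1, 1): [0.9, 0.9, 0.9]}

bmu_pos, bmu_weights = best_matching_unit(nodes, [0.9, 0.1, 0.1])
print(bmu_pos)  # grid position of the node nearest the reddish input
```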

  8. How the algorithm works • Initialize the weights • Get the best matching unit • Scale the neighbors • Determine the neighbors • Learning

  9. Training a SOM Training occurs in several steps and over many iterations: • Each node's weights are initialized. • A vector is chosen at random from the set of training data and presented to the lattice. • Every node is examined to calculate which one's weights are most like the input vector. The winning node is commonly known as the Best Matching Unit (BMU). • The radius of the neighbourhood of the BMU is now calculated. This is a value that starts large, typically set to the 'radius' of the lattice, but diminishes each time step. Any nodes found within this radius are deemed to be inside the BMU's neighbourhood. • The weights of each neighbouring node (the nodes found in step 4) are adjusted to make them more like the input vector. The closer a node is to the BMU, the more its weights get altered. • Repeat from step 2 for N iterations.
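The shrinking neighbourhood radius in step 4 is often modelled as an exponential decay. A small sketch of that idea; the decay form and the constants below are common conventions, not values given in the text:

```python
import math

def radius(t, sigma0=3.0, n_iterations=100):
    """Neighbourhood radius at iteration t, decaying exponentially
    from the initial lattice 'radius' sigma0 toward ~1."""
    time_constant = n_iterations / math.log(sigma0)
    return sigma0 * math.exp(-t / time_constant)

print(radius(0))    # starts at the full lattice 'radius'
print(radius(100))  # has shrunk to ~1 by the last iteration
```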

  10. Components of a SOM 1. Data • Colors are represented in three dimensions (red, blue, and green). The idea of the self-organizing map is to project the n-dimensional data (here the data are colors, so n = 3) into something that can be better understood visually (in this case, a two-dimensional image map).

  11. Components of a SOM 2. Weight Vectors • Each weight vector has two components. • The first part of a weight vector is its data (R, G, B). • The second part of a weight vector is its natural location (x, y).

  12. The problem • With these two components (the data and the weight vectors), how can one order the weight vectors in such a way that they represent the similarities of the sample vectors?

  13. Algorithm
  Initialize the map
  For t from 0 to 1:
    Randomly select a sample
    Get the best matching unit
    Scale its neighbours
    Increase t a small amount
  End for

  14. Scaling & Neighborhood function Equation (5.1): wi(t+1) = wi(t) + hck(t) [x(t) - wi(t)]
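Equation (5.1) can be turned into a short sketch. Here hck(t) is taken as a decaying learning rate times a Gaussian in the node's grid distance from the BMU; that form and the constants are common conventions, not values given in the text:

```python
import math

def update_weight(w_i, x, dist_to_bmu, t, lr0=0.5, sigma0=2.0, tau=50.0):
    """Apply w_i(t+1) = w_i(t) + h_ck(t) [x(t) - w_i(t)].

    h_ck(t) is modelled as a decaying learning rate times a Gaussian
    in the node's grid distance from the BMU (an assumed, common form).
    """
    lr = lr0 * math.exp(-t / tau)        # learning rate decays over time
    sigma = sigma0 * math.exp(-t / tau)  # neighbourhood width shrinks too
    h = lr * math.exp(-dist_to_bmu ** 2 / (2 * sigma ** 2))
    return [wi + h * (xi - wi) for wi, xi in zip(w_i, x)]

# The BMU itself (grid distance 0) moves halfway toward the input at t = 0,
# since h reduces to the initial learning rate of 0.5 there:
w_new = update_weight([0.2, 0.2, 0.2], [1.0, 0.0, 0.0], dist_to_bmu=0, t=0)
print(w_new)
```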

  15. Numerical Demonstration • Figure 5.14 SOM • Table 5.6 Values of the weights for SOM


  19. Text Clustering • An increasingly popular technique for grouping similar documents • A search engine may retrieve documents and present them as groups • For example, the keyword “Cricket” may retrieve two clusters: • one related to the sport of cricket • one related to the cricket insect
