
Project Presentation


Presentation Transcript


  1. Project Presentation Arpan Maheshwari, Y7082, CSE, arpanm@iitk.ac.in Supervisor: Prof. Amitav Mukerjee, Madan M Dabbeeru

  2. Unsupervised Clustering Algorithms: A Comparative Study

  3. Clustering: • Organising a collection of k-dimensional vectors into groups whose members share similar features in some way. • Reduces a large amount of data by categorising it into a smaller set of similar items. • Clustering is different from classification: it groups data without predefined class labels or labelled training examples.

  4. Elements of Clustering: • Cluster: an ordered list of objects sharing some similarities. • Distance between two clusters: implementation dependent, e.g. a Minkowski metric. • Similarity: a function SIMILAR(Di, Dj) returning 0 for no agreement and 1 for perfect agreement. • Threshold: the lowest similarity value required to join two objects in a cluster.
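As a concrete illustration of the distance and similarity notions above, here is a minimal Python sketch; the function names and the mapping from distance to a [0, 1] similarity score are illustrative choices, not taken from the slides.

import numpy as np

def minkowski_distance(x, y, p=2):
    """Minkowski metric between two k-dimensional vectors.
    p=1 gives Manhattan distance, p=2 gives Euclidean distance."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

def similar(di, dj, p=2):
    """Map a distance to a similarity in [0, 1]:
    0 = no agreement, 1 = perfect agreement (illustrative mapping)."""
    return 1.0 / (1.0 + minkowski_distance(di, dj, p))

# Two objects join a cluster only if their similarity reaches a threshold.
THRESHOLD = 0.5
print(similar([0, 0], [1, 1]) >= THRESHOLD)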

  5. Possible Applications: • Marketing • Biology & Medical Sciences • Libraries • Insurance • City Planning • WWW

  6. Growing Neural Gas (GNG) • Proposed by Bernd Fritzke • Parameters are constant in time • Incremental • Adaptive • Uses Competitive Hebbian Learning

  7. Parameters in GNG: • e_b: learning rate of the winner node • e_n: learning rate of the winner's neighbours • lambda: number of input signals between insertions of a new node • alpha: error decrement applied to the two nodes adjacent to a newly inserted node • beta: error decrement applied to all nodes • max_age: maximum age an edge may reach before it is deleted (used in the algorithm below)
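For concreteness, these parameters can be gathered in one place. The sketch below is illustrative; the default values are roughly those reported in the GNG literature (e.g. Fritzke's experiments), not values stated on the slide.

from dataclasses import dataclass

@dataclass
class GNGParams:
    """GNG parameters from slide 7; the defaults are illustrative values."""
    e_b: float = 0.2        # learning rate of the winner node
    e_n: float = 0.006      # learning rate of the winner's neighbours
    lambda_: int = 100      # insert a new node every lambda_ input signals
    alpha: float = 0.5      # error decrement of the two nodes adjacent to an insertion
    beta: float = 0.0005    # error decrement applied to all nodes
    max_age: int = 50       # edges older than this are deleted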

  8. Algorithm: • Initialise a set A with two nodes chosen randomly according to the probability distribution p(ξ). • Generate an input signal ξ according to p(ξ). • Determine the winner node s1 and the second-nearest node s2, both in A. • Create an edge between s1 and s2 if it does not exist; set its age to 0. • Increase the error of s1 by the distance between ξ and s1. • Move s1 and its neighbours towards the input signal by fractions e_b and e_n, respectively, of the difference between their positions and ξ. • Increment the age of all edges emanating from s1. • Delete all edges with age >= max_age; delete nodes left with no edges. • If the number of input signals generated so far is a multiple of λ, insert a new node r: a) find the node q with the largest error and the neighbour f of q with the largest error; b) place r at the mean position of q and f, with error_r = (error_q + error_f)/2; c) decrement error_q -= α·error_q and error_f -= α·error_f; d) add r to A. • Decrease the error of all nodes: error_i -= β·error_i. (A Python sketch of these steps is given below.)
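The steps above translate into the compact Python sketch below. All data structures and helper names are my own; a 2-D uniform distribution stands in for p(ξ), and the edge rewiring on insertion follows Fritzke's original formulation, which the slide does not spell out.

import numpy as np

# Illustrative parameter values (see slide 7); the slides do not prescribe numbers.
E_B, E_N, LAMBDA, ALPHA, BETA, MAX_AGE = 0.2, 0.006, 100, 0.5, 0.0005, 50

def sample_signal(rng):
    """Stand-in for the input distribution p(xi): uniform on the unit square."""
    return rng.random(2)

def gng(n_signals=10_000, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: initialise A with two nodes drawn from p(xi).
    nodes = [sample_signal(rng), sample_signal(rng)]   # node positions
    errors = [0.0, 0.0]                                # accumulated error per node
    edges = {}                                         # {frozenset({i, j}): age}

    def neighbours(i):
        return [j for e in edges for j in e if i in e and j != i]

    for t in range(1, n_signals + 1):
        xi = sample_signal(rng)                        # Step 2: input signal
        # Step 3: winner s1 and second-nearest node s2.
        dists = [np.linalg.norm(xi - w) for w in nodes]
        order = np.argsort(dists)
        s1, s2 = int(order[0]), int(order[1])
        edges[frozenset((s1, s2))] = 0                 # Step 4: (re)create edge, age 0
        errors[s1] += dists[s1]                        # Step 5: accumulate winner error
        # Step 6: move winner and its neighbours towards xi.
        nodes[s1] = nodes[s1] + E_B * (xi - nodes[s1])
        for j in neighbours(s1):
            nodes[j] = nodes[j] + E_N * (xi - nodes[j])
        # Step 7: age all edges emanating from s1.
        for e in edges:
            if s1 in e:
                edges[e] += 1
        # Step 8: drop edges older than MAX_AGE (isolated-node removal omitted
        # here to keep index bookkeeping simple).
        edges = {e: a for e, a in edges.items() if a <= MAX_AGE}
        # Step 9: every LAMBDA signals, insert a new node r between the
        # highest-error node q and its highest-error neighbour f.
        if t % LAMBDA == 0:
            q = int(np.argmax(errors))
            nb = neighbours(q)
            if nb:
                f = max(nb, key=lambda j: errors[j])
                r = len(nodes)
                nodes.append(0.5 * (nodes[q] + nodes[f]))     # mean position of q and f
                errors.append(0.5 * (errors[q] + errors[f]))  # error_r = (error_q + error_f) / 2
                errors[q] -= ALPHA * errors[q]
                errors[f] -= ALPHA * errors[f]
                # Rewire q-f through r, as in Fritzke's original formulation.
                edges.pop(frozenset((q, f)), None)
                edges[frozenset((q, r))] = 0
                edges[frozenset((f, r))] = 0
        # Step 10: decay the error of all nodes.
        errors = [err - BETA * err for err in errors]
    return np.array(nodes), edges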

  9. Demo of GNG • Reference: http://homepages.feis.herts.ac.uk/~nngroup/software.php

  10. DBSCAN: Density-Based Spatial Clustering of Applications with Noise • Proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996. • Finds clusters starting from an estimated density. • Two parameters: epsilon (eps) and the minimum number of points (minPts). • eps can be estimated (see the sketch below).
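One common heuristic for the "eps can be estimated" remark is the sorted k-distance plot: compute each point's distance to its k-th nearest neighbour (k close to minPts) and look for a knee in the sorted curve. The sketch below is an illustrative implementation of that heuristic, not material from the slides.

import numpy as np

def k_distance_plot(points, k=4):
    """Return the sorted distance of every point to its k-th nearest neighbour.
    Plotting these values and picking the 'knee' is a common heuristic for eps."""
    pts = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances (fine for small data sets).
    diff = pts[:, None, :] - pts[None, :, :]
    dists = np.sqrt((diff ** 2).sum(-1))
    kth = np.sort(dists, axis=1)[:, k]   # column 0 is each point's distance to itself
    return np.sort(kth)[::-1]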

  11. Algorithm • Reference: slides by Francesco Satini, PhD student, IMT (a compact sketch is given below).
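Since the slide defers the algorithm itself to external material, a compact sketch of standard DBSCAN is included here for reference; helper names and the labelling convention are my own.

import numpy as np

NOISE, UNVISITED = -1, 0

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (1, 2, ...) or NOISE (-1)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    labels = np.full(n, UNVISITED)

    def region_query(i):
        # All points within eps of point i (including i itself).
        return np.where(np.linalg.norm(pts - pts[i], axis=1) <= eps)[0]

    cluster_id = 0
    for i in range(n):
        if labels[i] != UNVISITED:
            continue
        nbrs = region_query(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE                  # may later be claimed as a border point
            continue
        cluster_id += 1                        # i is a core point: start a new cluster
        labels[i] = cluster_id
        seeds = list(nbrs)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id         # border point
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster_id
            j_nbrs = region_query(j)
            if len(j_nbrs) >= min_pts:         # j is also a core point: expand from it
                seeds.extend(j_nbrs)
    return labels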

  12. Comparing GNG & DBSCAN • Time complexity • Capability of handling high-dimensional data • Performance • Number of initial parameters • Performance with moving data

  13. Data to be used • Mainly design data

  14. References: • Jim Holmström: Growing Neural Gas: Experiments with GNG, GNG with Utility and Supervised GNG. Master's thesis. • M. Ester, H.-P. Kriegel, J. Sander, X. Xu: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining, 1996. • Competitive learning: http://homepages.feis.herts.ac.uk/~nngroup/software.php • www.utdallas.edu/~lkhan/Spring2008G/DBSCAN.ppt • B. Fritzke: A Growing Neural Gas Network Learns Topologies. • Jose Alfredo F. Costa and Ricardo S. Oliveira: Cluster Analysis Using Growing Neural Gas and Graph Partitioning.
