Canopy Clustering Given a distance measure and two threshold distances T1>T2,

fox


  1. Canopy Clustering. Given a distance measure and two threshold distances T1 > T2: (1) Determine canopy centers: go through the list of input points to form a list of "clusterCenters". If a point is within T2 of a point already in clusterCenters, ignore it; otherwise, append the point to clusterCenters. (2) Determine canopy membership: for each point in the input set, if the point is within T1 of a cluster center, then the point is a member of the corresponding canopy. Because T1 > T2, canopies can overlap, so a point may belong to more than one canopy.
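The two steps on this slide can be sketched in Python as follows. This is a minimal illustration, not the presenter's code: the function names `canopy_centers` and `canopy_membership`, the use of Euclidean distance, and the sample points are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    # Assumed distance measure; the slide leaves the choice open.
    return math.dist(a, b)

def canopy_centers(points, t2, distance=euclidean):
    """Step 1: keep a point as a center unless it is within T2 of an existing center."""
    centers = []
    for p in points:
        if all(distance(p, c) > t2 for c in centers):
            centers.append(p)
    return centers

def canopy_membership(points, centers, t1, distance=euclidean):
    """Step 2: a point joins every canopy whose center is within T1 of it."""
    canopies = {i: [] for i in range(len(centers))}
    for p in points:
        for i, c in enumerate(centers):
            if distance(p, c) <= t1:
                canopies[i].append(p)
    return canopies

# Hypothetical sample data with T1 = 2.0 > T2 = 1.0.
points = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0), (3.2, 0.1), (1.6, 0.0)]
centers = canopy_centers(points, t2=1.0)
canopies = canopy_membership(points, centers, t1=2.0)
print(centers)
print(canopies)
```

Note that the result of step 1 depends on the order in which the input points are visited, since earlier points suppress later ones that lie within T2.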

  2. Combine Canopy and kMeans or EM. Only calculate distances for points that share a canopy with the centroid (assign infinite distance to points outside the canopies containing the centroid).
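A rough sketch of how the k-means assignment step might use canopies to skip distance calculations is shown below. The helper names `canopy_ids` and `assign_points`, the Euclidean distance, and the sample data are assumptions for illustration; for brevity the sketch recomputes canopy membership on each call, where a real implementation would compute it once up front.

```python
import math

def canopy_ids(x, canopy_centers, t1):
    # Indices of all canopies whose center lies within T1 of x.
    return {i for i, c in enumerate(canopy_centers) if math.dist(x, c) <= t1}

def assign_points(points, centroids, canopy_centers, t1):
    """One k-means assignment step restricted to shared canopies."""
    centroid_canopies = [canopy_ids(c, canopy_centers, t1) for c in centroids]
    assignments = []
    for p in points:
        p_canopies = canopy_ids(p, canopy_centers, t1)
        best, best_dist = None, math.inf
        for k, c in enumerate(centroids):
            if p_canopies & centroid_canopies[k]:
                d = math.dist(p, c)   # point and centroid share a canopy
            else:
                d = math.inf          # no shared canopy: skip the real distance
            if d < best_dist:
                best, best_dist = k, d
        # best stays None if the point shares no canopy with any centroid.
        assignments.append(best)
    return assignments

# Hypothetical data: two canopies, two centroids, four points.
canopy_centers = [(0.0, 0.0), (10.0, 0.0)]
centroids = [(1.0, 0.0), (9.0, 0.0)]
points = [(0.5, 0.0), (2.0, 0.0), (9.5, 0.0), (11.0, 0.0)]
assignments = assign_points(points, centroids, canopy_centers, t1=3.0)
print(assignments)
```

In this example each point only ever has its distance computed to the one centroid it shares a canopy with, which is where the savings come from when there are many centroids.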

  3. Canopy Clustering with MR. Given a distance metric and the tighter threshold T2. Mapper: start with an empty set of canopyCenters. For each x in inputData, if x is farther than T2 from every member of canopyCenters, then add x to canopyCenters and emit (1, x). Reducer: start with an empty set of canopyCenters. Input = (key, iterator over the mappers' cluster centers). For each x in the iterator, if x is farther than T2 from every member of canopyCenters, then add x to canopyCenters and emit (1, x). Because every mapper emits the same key, a single reducer sees all candidate centers. This results in a list of canopy centers to be used for determining canopy membership.
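The mapper and reducer described on this slide can be simulated in plain Python without a MapReduce framework. This is only a sketch: the names `select_centers`, `mapper`, and `reducer`, the in-memory "shuffle", the Euclidean distance, and the input splits are all assumptions for the example.

```python
import math

def select_centers(candidates, t2):
    # Keep a candidate only if it is farther than T2 from every center kept so far.
    centers = []
    for x in candidates:
        if all(math.dist(x, c) > t2 for c in centers):
            centers.append(x)
    return centers

def mapper(input_split, t2):
    # Each mapper finds local canopy centers and emits them under the single key 1.
    for x in select_centers(input_split, t2):
        yield (1, x)

def reducer(key, values, t2):
    # The reducer repeats the same selection over all mapper-emitted centers.
    for x in select_centers(values, t2):
        yield (key, x)

# Hypothetical input, divided into two splits (one per mapper).
splits = [
    [(0.0, 0.0), (0.4, 0.0), (5.0, 0.0)],
    [(0.3, 0.0), (5.2, 0.0), (9.0, 0.0)],
]
t2 = 1.0

mapped = [pair for split in splits for pair in mapper(split, t2)]
grouped = [x for key, x in mapped if key == 1]   # stand-in for the shuffle phase
final_centers = [x for _, x in reducer(1, iter(grouped), t2)]
print(final_centers)
```

Here the second mapper emits (0.3, 0.0) and (5.2, 0.0) because it cannot see the first mapper's centers; the reducer pass is what removes those near-duplicates.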