
Swarming Agents for Discovering Clusters in Spatial Data

G. Folino, A. Forestiero, G. Spezzano {folino,forestiero,spezzano}@icar.cnr.it. Second International Symposium on Parallel and Distributed Computing, Ljubljana, Slovenia · 13-14 October 2003.


Presentation Transcript


  1. Swarming Agents for Discovering Clusters in Spatial Data G. Folino, A. Forestiero, G. Spezzano {folino,forestiero,spezzano}@icar.cnr.it Second International Symposium on Parallel and Distributed Computing Ljubljana, Slovenia · 13-14 October 2003

  2. Outline • Introduction • Swarm intelligence • Flocking algorithm • Clustering and spatial datasets • Sparrow-SNN • Experimental results • Conclusions and Future Work

  3. Swarm Intelligence • Swarm Intelligence (SI) is the property of a system whereby the collective behaviors of (unsophisticated) agents interacting locally with their environment cause coherent functional global patterns to emerge. • A swarm has the following interesting properties: • Distributed, without central control • Ability to change the environment • Stigmergy (indirect communication via interaction with the environment) • Fault tolerance • Adaptivity and self-organization • Typical examples are ant colonies, flocks of birds, etc.

  4. Flocking algorithm • Typical example of emergent collective behavior. • No global control • Every agent has limited visibility • The collective behavior emerges only from local interaction, following these three simple rules: • Separation • Alignment • Cohesion
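The three rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `flock_step`, the agent representation, the visibility `radius`, the separation distance `min_dist`, and the rule weights are all assumptions introduced for the example.

```python
import math

def flock_step(agents, i, radius=5.0, min_dist=1.0,
               w_sep=1.0, w_ali=1.0, w_coh=1.0):
    """Return a new velocity (vx, vy) for agent i.

    agents is a list of dicts with 'pos' and 'vel' as (x, y) tuples.
    All names and default parameters are hypothetical.
    """
    me = agents[i]
    px, py = me["pos"]
    # Limited visibility: only agents within `radius` are considered.
    neighbors = [a for j, a in enumerate(agents)
                 if j != i and math.dist(me["pos"], a["pos"]) <= radius]
    if not neighbors:
        return me["vel"]
    n = len(neighbors)

    # Cohesion: steer toward the centroid of the visible neighbors.
    coh_x = sum(a["pos"][0] for a in neighbors) / n - px
    coh_y = sum(a["pos"][1] for a in neighbors) / n - py

    # Alignment: steer toward the average velocity of the neighbors.
    ali_x = sum(a["vel"][0] for a in neighbors) / n - me["vel"][0]
    ali_y = sum(a["vel"][1] for a in neighbors) / n - me["vel"][1]

    # Separation: move away from neighbors closer than `min_dist`.
    sep_x = sep_y = 0.0
    for a in neighbors:
        if math.dist(me["pos"], a["pos"]) < min_dist:
            sep_x += px - a["pos"][0]
            sep_y += py - a["pos"][1]

    vx = me["vel"][0] + w_sep * sep_x + w_ali * ali_x + w_coh * coh_x
    vy = me["vel"][1] + w_sep * sep_y + w_ali * ali_y + w_coh * coh_y
    return (vx, vy)
```

Each agent reads only its neighbors' positions and velocities, so no global controller is needed; the flock-level motion comes from repeating this local update for every agent.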

  5. Flocking algorithm • Agents can have an exploratory behavior: • First, agents search for a goal of particular interest • Then, the other flock members are driven towards the goal in order to explore interesting areas more carefully.

  6. Clustering • Clustering divides objects into groups (clusters) so that the members of a cluster are as similar as possible, whereas the members of different clusters differ as much as possible from each other. • Spatial clustering should identify clusters of different dimension, size, shape and density, which is particularly difficult.

  7. Clustering • [Figure: a spatial dataset with clusters of different density]

  8. SNN algorithm (1) • SNN is based on the famous Jarvis-Patrick algorithm. • It identifies the K nearest neighbors of each object (data point) in the dataset. • Two objects i and j join the same cluster if: 1) i is one of the K nearest neighbors of j; 2) j is one of the K nearest neighbors of i; 3) i and j have at least Kmin of their K nearest neighbors in common, where K and Kmin are user-defined parameters. • For each pair of points i and j, a link with an associated weight is defined. • The connectivity of a data point is computed as the sum of the weights of its outgoing links.
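The link and connectivity definitions on this slide can be illustrated with a brute-force Python sketch. The slide does not say how the link weight is computed, so the example assumes the weight is simply the number of shared nearest neighbors; the function names `knn`, `snn_links`, and `connectivity` are hypothetical.

```python
import math

def knn(points, k):
    """Index sets of the k nearest neighbors of each point (brute force)."""
    result = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        result.append(set(others[:k]))
    return result

def snn_links(points, k, k_min):
    """Map each linked pair (i, j), i < j, to its weight.

    A link exists when i and j are in each other's k-nearest-neighbor
    lists and share at least k_min neighbors. The weight is assumed to be
    the count of shared neighbors (an assumption, not from the slides).
    """
    nn = knn(points, k)
    links = {}
    for i in range(len(points)):
        for j in nn[i]:
            if i < j and i in nn[j]:
                shared = len(nn[i] & nn[j])
                if shared >= k_min:
                    links[(i, j)] = shared
    return links

def connectivity(links, n):
    """Sum of the weights of the links incident to each of the n points."""
    conn = [0] * n
    for (i, j), w in links.items():
        conn[i] += w
        conn[j] += w
    return conn
```

For example, with four points on the corners of a unit square and one distant outlier, the four square points link to each other while the outlier, which is in nobody's neighbor list, ends up with connectivity 0.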

  9. SNN algorithm (2) • For every node (data point), calculate the connectivity; • Identify representative points by choosing the points that have high connectivity ( > core_threshold); • Identify noise points by choosing the points that have low connectivity ( < noise_threshold) and remove them; • Remove all links between points whose weight is smaller than a threshold (merge_threshold); • Take connected components of points to form clusters, where every point in a cluster is either a representative point or is connected to a representative point.
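The steps on this slide can be sketched as follows, given a dictionary of weighted links and a list of per-point connectivity values. This is an illustrative reading of the slide rather than the authors' code: `snn_clusters` is a hypothetical name, and the sketch simplifies the last step by keeping every connected component that contains at least one representative point.

```python
def snn_clusters(links, conn, core_threshold, noise_threshold,
                 merge_threshold):
    """Return (clusters, noise) from weighted links and connectivities.

    links: dict mapping (i, j) pairs to a link weight.
    conn:  list of connectivity values, one per point.
    """
    n = len(conn)
    # Representative points have high connectivity.
    core = {i for i in range(n) if conn[i] > core_threshold}
    # Noise points have low connectivity and are removed.
    noise = {i for i in range(n) if conn[i] < noise_threshold}

    # Keep only links at or above merge_threshold between non-noise points.
    adj = {i: set() for i in range(n) if i not in noise}
    for (i, j), w in links.items():
        if w >= merge_threshold and i in adj and j in adj:
            adj[i].add(j)
            adj[j].add(i)

    # Connected components via depth-first search.
    clusters, seen = [], set()
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        # Simplification: keep components with a representative point.
        if comp & core:
            clusters.append(sorted(comp))
    return clusters, sorted(noise)
```

Points that are neither noise nor part of a component with a representative point are left unassigned in this sketch.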

  10. SPARROW-SNN • Sparrow-SNN combines the stochastic search of an adaptive flocking algorithm with SNN to discover clusters in spatial data. • It uses a variant of the flocking algorithm: • First, agents search for a goal of particular interest • Then, the other flock members are driven towards the goal in order to explore interesting areas more carefully. • We used Swarm, a software package for multi-agent simulation of complex systems, for the implementation of Sparrow-SNN.

  11. SPARROW-SNN Pseudo-code of the algorithm

  12. SPARROW-SNN • N agents are generated randomly in the search space. • When an agent falls on a data point not previously explored, it computes the connectivity of that point. • Depending on the connectivity, agents take different colors: conn > core_threshold -> mycolor = red; noise_threshold < conn <= core_threshold -> mycolor = green; 0 < conn < noise_threshold -> mycolor = yellow; conn = 0 -> mycolor = white • An agent's color thus indicates a representative point (red), a border point (green), noise (yellow), or an obstacle (white). • Red and white agents stop and signal to the others the interesting and desert regions, respectively.
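The color rule on this slide maps directly to a small function. The sketch below uses the hypothetical name `agent_color`; note that the slide leaves the case conn = noise_threshold unspecified, and the example assumes such points are treated as yellow.

```python
def agent_color(conn, core_threshold, noise_threshold):
    """Color of an agent from the connectivity of the point it landed on."""
    if conn > core_threshold:
        return "red"      # representative point: agent stops, attracts others
    if conn > noise_threshold:
        return "green"    # border point: agent keeps moving, slowly
    if conn > 0:
        # Assumption: conn == noise_threshold also falls here.
        return "yellow"   # noise: agent keeps moving, quickly
    return "white"        # empty region: agent stops, repels others
```

For instance, with core_threshold = 5 and noise_threshold = 2, a connectivity of 10 gives red, 5 gives green, 1 gives yellow, and 0 gives white.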

  13. SPARROW-SNN • Yellow and green agents move following the modified rules of the flock (with repulsion from white agents and attraction towards red agents). • In addition, yellow agents move quickly (they are in uninteresting zones), whereas green agents move slowly. • Red agents (placed on a representative point) run the merge procedure, which includes in the final cluster the discovered representative point together with the points that share with it a significant number (greater than Pmin) of neighbors.

  14. Experimental results (datasets)

  15. Experimental results (clusters found)

  16. Experimental results (random search vs Sparrow-SNN) a) GEORGE b) North-East

  17. Experimental results (scalability)

  18. Conclusions and Future Work • Sparrow-SNN is able to discover clusters of arbitrary shape, size and density in spatial data. • It performs approximate clustering well. • It is naturally distributed, fault tolerant and scalable. • We are working on implementing a new version of Sparrow using Anthill, a peer-to-peer multi-agent system based on JXTA.
