
GEOINFO 2006

GEOINFO 2006. Utilização da biblioteca TerraLib para algoritmos de agrupamento em Sistemas de Informações Geográficas. Use of the TerraLib library for clustering algorithms in Geographic Information Systems. Mauricio P. Guidini Carlos H. C. Ribeiro. Supervisor. Nov 2006.


Presentation Transcript


  1. GEOINFO 2006. Utilização da biblioteca TerraLib para algoritmos de agrupamento em Sistemas de Informações Geográficas. Use of the TerraLib library for clustering algorithms in Geographic Information Systems. Mauricio P. Guidini Carlos H. C. Ribeiro. Supervisor. Nov 2006.

  2. “... 3000 unregistered flights, with origin and destination unknown to the authorities, invaded Brazilian airspace in the first ten months of this year. The Air Force estimates that about 30% of these flights were related to drug trafficking ...” Translated from a note dated 25/10/2004

  3. Data Mining in GIS Objective To present the integration of a Data Mining algorithm (k-means) into TerraLib/TerraView, forming a Geographic Information System for Unknown Air Traffic analysis (GisTAD).

  4. Data Mining in GIS • Summary • Data Mining • Clustering Algorithms • Air Traffic • K-means Implementation • Results • Applications

  5. Data Mining in GIS Data Mining Definition: “A non-trivial process of identifying valid, new, useful patterns implicitly present in large volumes of data” Knowledge Discovery in Databases (KDD) - Fayyad et al. (1996)

  6. Data Mining in GIS • How is DM carried out? • KDD process

  7. Data Mining in GIS Clustering Algorithms The clustering process tries to partition the data into groups whose members have highly similar features, which helps in understanding the information they hold. A good clustering algorithm produces high-quality classes, where the intraclass similarity is high and the interclass similarity is low. [Han & Kamber 2001]
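The quality criterion on this slide can be checked numerically. The following is a minimal sketch, assuming Euclidean distance as the (dis)similarity measure and a hypothetical helper name `mean_intra_inter`; neither comes from the presentation. A good clustering should give a small mean distance within classes and a large mean distance between classes.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def mean_intra_inter(points, labels):
    """Return (mean distance within clusters, mean distance between clusters)."""
    intra, inter = [], []
    # Compare every unordered pair of points once.
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        (intra if lp == lq else inter).append(dist(p, q))

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(intra), mean(inter)


# Illustrative data: two tight groups that are far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
intra, inter = mean_intra_inter(points, labels)
print(intra, inter)
```

A low first value together with a high second value corresponds to high intraclass similarity and low interclass similarity.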

  8. Data Mining in GIS • Major Categories • Partitioning – k-means, k-medoids • Hierarchical – CURE, BIRCH • Density-based – DBSCAN, OPTICS • Grid-based – STING • Model-based • Others • ANN – Kohonen network • Incremental - Leader

  9. Data Mining in GIS • Air Traffic Movement of aircraft, national or foreign, flying over national territory. • Unknown Air Traffic For aircraft that cannot be identified (by flight plan), two courses of action can be taken [Bernabeu 2004]: • Intercept; or • Generate an Unknown Air Traffic Report

  10. Data Mining in GIS Traffic Representation • Line segments • Latitude (decimal degrees) • Longitude (decimal degrees) • Distance (miles) • Heading Restrictions • Acceptable deviations

  11. Data Mining in GIS K-means algorithm
      Precondition: set maximum deviation values for coordinates, distance and route
      Begin:
        K = 0
        While criterion condition not satisfied (deviation in clusters)
          Increase K
          Arbitrarily choose K centers (among data objects)
          While centers change (k-means)
            (re)assign routes to clusters based on weights
            update center values
          end (no movement between groups)
        end (deviation in groups OK)
        Save results
      End
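The loop on this slide can be sketched in Python. This is an illustration under stated assumptions, not the GisTAD implementation: each route is a tuple (latitude, longitude, distance, heading), the "weights" are taken to be the inverse of each feature's maximum deviation, heading is treated as an ordinary number (no wrap-around at 360 degrees), and all function names are hypothetical.

```python
import random


def weighted_sq_dist(p, c, max_dev):
    # Assumed weighting: scale each feature difference by its allowed deviation.
    return sum(((a - b) / m) ** 2 for a, b, m in zip(p, c, max_dev))


def kmeans(points, k, max_dev, rng, max_iter=100):
    centers = rng.sample(points, k)  # arbitrary centers among the data objects
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda j: weighted_sq_dist(p, centers[j], max_dev))
            for p in points
        ]
        if new_assignment == assignment:  # no route moved between groups
            break
        assignment = new_assignment
        for j in range(k):  # update each center to the mean of its members
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment


def deviations_ok(points, centers, assignment, max_dev):
    # Every feature of every route must lie within max_dev of its cluster center.
    return all(
        abs(a - b) <= m
        for p, j in zip(points, assignment)
        for a, b, m in zip(p, centers[j], max_dev)
    )


def cluster_until_ok(points, max_dev, seed=0):
    rng = random.Random(seed)
    for k in range(1, len(points) + 1):  # increase K until the criterion holds
        centers, assignment = kmeans(points, k, max_dev, rng)
        if deviations_ok(points, centers, assignment, max_dev):
            return centers, assignment
    raise ValueError("no points to cluster")


# Illustrative routes: (latitude, longitude, distance, heading).
routes = [
    (0.0, 0.0, 10.0, 90.0),
    (0.1, 0.1, 11.0, 92.0),
    (5.0, 5.0, 50.0, 180.0),
    (5.1, 5.0, 51.0, 181.0),
]
max_dev = (0.5, 0.5, 5.0, 10.0)
centers, assignment = cluster_until_ok(routes, max_dev)
print(len(centers), assignment)
```

Because K grows one step at a time and k-means is rerun for each K, the cost rises quickly with the number of records, which is consistent with the running-time concern raised in the conclusion.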

  12. Data Mining in GIS Distance Measure • Minimize deviations • Improve cluster quality and
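The formulas on this slide did not survive in the transcript, so the following is only a hedged illustration of one possible deviation measure between two routes, not the authors' definition. It assumes routes are tuples (latitude, longitude, distance, heading), that heading differences wrap around at 360 degrees, and that each difference is scaled by its acceptable deviation; `heading_diff` and `route_deviation` are hypothetical names.

```python
def heading_diff(a, b):
    """Smallest angular difference between two headings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def route_deviation(r1, r2, max_dev):
    """Largest feature deviation between two routes, scaled by its limit.

    A result of at most 1.0 means every feature is within its acceptable
    deviation (assumed convention).
    """
    lat = abs(r1[0] - r2[0]) / max_dev[0]
    lon = abs(r1[1] - r2[1]) / max_dev[1]
    length = abs(r1[2] - r2[2]) / max_dev[2]
    heading = heading_diff(r1[3], r2[3]) / max_dev[3]
    return max(lat, lon, length, heading)


# Headings of 350 and 10 degrees are only 20 degrees apart, not 340.
limits = (0.5, 0.5, 5.0, 10.0)
score = route_deviation((0.0, 0.0, 10.0, 350.0), (0.1, 0.0, 11.0, 10.0), limits)
print(score)
```

Treating heading as a circular quantity avoids splitting routes that point almost due north into separate clusters.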

  13. Data Mining in GIS GIS Integration • TerraLib • TerraView • k-means

  14. Data Mining in GIS • Data preparation • 8000 records • looking for information (what?) • Search space restrictions

  15. Data Mining in GIS • Numeric Tests • up to 500 records • GisTAD Tests • 319 records • 73 groups • Approx. time = 40 sec.

  16. TerraView

  17. TerraView

  18. Data Mining in GIS Applications • Air Operations • Improper use of air space

  19. Data Mining in GIS Conclusion Considering the problem proposed, the k-means algorithm is applicable, and returned a good set of clusters. However, the number of records that must be clustered can make the application of the algorithm very time consuming.

  20. Future Work Other clustering algorithms should be implemented to determine which one is the most efficient for the problem under analysis, for any number of records to be clustered. The algorithms to be tested are: • Kohonen neural network; • Leader algorithm.
