
Geographical Data Mining

Geographical Data Mining. Stan Openshaw, Centre for Computational Geography, University of Leeds. BUT: Ian Turton, CCG, Leeds University. For the latest on Stan: http://www.geog.leeds.ac.uk/staff/s.openshaw/latest.html


Presentation Transcript


  1. Geographical Data Mining. Stan Openshaw, Centre for Computational Geography, University of Leeds

  2. BUT Ian Turton, CCG, Leeds University. For the latest on Stan: http://www.geog.leeds.ac.uk/staff/s.openshaw/latest.html

  3. Why would we want to do this? • Geographical Data Explosion • Public imperative • Lack of geographically aware tools

  4. Mountains of Data

  5. Swamps of Data

  6. We know what you spend...

  7. …where you spend it...

  8. …who you talk to...

  9. …where you live... (LS2 9JT) What your neighbours are like

  10. ...Crime data and... • crime type • crime location • insurance data

  11. ...Health data • environmental data • socio-economic data • admissions data

  12. Geographical Hyperspace
      • Geography: x,y co-ordinates, postcodes
      • Time: days, hours, months
      • Attributes
        • place: pollution sources, soil type, distance to motorway
        • cases: type of disease, age, sex

  13. Data Mining

  14. Turning data into knowledge • How do these data sets fit together? • Is there anything important hidden in here? • Does geography make a difference?

  15. Datatype / Nature of Data Interaction
      1. spatial data
      2. time data
      3. multiple attribute data
      4. geography and time data
      5. time and multiple attribute data
      6. geography and multiple attribute data
      7. geography, time, and multiple attribute data
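
The seven data types on this slide appear to be every non-empty combination of three dimensions: geography, time, and multiple attributes (2^3 - 1 = 7). The short sketch below enumerates them under that reading; the dimension names and the function name are illustrative, and the output order differs slightly from the slide's numbering.

```python
from itertools import combinations

# Assumed reading of the slide: three dimensions of geographical data.
DIMENSIONS = ("geography", "time", "multiple attributes")


def interaction_types(dimensions=DIMENSIONS):
    """Return every non-empty subset of the dimensions, smallest first."""
    return [
        combo
        for size in range(1, len(dimensions) + 1)
        for combo in combinations(dimensions, size)
    ]


types = interaction_types()
for number, combo in enumerate(types, start=1):
    print(number, " + ".join(combo))
```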

  16. HISTORICALLY these effects have been hidden by research design BUT

  17. BUT

  18. The result is often data strangulation. The patterns are being destroyed or damaged by the research design

  19. What is needed is a geographic data mining technology that works

  20. How can we do this?
      • Developing new smarter methods
      • Testing them
        • HPC is vital to this process
      • Disseminating them
        • Internet
        • Java

  21. Being SMART is not just a matter of methodology but also involves access, usability, relevancy, and result communication factors

  22. The complete novice should be able to perform some sophisticated geographical analysis and get some useful and understandable results on the same day the work started

  23. User Friendly Spatial Analysis • provides analysis that users need • simple to perform • highly automated making it fast and efficient • readily understood • results are self-evident and can be communicated to non-experts • safe and trustworthy

  24. What we did in this study
      • Comparison of techniques on the same data
      • Multiple techniques
        • GAM/K
        • GAM/K-T
        • MAPEX
        • GDM1/2
        • FLOCK
        • Proprietary Data Mining Tools

  25. Study Area

  26. Stan’s Cases

  27. Chris’ cases

  28. How to search the geographic space
      • Exhaustively
        • GAM, GEM
      • Smartly
        • Genetic algorithm
          • mapex, gdm
        • Flocking
          • boids
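
To make the "exhaustive" option concrete, here is a minimal sketch of a GAM-style scan: overlapping circles of several radii are laid over a regular grid, and each circle is flagged when its case count is improbably high under a Poisson model. This is a simplified illustration, not the GAM/K code used in the study; the function names, the half-radius grid step, the Poisson test, and the toy data are all assumptions.

```python
import math


def poisson_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return max(0.0, 1.0 - below)


def gam_style_scan(cases, population, bounds, radii, alpha=0.01):
    """Test every circle on a grid, for every radius, for excess cases.

    cases:      list of (x, y) case locations
    population: list of (x, y, people_at_risk) points
    bounds:     (xmin, ymin, xmax, ymax) of the study area
    """
    xmin, ymin, xmax, ymax = bounds
    total_at_risk = sum(p for _, _, p in population)
    overall_rate = len(cases) / total_at_risk
    flagged = []
    for r in radii:
        step = r / 2  # assumed spacing, chosen so neighbouring circles overlap
        nx = int((xmax - xmin) / step) + 1
        ny = int((ymax - ymin) / step) + 1
        for i in range(nx):
            for j in range(ny):
                cx, cy = xmin + i * step, ymin + j * step
                observed = sum(
                    1 for x, y in cases if math.hypot(x - cx, y - cy) <= r
                )
                at_risk = sum(
                    p for x, y, p in population
                    if math.hypot(x - cx, y - cy) <= r
                )
                expected = overall_rate * at_risk
                if observed > 0 and expected > 0:
                    if poisson_tail(observed, expected) < alpha:
                        flagged.append((cx, cy, r, observed, expected))
    return flagged


# Toy data (invented): uniform population, one tight cluster near (2, 2).
population = [(x, y, 100) for x in range(11) for y in range(11)]
cluster = [(2.0, 2.0), (2.1, 1.9), (1.9, 2.1), (2.2, 2.0),
           (2.0, 2.2), (1.8, 2.0), (2.0, 1.8), (2.1, 2.1)]
scattered = [(7.0, 8.0), (9.0, 1.0), (5.0, 5.0), (0.5, 9.0)]
flagged = gam_style_scan(cluster + scattered, population,
                         bounds=(0, 0, 10, 10), radii=[1.0, 2.0])
print(len(flagged), "circles flagged")
```

The cost grows with the number of grid points, radii, and records, which is one reason the smarter searches on this slide (genetic algorithms, flocking) are attractive for larger problems.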

  29. GAM & GEM

  30. Mapex & GDM

  31. FLOCK

  32. And the Attributes...
      • Exhaustively
        • GAM, GEM
      • Smartly
        • Genetic algorithm
          • mapex, gdm, boids

  33. GAM & GEM with time

  34. Geology Map (regions labelled Rock A, Rock B, Rock C, Rock D)

  35. 2 km railway buffer polygon

  36. Combined Geology and Railway Buffer Map (regions Rock A, Rock B, Rock C, Rock D overlaid with the 2 km railway buffer)
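
The overlay in slides 34 to 36 can be sketched as a per-location combination of two attributes: the rock type and whether the location lies within 2 km of the railway. The code below is an illustrative assumption, not the study's GIS procedure: the railway is a simple polyline, coordinates are taken to be in kilometres, and all names and sample points are invented.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(p, polyline, buffer_km=2.0):
    """True if p lies within buffer_km of any segment of the polyline."""
    return any(
        distance_to_segment(p, a, b) <= buffer_km
        for a, b in zip(polyline, polyline[1:])
    )


def combined_class(rock, p, railway, buffer_km=2.0):
    """Combine rock type with buffer membership into one class label."""
    side = "within buffer" if within_buffer(p, railway, buffer_km) else "outside buffer"
    return (rock, side)


# Hypothetical railway and sample locations, coordinates in km.
railway = [(0.0, 0.0), (10.0, 0.0)]
sites = [("Rock A", (3.0, 1.0)), ("Rock A", (3.0, 5.0)), ("Rock B", (12.0, 1.0))]
labels = [combined_class(rock, p, railway) for rock, p in sites]
print(labels)
```

With four rock types and two buffer states, this overlay yields up to eight combined classes, which is how attribute combinations start to multiply.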

  37. Combinations of Attributes
      • If we have 8 attributes with 10 classes each
      • There are 3,160 combinations of 2 classes from 80, compared with 24,040,016 if any 5 are used
      • Smart searches are essential
        • use GA to generate possible combinations of interest
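
The two counts on this slide are binomial coefficients ("80 choose 2" and "80 choose 5"), treating the 8 x 10 = 80 classes as a single pool. A quick check, using Python's standard `math.comb` (Python 3.8 or later); the variable names are illustrative:

```python
import math

attributes = 8
classes_per_attribute = 10
total_classes = attributes * classes_per_attribute  # 80 classes in one pool

pairs = math.comb(total_classes, 2)   # ways to pick any 2 of the 80 classes
fives = math.comb(total_classes, 5)   # ways to pick any 5 of the 80 classes

print(pairs)           # 3160
print(fives)           # 24040016
print(fives // pairs)  # how many times larger the 5-class search space is
```

Note that pooling all 80 classes also counts pairs drawn from the same attribute; the point of the slide is the growth rate, which is what makes exhaustive search over attribute combinations impractical.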

  38. Proprietary Data Miners

  39. Results: How to visualise them?

  40. Results
      • GAM/K
        • did very well
        • was not put off by time or attributes
      • GAM/KT
        • worked well
        • time clusters found
      • MAPEX / GDM/1
        • worked well

  41. Results continued
      • FLOCK
        • worked very well
      • Data mining
        • didn’t work at all well out of the box
        • could have built a GAM inside them

  42. What next? • Build a harder data set for more tests • Re-run the analysis • Put it all on the web

  43. Thanks to • European Research Office of the US Army • ESRC grant R237260 for paying Ian’s salary. • ESRC/JISC for the Census data purchase. • OS for the bits of the maps they own.

  44. To find out more
      • Web based Multi-engine spatial analysis tools, James Macgill, Openshaw and Turton
        • Session 1A - 14.00 Sunday
      • Smart Crime Pattern Analysis using GAM, Ian Turton, Openshaw and Macgill
        • Session 7A - 10.40 Tuesday

  45. Contacts
      • Email: ian,stan,pgjm@geog.leeds.ac.uk
      • Check out smart pattern analysis on the web: http://www.ccg.leeds.ac.uk/smart and http://www.ccg.leeds.ac.uk/smart/hyper.doc
      • Latest news on Stan: http://www.geog.leeds.ac.uk/staff/s.openshaw/latest.html
