
Probabilistic Latent Semantic Analysis as a Potential Method for Integrating Spatial Data Concepts

R.A. Wadsworth (1), A.J. Comber (2), P.F. Fisher (2). (1) Centre for Ecology and Hydrology, Lancaster, UK; (2) Dept of Geography, Leicester University, UK.


Presentation Transcript


  1. Probabilistic Latent Semantic Analysis as a Potential Method for Integrating Spatial Data Concepts. R.A. Wadsworth (1), A.J. Comber (2), P.F. Fisher (2). (1) Centre for Ecology and Hydrology, Lancaster, UK; (2) Dept of Geography, Leicester University, UK

  2. Motivation: We want to understand how the environment is changing, but natural resource inventories constantly develop new baselines. We therefore want some way to know how similar two categories are, so that we can decide whether an inconsistency is change or error.

  3. Earlier approaches: First we simply asked domain experts, “are ‘a’ and ‘b’ similar, dissimilar, or are you not sure?” But the domain expert has to make lots of choices, domain experts are sometimes not available, you don’t know why they think concepts are similar (or not), etc. So we moved to (very) simple text mining: the more words two category descriptions have in common, the more similar the categories are.
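
The word-overlap idea on this slide can be sketched in a few lines. This is only an illustration, not the authors' method: the slide does not say how overlap was scored, so the Jaccard index (shared words divided by all distinct words) is an assumption, and the three category descriptions are invented for the example.

```python
import re

def tokens(text):
    """Lower-case a description and return its set of distinct words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def overlap_similarity(a, b):
    """Jaccard index (an assumed scoring rule): shared words / all distinct words."""
    wa, wb = tokens(a), tokens(b)
    if not wa and not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

# Hypothetical category descriptions, invented for illustration only.
heath = "open land dominated by dwarf shrubs and heather"
moor = "open upland dominated by grasses and heather"
urban = "built up areas with roads and buildings"

print(overlap_similarity(heath, moor))   # the two descriptions share five words
print(overlap_similarity(heath, urban))  # only "and" is shared
```

Note that the second pair scores above zero only because of the word "and", which shows why such a score says little about why two categories look similar.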

  4. Case Study: In the proceedings we use land-cover categories, but we’re all here because of Andrew ... So, what does his writing tell us about the underlying concepts behind his work?

  5. Case Study – the data: We used the English-language abstracts from the papers provided on his web site. This is a biased sample: do the other papers contain concepts not covered by the English-language work? Do they contain collaborations I’ve missed? However, we just want to illustrate the process ...

  6. Case Study – the data [Figure: red dots are collaborators; blue squares are papers in this analysis]

  7. Text Mining Andrew’s Abstracts: “Object orientated modelling in GIS”; “Processes in cadastre”; “A formal model of correctness in cadastre”; “Surveying mapping and LIS education in the USA”; “Surveying education for the future”; “Expert systems for GIS”

  8. Guessing what the axes mean

  9. Guessing what the axes mean

  10. Why latent analysis? If we knew what the underlying (hidden, latent) concepts were, we might be able to understand why two categories are considered similar.

  11. Probabilistic Latent Semantic Analysis: It is a “generative model”. It assumes that documents describe themes and that words are associated with themes. We observe the frequency of words in documents: $P(d,w) = P(d)\sum_{z \in Z} P(w \mid z)\,P(z \mid d)$, where d is a document, w a word and z a latent theme. We therefore try to model which latent variables (the z’s) exist.
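
To make the formula on this slide concrete, here is a minimal numerical sketch. All the probabilities below are invented toy values (two documents, two themes, three words); nothing here comes from the case-study data.

```python
import numpy as np

# Hypothetical toy parameters: 2 documents, 2 latent themes, 3 words.
p_d = np.array([0.6, 0.4])                    # P(d)
p_z_given_d = np.array([[0.9, 0.1],           # P(z|d), one row per document
                        [0.2, 0.8]])
p_w_given_z = np.array([[0.7, 0.2, 0.1],      # P(w|z), one row per theme
                        [0.1, 0.3, 0.6]])

# P(d,w) = P(d) * sum_z P(w|z) P(z|d); the matrix product performs the sum over z.
p_dw = p_d[:, None] * (p_z_given_d @ p_w_given_z)

print(p_dw)        # joint probability of each (document, word) pair
print(p_dw.sum())  # a valid joint distribution sums to 1
```

Each document is a mixture of themes, so a word can be probable in a document through more than one theme.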

  12. Probabilistic Latent Semantic Analysis: In practice similar to clustering, but ... “Documents are not assigned to clusters, they are characterized by a specific mixture of factors with weights P(z|d). These mixing weights offer more modelling power and are conceptually very different from posterior probabilities in clustering models and (unsupervised) naive Bayes models.” (Thomas Hofmann, 1999)

  13. PLSA – iterative, stochastic
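
The slide does not say which fitting procedure was used. As a hedged sketch, the block below assumes plain expectation-maximisation (EM) from a random starting point, which is one common way to fit PLSA; the function name `plsa`, its parameters and the toy count matrix are all hypothetical.

```python
import numpy as np

def plsa(counts, n_themes, n_iter=100, seed=0):
    """Fit PLSA by plain EM (an assumed procedure).

    counts is a documents x words matrix of word frequencies.
    Returns P(z|d) (documents x themes) and P(w|z) (themes x words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random starting point: this is where the stochastic element enters.
    p_z_d = rng.random((n_docs, n_themes))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_themes, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]   # docs x themes x words
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)
        # M-step: re-estimate both tables from expected counts n(d,w) * P(z|d,w).
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

# Invented toy data: 4 documents x 4 words.
counts = np.array([[4, 3, 0, 0],
                   [5, 2, 0, 1],
                   [0, 0, 3, 4],
                   [0, 1, 4, 2]], dtype=float)
p_z_d, p_w_z = plsa(counts, n_themes=2)
print(np.round(p_z_d, 2))  # theme mixture for each document
```

Because the start is random, different seeds can converge to different solutions, so it is worth running several seeds and comparing the themes.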

  14. Nine Latent Themes in Andrew’s Work: Cadastral systems, metadata and cartography? (themes “A”, “B”, “C”)

  15. Latent Themes in Andrew’s Work: Education and Technology? (themes “D”, “E”)

  16. Latent Themes in Andrew’s Work: Decisions and Directions? (themes “F”, “G”)

  17. Latent Themes in Andrew’s Work: Data? (themes “H”, “I”)

  18. Conclusions: Simple text mining allows you to relate categories to each other, but it is not always easy to say why. PLSA gives some indication of the underlying (fundamental?) themes, but how stable or useful are the results ...?

  19. Thank you
