Explore how georeferencing improves search engines by linking spatial and textual information. Learn about annotation techniques for geographic data and the potential of XML-based formats like GML and KML. Dive into the INSPIRE initiative and enhance data visualization through context-specific representations. Discover the history of georeferencing and its impact on information retrieval.
Georeferencing of Search Results based on Annotated Data and Geographic Information Systems by Giw Aalam MPI, Department 5, Databases and Information Systems
Motivation(1) • „Whatever occurs, occurs in space and time.“ (Wegener 2000) • Navigation • Path of a hurricane • Market surveys • Environmental dynamics • Increasing market for geospatial information
Motivation(2) • Most georeferencing we encounter daily is in the form of place names: • ~70% of text documents contain place name references (MetaCarta Inc. 2005) • 49.69% of 5 million library catalog records of the University of California contain >1 place-related subject headings (Petras 2004) Time and space references are potentially important for accessing documents and knowledge
Motivation(3) • Most text-oriented search engines depend heavily on recognizing weighted keywords • They cannot find relevant results for queries like „tropical fruit“ and „Bodensee“; e.g. an article about apricots from the island of Mainau • Requires an appropriate knowledge base (thesauri/ontologies, gazetteers/geospatial information)
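A minimal sketch of the point made on this slide: plain keyword matching misses the article, while a lookup in a small knowledge base finds it. The dictionaries `THESAURUS` and `GAZETTEER`, the function names and the sample sentence are hypothetical and only mirror the slide's own example; they are not part of the prototype described in the talk.

```python
# Toy knowledge base (assumption): entries only mirror the slide's example.
THESAURUS = {"tropical fruit": {"apricot"}}   # concept -> narrower terms
GAZETTEER = {"bodensee": {"mainau"}}          # region -> contained places

ARTICLE = "Apricots ripen on the island of Mainau."


def keyword_only(query_terms, text):
    """Plain keyword search: every query term must occur literally."""
    lowered = text.lower()
    return all(term.lower() in lowered for term in query_terms)


def expand(term):
    """Return the term together with its thesaurus and gazetteer expansions."""
    key = term.lower()
    return {key} | THESAURUS.get(key, set()) | GAZETTEER.get(key, set())


def knowledge_based(query_terms, text):
    """Each query term matches if any of its expansions occurs in the text."""
    lowered = text.lower()
    return all(
        any(variant in lowered for variant in expand(term))
        for term in query_terms
    )


query = ["tropical fruit", "Bodensee"]
print(keyword_only(query, ARTICLE))
print(knowledge_based(query, ARTICLE))
```

A real system would use proper tokenisation and a full gazetteer instead of substring tests; the sketch only shows where the extra knowledge enters the matching step.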
Objective • Show the potential and the problems of using geodata / georeferenced data in the framework of information retrieval • Develop a prototype for a search engine, based on widely available data and tools (Wikipedia, Google Earth)
Georeferencing(1) • Translation between informal and formal representations of geographic locations • Informal references are used e.g. in discourse („Saarbrücken“, „the music store on Marktstraße“, …) • Formal representations are the basis for mathematical calculations such as distance, direction and spatial relationships in general (52° 31' N, 13° 25' E)
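As an illustration of why the formal representation matters, the following sketch converts a degrees/minutes coordinate such as the one on this slide into decimal degrees and computes a great-circle distance with the haversine formula. The function names, the spherical Earth radius of 6371 km and the approximate coordinates used for Saarbrücken are assumptions made for the example.

```python
import math


def dms_to_decimal(degrees, minutes, hemisphere, seconds=0.0):
    """Convert degrees/minutes/seconds plus hemisphere letter to decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # South and West are negative by convention.
    return -value if hemisphere.upper() in ("S", "W") else value


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km on a sphere (radius is an assumption)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Coordinate from the slide: 52° 31' N, 13° 25' E
slide_lat = dms_to_decimal(52, 31, "N")
slide_lon = dms_to_decimal(13, 25, "E")

# Approximate position assumed for Saarbrücken (illustrative only).
sb_lat = dms_to_decimal(49, 14, "N")
sb_lon = dms_to_decimal(7, 0, "E")

print(round(slide_lat, 4), round(slide_lon, 4))
print(round(haversine_km(slide_lat, slide_lon, sb_lat, sb_lon)), "km")
```

Neither calculation is possible on the informal place name alone, which is why a gazetteer lookup has to come first.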
Annotation(1) • A congruent, common understanding of geographic references is important for consistent annotation • What should be annotated? • What is relevant? • What aspects of geodata should be described? • How? • Context-specific versus cross-context
Annotation(2) • XML-based formats play an increasing role in the framework of GIS (Geographic Information Systems) for annotation/description and data exchange • Structured • Extensible; an XML Schema provides a binding definition of the information model that a document instance expresses • Possible to work with heterogeneous data
Annotation(3) • GML (Geography Markup Language) • Defined by the OGC (Open Geospatial Consortium) as a „standard“ format for modeling and exchange of spatial information • expected to be released as an international standard in 2007
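A small sketch of how a GML-encoded location might be read with Python's standard library. The sample fragment, the function name `read_gml_point` and the assumption that `gml:pos` holds latitude followed by longitude are illustrative; real GML data can use other coordinate reference systems and axis orders, which must be taken from `srsName`.

```python
import xml.etree.ElementTree as ET

# Namespace of GML 3.1 elements; the sample fragment below is hand-written.
GML_NS = {"gml": "http://www.opengis.net/gml"}

GML_POINT = """
<gml:Point xmlns:gml="http://www.opengis.net/gml" srsName="EPSG:4326">
  <gml:pos>52.5167 13.4167</gml:pos>
</gml:Point>
"""


def read_gml_point(xml_text):
    """Return (lat, lon) from a gml:Point, assuming 'lat lon' axis order."""
    root = ET.fromstring(xml_text.strip())
    pos = root.find("gml:pos", GML_NS)
    if pos is None or not pos.text:
        raise ValueError("no gml:pos element found")
    lat, lon = (float(part) for part in pos.text.split()[:2])
    return lat, lon


print(read_gml_point(GML_POINT))
```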
Annotation(5) • KML (Keyhole Markup Language) • Developed by Keyhole Corp. for the „EarthViewer“ tool • Keyhole was acquired by Google Inc. in 2004 • KML is now used in connection with Google Earth • Visualisation of georeferenced data • Description of geometric figures, pictures and locations • Definition of the view/perspective Examples later…
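For orientation, a KML placemark for a single search result could be generated roughly as follows. This is a sketch, not the prototype's code: the function name `placemark_kml` is hypothetical, and the namespace of the later OGC KML 2.2 version is assumed, which may differ from the KML version current when these slides were written. Note that KML lists coordinates as longitude, latitude, altitude.

```python
import xml.etree.ElementTree as ET

# Assumption: OGC KML 2.2 namespace.
KML_NS = "http://www.opengis.net/kml/2.2"


def _tag(name):
    """Qualify an element name with the KML namespace."""
    return f"{{{KML_NS}}}{name}"


def placemark_kml(name, lat, lon, description=""):
    """Build a KML document with one point placemark and return it as text."""
    ET.register_namespace("", KML_NS)  # serialise without a prefix
    kml = ET.Element(_tag("kml"))
    document = ET.SubElement(kml, _tag("Document"))
    placemark = ET.SubElement(document, _tag("Placemark"))
    ET.SubElement(placemark, _tag("name")).text = name
    if description:
        ET.SubElement(placemark, _tag("description")).text = description
    point = ET.SubElement(placemark, _tag("Point"))
    # KML coordinate order: longitude,latitude,altitude
    ET.SubElement(point, _tag("coordinates")).text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")


print(placemark_kml("Search result", 52.5167, 13.4167, "Example placemark"))
```

Saving the returned string to a `.kml` file should be enough for a viewer such as Google Earth to display the location.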
Infrastructure(1) Three-level architecture as envisaged by the EU initiative INSPIRE (Infrastructure for Spatial Information in Europe)
Infrastructure(2) • Bottom level: (meta-)data sources • Machine-readable; can be read and interpreted automatically • Middle level: (value-added) services • Independent of specific databases • Top level: user applications (GIS, browser, specialized services, …) Requirement for common interfaces!
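The layering can be sketched in a few lines. The following is an illustration of the idea of common interfaces only, not of any INSPIRE specification: all class and function names (`Feature`, `DataSource`, `InMemorySource`, `BoundingBoxService`, `render_names`) and the sample coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol


@dataclass(frozen=True)
class Feature:
    name: str
    lat: float
    lon: float


class DataSource(Protocol):
    """Bottom level: the common interface every data source must offer."""

    def features(self) -> Iterable[Feature]: ...


class InMemorySource:
    """One possible data source; a database or file could replace it."""

    def __init__(self, items: Iterable[Feature]) -> None:
        self._items = list(items)

    def features(self) -> Iterable[Feature]:
        return iter(self._items)


class BoundingBoxService:
    """Middle level: a service that only knows the DataSource interface."""

    def __init__(self, sources: Iterable[DataSource]) -> None:
        self._sources = list(sources)

    def query(self, south: float, west: float,
              north: float, east: float) -> List[Feature]:
        return [f for src in self._sources for f in src.features()
                if south <= f.lat <= north and west <= f.lon <= east]


def render_names(service: BoundingBoxService, south: float, west: float,
                 north: float, east: float) -> List[str]:
    """Top level: a user application that presents the service result."""
    return sorted(f.name for f in service.query(south, west, north, east))


source = InMemorySource([Feature("Berlin", 52.52, 13.41),
                         Feature("Saarbrücken", 49.23, 7.00)])
service = BoundingBoxService([source])
print(render_names(service, 52.0, 13.0, 53.0, 14.0))
```

Because the service depends only on the interface, another data source can be added without touching the middle or top level.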
Visualisation(1) • Context-specific representation / visualisation of results can effectively support the data-mining process • Geospatial relationships can often be deduced from an adequate visualisation • The match between mental model and cognitive style is often better than in a simple table view
Visualisation(2) Example: Identification of a pump on „Broad Street“ as the source of a cholera epidemic; Dr. John Snow, 1854, London
References(1) • Wegener M, Fotheringham A., 2000, Spatial models and GIS: New Potential and New Models, London, Taylor & Francis • Petras V., 2004, Statistical Analysis of Geographic and Language Clues in the MARC Record. Technical report for the „Going Places in the Catalog: Improved Geographical Access“ project, University of California, http://metadata.sims.berkeley.edu/papers/Marcplaces.pdf • MetaCarta Inc. 2005, MetaCarta corporate brochure, http://metacarta.com/docs/Corporate_Brochure_06_05.pdf
References(2) • „INSPIRE Architecture and Standards Position Paper“, INSPIRE (Infrastructure for Spatial Information in Europe), European Commission, Joint Research Centre, 2002 http://inspire.jrc.it/reports/position_papers/inspire_ast_pp_v4_3_en.pdf • Düren U., „XML, GML, NAS“, Landesvermessungsanstalt NRW http://www.landesvermessungsamt.nrw.de/neues/veranstaltungen/seminare/images/Vortraege_LDS_Kurs_40004_06/Dueren_LVermA_NRW/LDS_40004_XML_GML_NAS.pdf • W. Riekert, P. Treffler, 2000, „Georeferenzierung als Mittel zur Erschließung von Fachinformationen in Internet und Intranet“, 14. Int. Symposium Informatik im Umweltschutz http://v.hdm-Stuttgart.de/~riekert/vortraege/00ui.pdf