
geoXwalk:- Developing a Gazetteer Service and Server for UK Academia

This project aims to develop a geo-spatial gazetteer service to enhance geographic searching and indexing of resources within the JISC Information Environment for UK academia.

jeanettemay

Presentation Transcript


  1. geoXwalk:- Developing a Gazetteer Service and Server for UK Academia. James S Reid, Project Manager, Geo-data Services, EDINA. 15th EDINBURGH 3-DAY EVENT, 8th-10th MAY 2003

  2. Context - EDINA
     • a JISC National Data Centre, 1995 -
     • hosted by Edinburgh University Data Library, 1984 -
     • mission: to enhance productivity of research, learning and teaching in UK higher & further education
     • major provider within the JISC Information Environment
     • range of bibliographic resources
     • multimedia and image services
     • key geo-spatial data and geo-referenced information
     • UKBORDERS (1994 - ) boundary outlines & geo-reference database
     • Digimap (2000 - ) online source of Ordnance Survey mapping
     • development projects - geoXwalk, Go-Geo!, e-MapScholar, Pathfinder...
     • strategic move toward interoperability & shared services role
     • adoption of appropriate standards (OGC, ISO)

  3. Context - The JISC Information Environment is…
     • variously stated as …
       • a national digital library... for UK higher and further education
       • a managed collection of quality assured resources
       • a distributed resource supporting learning and research in the UK
     • definitely heterogeneous
       • ‘words, numbers, pictures, sound’: including geo-spatial data
     • for use by researchers, students, teachers & support staff
     • based on an underlying functional model
       • simplified to: search -> obtain -> use -> publish
       • {discover/locate} {request/access} {view/copy/amend/combine} {publish}
     • now to have location-based searching
       • requiring geo-referencing of information objects

  4. Definitions
     • Gazetteer - A list of geographic features together with their associated spatial location
     • Digital Gazetteer - An electronic list of geographic features together with their associated spatial location. An authority database of places (and features?). An ‘Active Gazetteer’.
     • Digital Gazetteer Service - A network-addressable middleware server supporting geographic referencing and searching. A shared ‘terminology’ service.

  5. The problem
     • How to search ‘geographically’, given that e.g. a postcode, a placename and an administrative area are all valid geographies, and yet no information system can know about all the possible variations of what constitutes a ‘geography’?
     • Problem compounded by inconsistency of use even in the ‘standards’, e.g. placenames evolve and have alternative names
     • Long history in UK of boundary changes and changes in the geographies used to record things, e.g. electoral ward boundary changes …

  6. There is underlying complexity, such as Multiple Geographies

  7. The vision
     • Make variations in the definition of ‘geography’ transparent
     • Provide a means to ‘crosswalk’ geographies, i.e. translate one geography into another - hence the name
     • ‘Geographic agnosticism’
     How?
     • A digital gazetteer that stores the different geographies and can implicitly resolve the relationships between them
     • Provision as a service that serves other services
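The ‘crosswalk’ idea above can be sketched in a few lines. This is a minimal illustration, not the geoXwalk implementation: it assumes every geography is stored as a bounding box and that relationships are resolved implicitly by testing whether footprints overlap. The names `Box`, `GEOGRAPHIES` and `crosswalk`, and all coordinates and place names, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box used as a simplified footprint."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def overlaps(self, other: "Box") -> bool:
        # Two boxes overlap when they intersect on both axes.
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)


# Hypothetical footprints, grouped by the geography they belong to.
GEOGRAPHIES = {
    "postcode": {"AB1 2CD": Box(10, 10, 12, 12)},
    "placename": {"Exampleton": Box(8, 8, 20, 20),
                  "Farville": Box(50, 50, 60, 60)},
    "admin_area": {"Example District": Box(0, 0, 30, 30)},
}


def crosswalk(source_geography, name, target_geography):
    """Translate a feature in one geography into overlapping features of another."""
    footprint = GEOGRAPHIES[source_geography][name]
    targets = GEOGRAPHIES[target_geography]
    return sorted(n for n, box in targets.items() if footprint.overlaps(box))


print(crosswalk("postcode", "AB1 2CD", "placename"))
print(crosswalk("postcode", "AB1 2CD", "admin_area"))
```

No relationship table is stored here: the link between a postcode and an administrative area falls out of the geometry, which is the sense in which relationships are resolved implicitly.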

  8. Results of scoping study (Phase I)
     • Great deal of interest both inside and outside academia in the concept of a digital gazetteer
     • Such a gazetteer would act as an important reference source
     • The gazetteer could also support machine to machine (m2m) interactions based on open protocols, making it capable of becoming a ‘shared service’
     • A suitably extensible model for the gazetteer was identified in the Alexandria Digital Library (ADL) model
     • A prototype demonstrator gazetteer should be developed based on the ADL model

  9. Phase II - Project Aims
     • To develop a ‘proof-of-concept’ geo-spatial gazetteer service suitable for extension to full service and illustrating:
       • The use of a gazetteer to enhance the geographic searching of one or more existing JISC services
       • The use of a gazetteer to assist in the semi-automatic geographic indexing of descriptions of JISC resources
       • Reference use through the provision of a command driven web-based interface, to show the types of queries that could be answered by a well-populated service
     • To consider how the gazetteer data could be made available as a shared service as part of the JISC Information Environment
     • Promote the possibilities of a fully functioning service

  10. geoXwalk - High Level Architecture
      • Web client (human interaction) -> request via protocol (ADL, OGC, Z39.50) -> the geoXwalk Server
      • Information server (machine2machine interaction) -> request via protocol (ADL, OGC, Z39.50) -> the geoXwalk Server
      • The geoXwalk Server (spatially enabled RDBMS)

  11. ADL Gazetteer Content Standard (http://www.alexandria.ucsb.edu/gazetteer)
      Elements:
      • Geographic Feature ID
      • Geographic Name
      • Variant Geographic Name (R)
      • Type of Geographic Feature (R)
      • Other Classification Terms (R)
      • Geographic Feature Code (R)
      • Spatial Location (R)
      • Street Address
      • Related Feature (R)
      • Description
      • Geographic Feature Data (R)
      • Link to Related Source of Information (R)
      • Supplemental Note
      • Metadata Information
      Notes:
      • comprehensive description but with small set of core elements
      • temporal aspects of names, footprints, relationships, …
      • document source, spatial accuracy/scale of footprint
      • does permit explicit relationship types!
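To make the element list more concrete, here is a rough sketch of how a single gazetteer entry might be held in memory. It is not the ADL schema itself: the field names are invented for illustration, only a subset of the elements is modelled, and treating the elements marked (R) on the slide as list-valued is an assumption. The footprint coordinates are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# (min_x, min_y, max_x, max_y); a bounding box is only one possible footprint form.
BoundingBox = Tuple[float, float, float, float]


@dataclass
class GazetteerEntry:
    """Illustrative record loosely following the element list on the slide."""
    feature_id: str
    name: str
    variant_names: List[str] = field(default_factory=list)              # (R)
    feature_types: List[str] = field(default_factory=list)              # (R)
    classification_terms: List[str] = field(default_factory=list)       # (R)
    feature_codes: List[str] = field(default_factory=list)              # (R)
    spatial_locations: List[BoundingBox] = field(default_factory=list)  # (R)
    street_address: Optional[str] = None
    related_features: List[str] = field(default_factory=list)           # (R)
    description: Optional[str] = None
    source_links: List[str] = field(default_factory=list)               # (R)
    supplemental_note: Optional[str] = None

    def all_names(self) -> List[str]:
        """Primary name followed by any variant names."""
        return [self.name] + self.variant_names


york = GazetteerEntry(
    feature_id="demo:york",                      # hypothetical identifier
    name="York",
    variant_names=["Eboracum", "Jorvik"],
    feature_types=["populated places"],          # illustrative type term
    spatial_locations=[(0.0, 0.0, 10.0, 10.0)],  # hypothetical footprint
)
print(york.all_names())
```

Keeping variant names on the same record as the primary name is what lets a question such as "By what alternative names has York been known?" be answered from one entry.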

  12. Gazetteer Database
      • Built on ADL Content Standard
      • Currently seeded with:
        • OS 1:50,000 digital Gazetteer
        • digital boundary data from UKBORDERS
        • data sourced from other OS products - Strategi, Meridian, 1:250,000 gazetteer
        • starting to add 3rd party data including Getty
      • Accuracy enhanced and metadata support
      • Current coverage:
        • Geographical - GB
        • Thematic - see below

  13. geoXwalk gazetteer - current thematic content (based on adapted ADL Feature Type Thesaurus)

  14. Protocols
      • ADL Query Protocol
        • lightweight, generic, relatively simple to implement
      • OGC Filter Encoding Specification
        • fuller, highly flexible, more complex
      • Z39.50
        • pervasive in JISC IE, not specifically for geo-spatial data, lack of support

  15. 5 types of ADL query:
      • identifier-query identifier
        Returns the gazetteer entry identified by identifier. Supported by geoXwalk.
      • name-query operator text
        Returns gazetteer entries which match text under text-operator operator. geoXwalk supports the mandatory equals operator and the optional match-pattern operator.
      • footprint-query operator (polygon | box | identifier)
        Returns all gazetteer entries having a footprint that matches a query region according to spatial operator operator. geoXwalk supports the spatial operators within, contains and overlaps. Spatial extents can be specified by bounding box or identifier.

  16. ADL queries (contd)
      • class-query thesaurus term
        Returns all gazetteer entries which belong to the class (feature type). geoXwalk supports class queries (but currently does not return sub-classes by default as the ADL does).
      • relationship-query relationship identifier
        Returns all gazetteer entries having relationship relationship to a target gazetteer entry identified by identifier. geoXwalk does not support queries of this type because we do not hold explicit relationships between entities - they are derived implicitly from the geometries.
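The four query types that geoXwalk is said to support can be mimicked over a small in-memory list. The sketch below is not the ADL protocol or its wire format: the function names are invented, footprints are reduced to bounding boxes, match-pattern is assumed to behave like shell-style wildcards, name matching is assumed to be case-insensitive, and all coordinates are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical entries; boxes are (min_x, min_y, max_x, max_y).
ENTRIES = [
    {"id": "demo:1", "name": "Falkirk", "class": "populated places", "box": (10, 10, 20, 20)},
    {"id": "demo:2", "name": "Selkirk", "class": "populated places", "box": (30, 5, 40, 15)},
    {"id": "demo:3", "name": "Tay", "class": "rivers", "box": (0, 0, 100, 100)},
]


def identifier_query(identifier):
    return [e for e in ENTRIES if e["id"] == identifier]


def name_query(operator, text):
    if operator == "equals":
        return [e for e in ENTRIES if e["name"].lower() == text.lower()]
    if operator == "match-pattern":  # assumption: shell-style wildcards
        return [e for e in ENTRIES if fnmatchcase(e["name"].lower(), text.lower())]
    raise ValueError(f"unsupported text operator: {operator}")


def _within(inner, outer):
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def _overlaps(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def footprint_query(operator, region):
    # Assumption: the operator describes the entry footprint relative to the region.
    tests = {
        "within": lambda box: _within(box, region),
        "contains": lambda box: _within(region, box),
        "overlaps": lambda box: _overlaps(box, region),
    }
    if operator not in tests:
        raise ValueError(f"unsupported spatial operator: {operator}")
    return [e for e in ENTRIES if tests[operator](e["box"])]


def class_query(term):
    # Exact class match only; sub-classes are not expanded.
    return [e for e in ENTRIES if e["class"] == term]


print([e["name"] for e in name_query("match-pattern", "*kirk")])
print([e["id"] for e in footprint_query("within", (0, 0, 25, 25))])
```

A relationship-query is deliberately absent, mirroring the slide: with no explicit relationships stored, a question such as "what lies inside this feature" would be answered with a footprint-query instead.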

  17. geoXwalk use cases
      Diagram: two information servers and the geoXwalk Server, with the use cases labelled:
      • Geo-parsing & indexing
      • Searching (1)
      • Searching (2)
      • Reference use
      Example queries:
      • Where is Aberdour?
      • On what river is Dundee situated?
      • By what alternative names has York been known?
      • List me all places ending with ‘kirk’

  18. Contact details
      • James.Reid@ed.ac.uk
      • For EDINA services contact: http://edina.ac.uk, EDINA, Data Library, University of Edinburgh, edina@ed.ac.uk or telephone +44 (0)131 650 3302
      • For information on geoXwalk project: www.geoXwalk.ac.uk

  19. Task: Find resource about ‘Liverpool docks’
      • Search using a ‘traditional’ gazetteer might yield: 5
      • Co-ordinates allow (near) co-located places to be co-identified. Using spatial proximity in an active gazetteer, the search can be widened:

        Place          County/UA
        Liverpool      Liverpool
        Bebbington     Wirral
        Birkenhead     Wirral
        Bootle         Sefton
        New Brighton   Wirral
        Seacombe       Wirral
        Seaforth       Wirral
        Waterloo       Sefton

      • … that means more & better hits: 15
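The widening step on this slide can be illustrated with a simple distance test. This is only a sketch of the idea: places are reduced to single points, the coordinates and the radius are hypothetical values in arbitrary units, and `nearby_places` and `widen_search_terms` are invented names rather than geoXwalk functions.

```python
import math

# Hypothetical point locations in arbitrary units (not real coordinates).
PLACES = {
    "Liverpool": (0, 0),
    "Bootle": (1, 4),
    "Birkenhead": (-3, -1),
    "Manchester": (50, 2),
}


def nearby_places(name, radius):
    """Return every known place within `radius` of the named place, itself included."""
    x0, y0 = PLACES[name]
    return sorted(
        place for place, (x, y) in PLACES.items()
        if math.hypot(x - x0, y - y0) <= radius
    )


def widen_search_terms(place, topic, radius):
    """Expand one 'place + topic' search into one search per nearby place."""
    return [f"{p} {topic}" for p in nearby_places(place, radius)]


print(widen_search_terms("Liverpool", "docks", radius=5))
```

A name-only lookup would search for "Liverpool docks" alone; expanding by proximity also picks up resources catalogued under neighbouring place names, which is where the extra hits on the slide come from.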

  20. geoXwalk use case: simple cross searching
      • ‘Find resources for this postcode’ (NB postcode often used to geo-reference survey data files)
      • Post code: L34 0HS?
      • Diagram components: Portal service, geoXwalk Server, Content Provider A, Content Provider B, Content Provider C
      • Geographies shown: coordinate footprints, place names, parish names
      • Example values shown: 340900,392300 - 347217,397660; Knowsley; BX003
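A portal-style cross search of this kind might look roughly as follows. Everything here is a stand-in: `resolve_postcode` imitates a call to a gazetteer service with a hard-coded lookup, the provider records are invented, and pairing the bounding box shown on the slide with the postcode L34 0HS is an assumption made for the example.

```python
# Bounding box copied from the slide: (min_x, min_y, max_x, max_y).
# Assumption: it is the footprint that would be returned for the postcode.
SLIDE_BOX = (340900, 392300, 347217, 397660)


def resolve_postcode(postcode):
    """Stand-in for a gazetteer lookup; returns a footprint or None."""
    lookup = {"L34 0HS": SLIDE_BOX}
    return lookup.get(postcode)


# Hypothetical geo-referenced records held by each content provider.
PROVIDERS = {
    "Content Provider A": [
        {"title": "Survey file 1", "point": (343000, 395000)},
        {"title": "Survey file 2", "point": (400000, 300000)},
    ],
    "Content Provider B": [
        {"title": "Map sheet", "point": (346000, 393000)},
    ],
    "Content Provider C": [],
}


def cross_search(postcode):
    """Resolve the postcode once, then filter every provider by the footprint."""
    box = resolve_postcode(postcode)
    if box is None:
        return {}
    min_x, min_y, max_x, max_y = box
    hits = {}
    for provider, records in PROVIDERS.items():
        matched = [
            r["title"] for r in records
            if min_x <= r["point"][0] <= max_x and min_y <= r["point"][1] <= max_y
        ]
        if matched:
            hits[provider] = matched
    return hits


print(cross_search("L34 0HS"))
```

The design point is that the content providers never need to understand postcodes: the portal translates the postcode into coordinates once and each provider is searched with whatever geography it already holds.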

  21. geoXwalk use case: (semi) automatic indexing. Need screen shot of parser here
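In place of the missing screenshot, the following sketch shows one very simple way a resource description could be geo-parsed against a gazetteer. It is not the project's parser: it assumes plain case-sensitive, whole-word matching of known names, and the gazetteer contents and footprints are hypothetical.

```python
import re

# Hypothetical gazetteer: name -> bounding box (min_x, min_y, max_x, max_y).
GAZETTEER = {
    "York": (0, 0, 10, 10),
    "New Brighton": (20, 20, 25, 25),
    "Dundee": (40, 40, 50, 50),
}


def geoparse(text):
    """Return gazetteer names found in the text, with footprints, in order of first mention."""
    # Longest names first so multi-word names win over any shorter name they contain.
    names = sorted(GAZETTEER, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    found = []
    for match in pattern.finditer(text):
        name = match.group(1)
        if name not in found:
            found.append(name)
    return [{"name": n, "footprint": GAZETTEER[n]} for n in found]


description = ("Excavation reports from York and photographs of "
               "New Brighton pier; further York material to follow.")
for candidate in geoparse(description):
    print(candidate)
```

Matching like this produces candidates, not answers: "York" would also be picked out of "New York", which is one reason the indexing is described as semi-automatic and left open to human review.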

  22. Objectives (1)
      • Elicit the detailed requirements for a gazetteer service
      • Involve organisations outside UK academia in the development of a gazetteer service demonstrator
      • Build a demonstrator focussing on near-contemporary data which should illustrate the following:
        • The use of a gazetteer to enhance the geographic searching of one or more existing JISC services
        • The use of a gazetteer to assist in the semi-automatic geographic indexing of descriptions of JISC resources
        • Reference use through the provision of a command driven web-based interface, to show the types of queries that could be answered by a well-populated service

  23. Objectives (2)
      • Investigate:
        • The issues involved in making the gazetteer a Z39.50 target
        • SOAP (web services) as an access mechanism
        • The utility and usability of the ADL Gazetteer Content Standard
        • Questions about performance and scalability of the service
        • The level of interest and commitment of interested parties outside tertiary education
        • The costs involved in populating the gazetteer, linking the data and quality assuring the data
      • Negotiate with data owners to use the key core datasets required to populate the gazetteer
      • Suggest ways in which data can be kept up to date, and what kind of quality assurance on data input will be required
      • Carry out focus groups to assess the needs of the stakeholders for a full gazetteer service and promote the possibilities of a fully functioning service

  24. Deliverables
      • A functioning scalable demonstrator gazetteer service that has the potential to be fully integrated into the JISC Information Environment
      • A report on who the relevant stakeholders are and how the needs of the user group will be met
      • (An exit strategy - Phase III)

  25. Query: Archaeological sites within the city of York?
