Finding Spatial Equivalences Across Multiple RDF Datasets

Juan Salas, Andreas Harth


Outline

Motivation

NeoGeo Vocabularies

Geospatial Datasets

Integration Challenges

Finding Geometric Equivalences

Conclusion


Motivation

  • Geodata is becoming increasingly relevant.

    • Location-based services

    • Mobile applications

    • Ever-increasing amount of sensor data (phones, satellites)

  • Different sources.

  • Many formats:

    • GML, KML, Shapefile, GPX, WKT, RDF?…

      Applications require integrated access to geodata.


NeoGeo Vocabularies

  • Geometry Vocabulary– http://geovocab.org/geometry

    • Representation of georeferenced geometric shapes.

  • Spatial Ontology– http://geovocab.org/spatial

    • Representation and reasoning on topological relations based on the Region Connection Calculus (RCC).


Geospatial Datasets

  • GADM-RDF– http://gadm.geovocab.org

    • RDF representation of the administrative regions of the GADM project: http://gadm.org

  • NUTS-RDF– http://nuts.geovocab.org

    • RDF representation of Eurostat's NUTS nomenclature.

      They serve as:

    • New geospatial information on the Semantic Web.

    • Bridges between already published spatial datasets.

    • Proof-of-concept platforms.


Integration Challenges

  • Vocabularies – http://geovocab.org/doc/survey.html

    • Survey of several well-known Linked Data datasets (Ordnance Survey, GeoLinkedData.es, LinkedGeoData.org, GeoNames, DBpedia).

    • Identified properties and classes mapped to the NeoGeo vocabularies published at GeoVocab.org

  • Instances

    • Finding equivalences between regions across multiple datasets at the geometry level.



Finding Geometric Equivalences

Geometric shapes will not be vertex by vertex equivalent.

A sensible criterion for finding geometric equivalences is needed.

  • NUTS-RDF and GADM-RDF have different:

    • Sampling values

    • Scales

    • Starting points

    • Rounding effects


Algorithm Overview

[Figure: algorithm pipeline. Stage labels: sample data (WGS-84, Plate Carrée projection); Hausdorff distance; spatial:EQ]


1. Retrieve sample data

  • The algorithm requires:

    • WGS-84 coordinate reference system.

    • Plate Carrée projection:

      X = longitude

      Y = latitude

  • Coordinates are treated as Cartesian.

  • The projection distorts all parameters (area, shape, distance, direction).

    • Geometric shapes are equally distorted in both datasets.

  • Local reprojections are avoided (e.g. UTM).

  • Units will be presented in decimal degrees.
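
A minimal sketch of what treating coordinates as Cartesian means in practice: longitude becomes X, latitude becomes Y, and distances are plain Euclidean distances measured in degrees. The function names and the two sample points are hypothetical and chosen only for illustration.

```python
import math

def plate_carree(lon_deg, lat_deg):
    """Plate Carrée mapping as stated on the slide: X = longitude, Y = latitude."""
    return (lon_deg, lat_deg)

def planar_distance(p, q):
    """Euclidean distance between two projected points, in degrees."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Hypothetical sample points (longitude, latitude), not taken from either dataset.
a = plate_carree(13.0, 52.0)
b = plate_carree(16.0, 48.0)
d = planar_distance(a, b)  # 3 degrees east-west, 4 degrees north-south
```

The result is a distance in degrees, not metres, which is acceptable here only because both datasets are distorted in the same way.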


2. Similarity threshold function

The Hausdorff Distance provides a measure of similarity between geometric shapes.

It can be intuitively defined as the largest distance between the closest points of two geometric shapes.
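
The definition above can be sketched over vertex lists as follows. This is a discrete approximation that only looks at vertices, not at points along edges, and it is not necessarily the implementation used by the authors. The two sample shapes are hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any vertex of a to its closest vertex in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical shapes: a unit square and a copy with one corner moved up by 0.5.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
moved = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.5)]
h = hausdorff(square, moved)
```

The brute-force double loop is quadratic in the number of vertices, which is why the deck later simplifies shapes before comparing them.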



Smaller regions need a lower Hausdorff Distance threshold than larger regions.



We calculate the midpoint value between the Hausdorff Distances for a correct guess and the lowest wrong guess.
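
As a small worked sketch of this step: for one sampled region, take the Hausdorff distance of the correct match and the smallest distance among the wrong candidates, and use the value halfway between them. The function name and the numbers are hypothetical.

```python
def midpoint_threshold(correct_distance, wrong_distances):
    """Midpoint between the correct match and the lowest wrong guess."""
    lowest_wrong = min(wrong_distances)
    return (correct_distance + lowest_wrong) / 2.0

# Hypothetical distances in degrees for a single sampled region.
m = midpoint_threshold(0.25, [1.0, 0.75, 2.5])
```

Any threshold between the two values would separate this sample correctly; the midpoint leaves the same margin on both sides.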



We perform regression on the midpoint values to obtain the Hausdorff Distance threshold function.
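
The slides do not state the regression model or the predictor. The sketch below assumes an ordinary least-squares straight line with region area as the predictor, which is only one plausible reading. The function names and the sample values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical samples: region areas and the midpoint values found for them.
areas = [1.0, 2.0, 3.0, 4.0]
midpoints = [0.5, 1.0, 1.5, 2.0]
slope, intercept = fit_line(areas, midpoints)

def threshold(area):
    """Assumed threshold function: the fitted line evaluated at a region's area."""
    return slope * area + intercept
```

With this fit, larger regions receive a larger threshold, in line with the earlier observation that smaller regions need a lower one.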


3. Finding spatial equivalences


Poor Geospatial Information

Sometimes location is approximated as a single point.

This can lead to false assertions when calculating containment relations.

<http://dbpedia.org/resource/Germany> geo:lat 52.516666;

geo:long 13.383333 .

<http://nuts.geovocab.org/id/DE30_geometry> rdf:type ngeo:Polygon .

Germany is not contained in Berlin.

Other properties must be considered to calculate containment relations (e.g. rdf:type).

Other spatial relations (e.g. spatial:EQ) cannot be calculated.
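
The false assertion can be reproduced with a simple point-in-polygon test. In this sketch the Germany point comes from the DBpedia triple above, while the Berlin polygon is a rough hypothetical bounding box, not the actual NUTS DE30 geometry.

```python
def point_in_polygon(point, polygon):
    """Ray casting: toggle on each edge crossed by a ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# (long, lat) of the single point that stands in for Germany.
germany_point = (13.383333, 52.516666)
# Hypothetical rough box around Berlin, for illustration only.
berlin_box = [(13.0, 52.3), (13.8, 52.3), (13.8, 52.7), (13.0, 52.7)]

contained = point_in_polygon(germany_point, berlin_box)
```

Here `contained` is true, so a naive containment check would assert that Germany lies within Berlin, which is why additional properties such as rdf:type have to be taken into account.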


Optimizations

The cost of calculating the Hausdorff distance depends on the number of vertices.

The Ramer-Douglas-Peucker algorithm simplifies geometric shapes by dropping vertices that lie within a chosen maximum separation (tolerance) of the simplified outline.
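
A compact recursive sketch of Ramer-Douglas-Peucker on an open polyline is shown below. The function names, the sample polyline and the tolerance are hypothetical, and a production system would more likely rely on a library routine.

```python
import math

def perpendicular_distance(p, a, b):
    """Distance from p to the line through a and b (or to a if a == b)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dx * (a[1] - p[1]) - (a[0] - p[0]) * dy) / length

def rdp(points, epsilon):
    """Keep the endpoints, then recurse on the farthest vertex if it exceeds epsilon."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax > epsilon:
        left = rdp(points[: index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right  # drop the shared vertex once
    return [first, last]

# Hypothetical polyline: the small bump at (1.0, 0.1) falls below the tolerance.
line = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
simplified = rdp(line, 0.5)
```

A larger tolerance removes more vertices and speeds up the Hausdorff computation, at the cost of a coarser comparison.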



Spatial Databases

  • The algorithm also works well with spatial databases (e.g. PostgreSQL / PostGIS):

    -- Candidate pairs: bounding boxes overlap and areas agree within 10%.
    SELECT g.gadm_id, n.nuts_id
    FROM nuts n
    INNER JOIN gadm g ON (n.geometry && g.geometry)
    WHERE n.shape_area BETWEEN (g.shape_area * 0.9) AND (g.shape_area * 1.1)
      -- Compare the simplified shapes against the per-region threshold.
      AND ST_HausdorffDistance(
            ST_SimplifyPreserveTopology(n.geometry, 0.5),
            ST_SimplifyPreserveTopology(g.geometry, 0.5)
          ) < g.max_hausdorff_dist;


Evaluation

[Figure: example of a non-matching pair]

  • GADM 2_13988: Leicestershire

  • NUTS UKF2: Leicestershire, Rutland and Northamptonshire

  • Not every NUTS region matches a GADM region.

    • Many NUTS regions represent parts or aggregations of GADM administrative boundaries.

  • 1,671 NUTS regions => 965 matches & 13 false positives.



Conclusion

  • NeoGeo vocabularies:

    • Survey and mappings to other vocabularies.

  • NUTS-RDF and GADM-RDF datasets:

    • GADM-RDF links to DBpedia, UK Ordnance Survey and NUTS-RDF.

    • Linked Data Services for accessing/querying spatial indices (withinRegion, boundingBox).

  • Work on spatial similarity metrics:

    • Promising results


Future Work

  • NeoGeo vocabularies.

    • Temporal context.

  • Datasets:

    • More Earth and space science data.

    • Add more instance mappings.

  • Spatial similarity:

    • Improve precision.

    • Develop tools to support the mapping process.

  • More experiments:

    • Querying of integrated data and reasoning.


Acknowledgements

European Commission's Seventh Framework Programme FP7/2007-2013 (PlanetData, Grant 257641)

