Finding Spatial Equivalences Across Multiple RDF Datasets
Juan Salas, Andreas Harth




Outline

  • Motivation

  • NeoGeo Vocabularies

  • Geospatial Datasets

  • Integration Challenges

  • Finding Geometric Equivalences

  • Conclusion

Motivation



  • Geodata is becoming increasingly relevant.

    • Location-based services

    • Mobile applications

    • Ever-increasing amount of sensor data (phones, satellites)

  • Different sources.

  • Many formats:

    • GML, KML, Shapefile, GPX, WKT, RDF?…

      Applications require integrated access to geodata.

NeoGeo Vocabularies

  • Geometry Vocabulary–

    • Representation of georeferenced geometric shapes.

  • Spatial Ontology–

    • Representation and reasoning on topological relations based on the Region Connection Calculus (RCC).

Geospatial Datasets


  • GADM-RDF

    • RDF representation of the administrative regions of the GADM project.

  • NUTS-RDF

    • RDF representation of Eurostat's NUTS nomenclature.

      They serve as:

    • New geospatial information on the Semantic Web.

    • Bridges between already published spatial datasets.

    • Proof-of-concept platforms.

Integration Challenges

  • Vocabularies –

    • Survey of several well-known Linked Data datasets (e.g. Ordnance Survey, GeoNames, DBpedia).

    • Identified properties and classes mapped to the NeoGeo vocabularies published at

  • Instances

    • Finding equivalences between regions across multiple datasets at the geometry level.

Finding Geometric Equivalences

Geometric shapes of the same region will not be vertex-by-vertex equivalent across datasets.

A sensible criterion for finding geometric equivalences is needed.

  • NUTS-RDF and GADM-RDF have different:

    • Sampling values

    • Scales

    • Starting points

    • Rounding effects

Algorithm Overview

  • WGS-84, Plate Carrée projection

  • Hausdorff distance




1. Retrieve sample data

  • The algorithm requires:

    • WGS-84 coordinate reference system.

    • Plate Carrée projection:

      X = longitude

      Y = latitude

  • Coordinates are treated as Cartesian.

  • Distorts all parameters (area, shape, distance, direction).

    • Geometric shapes are equally distorted on both datasets.

  • Local reprojections are avoided (e.g. UTM).

  • Units will be presented in decimal degrees.
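
As a minimal sketch of what treating coordinates as Cartesian means, the snippet below maps a WGS-84 position to the plane by using longitude as X and latitude as Y, then measures a plain Euclidean distance in degrees. The function names and the two sample points are hypothetical and do not come from the slides.

```python
import math


def plate_carree(lon_deg, lat_deg):
    """Project a WGS-84 position by using the degrees directly: X = longitude, Y = latitude."""
    return (lon_deg, lat_deg)


def planar_distance(p, q):
    """Euclidean distance between two projected points, expressed in degrees."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


# Hypothetical sample positions (longitude, latitude).
a = plate_carree(10.0, 50.0)
b = plate_carree(13.0, 54.0)

print(planar_distance(a, b))
```

Because both datasets go through the same mapping, the distortion affects both geometries in the same way, which is why the comparison can stay in degrees.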

2. Similarity threshold function

The Hausdorff Distance provides a measure of similarity between geometric shapes.

It can be intuitively defined as the largest distance between the closest points of two geometric shapes.
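
The intuitive definition can be sketched over vertex lists as follows. This is a discrete approximation that only looks at vertices, not at points along the edges, and the shapes and function names are hypothetical.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any vertex of `a` to its closest vertex in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical shapes: a unit square and the same square shifted by 0.5 along X.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
shifted = [(0.5, 0.0), (1.5, 0.0), (1.5, 1.0), (0.5, 1.0)]

print(hausdorff(square, shifted))
```

The directed distance is not symmetric: a single point lying on a larger shape has a directed distance of zero to that shape, while the reverse direction can be large. Taking the maximum of both directions is what makes the measure usable as a similarity criterion.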


Smaller regions need a lower Hausdorff Distance threshold than larger regions.


We calculate the midpoint value between the Hausdorff Distances for a correct guess and the lowest wrong guess.


We perform regression on the midpoint values to obtain the Hausdorff Distance threshold function.
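
The midpoint and regression steps might look like the sketch below. The slides do not state the regression model or the independent variable, so a straight line fitted against region area is an assumption here, and all sample values and names are hypothetical.

```python
def midpoint(correct_dist, lowest_wrong_dist):
    """Midpoint between the distance to the correct match and to the lowest wrong match."""
    return (correct_dist + lowest_wrong_dist) / 2.0


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical samples: (region area, distance to correct match, distance to lowest wrong match).
samples = [
    (1.0, 0.1, 0.5),
    (2.0, 0.2, 0.8),
    (3.0, 0.3, 1.1),
]

areas = [area for area, _, _ in samples]
midpoints = [midpoint(correct, wrong) for _, correct, wrong in samples]
slope, intercept = fit_line(areas, midpoints)


def threshold(area):
    """Assumed linear Hausdorff distance threshold for a region of the given area."""
    return slope * area + intercept


print(threshold(4.0))
```

With a fit of this kind, larger regions receive a larger threshold, in line with the observation that smaller regions need a lower one.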

3. Finding spatial equivalences

Poor Geospatial Information

Sometimes the location of a region is approximated as a single point.

This can lead to false assertions when calculating containment relations.

<> geo:lat 52.516666;

geo:long 13.383333 .

<> rdf:type ngeo:Polygon .

Germany is not contained in Berlin.

Other properties must be considered to calculate containment relations (e.g. rdf:type).

Other spatial relations (e.g. spatial:EQ) cannot be calculated.
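
The problem can be illustrated with a small point-in-polygon test. The subject URIs are missing from the transcript, so reading the point as a country-level resource and the polygon as a city is an assumption based on the "Germany is not contained in Berlin" remark. The rectangle, the type names and the level table are hypothetical stand-ins.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: count edge crossings of a ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Coordinates from the slide (x = longitude, y = latitude).
point_resource = (13.383333, 52.516666)
# Hypothetical rectangle standing in for the polygon resource's outline.
polygon_resource = [(13.0, 52.3), (13.8, 52.3), (13.8, 52.7), (13.0, 52.7)]

# Geometry alone says "contained", even if the point stands for a much larger region.
naive = point_in_polygon(point_resource, polygon_resource)

# Hypothetical administrative levels derived from rdf:type (smaller = coarser).
ADMIN_LEVEL = {"Country": 0, "City": 2}


def contained(point, inner_type, polygon, outer_type):
    """Assert containment only when the inner resource is of a finer type than the outer one."""
    if ADMIN_LEVEL[inner_type] <= ADMIN_LEVEL[outer_type]:
        return False
    return point_in_polygon(point, polygon)


print(naive, contained(point_resource, "Country", polygon_resource, "City"))
```

The type check is only one possible way to use additional properties. It prevents the false assertion here but does not recover the missing geometry.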


The cost of calculating the Hausdorff distance depends on the number of vertices.

The Ramer-Douglas-Peucker algorithm simplifies geometric shapes using a chosen maximum separation (tolerance).
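
A compact sketch of Ramer-Douglas-Peucker simplification is given below. It measures the distance to the segment between the end points (a common variant of the perpendicular distance), and the sample path, tolerance and function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment (e.g. a closed ring): fall back to point distance.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, epsilon):
    """Drop vertices that lie within epsilon of the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= epsilon:
        return [first, last]
    # Keep the farthest vertex and simplify both halves recursively.
    left = simplify(points[: index + 1], epsilon)
    right = simplify(points[index:], epsilon)
    return left[:-1] + right


# Hypothetical path: the vertex at (1.0, 0.05) is within the tolerance and is removed.
path = [(0.0, 0.0), (1.0, 0.05), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
print(simplify(path, 0.1))
```

Fewer vertices mean fewer pairwise distance computations in the Hausdorff step, at the price of a small geometric error bounded by the tolerance.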

Spatial Databases

  • The algorithm also works well with spatial databases (e.g. PostgreSQL / PostGIS):

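    SELECT g.gadm_id, n.nuts_id
    FROM nuts n
    -- && keeps only pairs whose bounding boxes overlap
    INNER JOIN gadm g ON (n.geometry && g.geometry)
    -- areas must agree to within 10 %
    WHERE n.shape_area BETWEEN (g.shape_area * 0.9)
                           AND (g.shape_area * 1.1)
      -- compare simplified shapes against the per-region threshold
      AND ST_HausdorffDistance(
            ST_SimplifyPreserveTopology(n.geometry, 0.5),
            ST_SimplifyPreserveTopology(g.geometry, 0.5)
          ) < g.max_hausdorff_dist;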
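
For readers without a spatial database, the same three filters can be sketched in plain Python: bounding-box overlap, area within 10 %, and a Hausdorff distance below the region's threshold. This is an illustrative approximation, not the authors' implementation. It uses a vertex-based Hausdorff distance, skips the simplification step, and all names and sample rings are hypothetical.

```python
import math


def bbox(ring):
    """Bounding box of a ring as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def bboxes_overlap(a, b):
    ax1, ay1, ax2, ay2 = bbox(a)
    bx1, by1, bx2, by2 = bbox(b)
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2


def area(ring):
    """Planar area of a ring via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def hausdorff(a, b):
    """Vertex-based Hausdorff distance between two rings."""
    def directed(s, t):
        return max(min(math.dist(p, q) for q in t) for p in s)
    return max(directed(a, b), directed(b, a))


def is_equivalent(nuts_ring, gadm_ring, max_hausdorff_dist):
    """Apply the three filters in increasing order of cost."""
    if not bboxes_overlap(nuts_ring, gadm_ring):
        return False
    gadm_area = area(gadm_ring)
    if not (0.9 * gadm_area <= area(nuts_ring) <= 1.1 * gadm_area):
        return False
    return hausdorff(nuts_ring, gadm_ring) < max_hausdorff_dist


# Hypothetical rings: the same region sampled slightly differently, and a smaller one.
gadm = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
nuts = [(0.1, 0.0), (10.0, 0.1), (9.9, 10.0), (0.0, 9.9)]
small = [(0.0, 0.0), (5.0, 0.0), (5.0, 5.0), (0.0, 5.0)]

print(is_equivalent(nuts, gadm, 0.5), is_equivalent(small, gadm, 0.5))
```

Ordering the checks from cheapest to most expensive mirrors the query, where the bounding-box join and the area filter discard most candidate pairs before any Hausdorff distance is computed.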


GADM 2_13988



Leicestershire, Rutland and Northamptonshire

  • Not every NUTS region matches a GADM region.

    • Many NUTS regions represent parts or aggregations of GADM administrative boundaries.

  • 1,671 NUTS regions: 965 matches and 13 false positives.

Conclusion


  • NeoGeo vocabularies:

    • Survey and mappings to other vocabularies.

  • NUTS-RDF and GADM-RDF datasets:

    • GADM-RDF links to DBpedia, UK Ordnance Survey and NUTS-RDF.

    • Linked Data Services for accessing/querying spatial indices (withinRegion, boundingBox).

  • Work on spatial similarity metrics:

    • Promising results

Future Work

  • NeoGeo vocabularies.

    • Temporal context.

  • Datasets:

    • More Earth and space science data.

    • Add more instance mappings.

  • Spatial similarity:

    • Improve precision.

    • Develop tools to support the mapping process.

  • More experiments:

    • Querying of integrated data and reasoning.


European Commission's Seventh Framework Programme FP7/2007-2013 (PlanetData, Grant 257641)