Finding Spatial Equivalences Across Multiple RDF Datasets

Juan Salas, Andreas Harth

Outline
  • Motivation
  • NeoGeo Vocabularies
  • Geospatial Datasets
  • Integration Challenges
  • Finding Geometric Equivalences
  • Conclusion



Motivation


  • Geodata is becoming increasingly relevant.
    • Location-based services
    • Mobile applications
    • Ever-increasing amount of sensor data (phones, satellites)
  • Different sources.
  • Many formats:
    • GML, KML, Shapefile, GPX, WKT, RDF?…

Applications require integrated access to geodata.

NeoGeo Vocabularies
  • Geometry Vocabulary
    • Representation of georeferenced geometric shapes.
  • Spatial Ontology
    • Representation of, and reasoning on, topological relations based on the Region Connection Calculus (RCC).
Geospatial Datasets
  • GADM-RDF: RDF representation of the administrative regions of the GADM project.
  • NUTS-RDF: RDF representation of Eurostat's NUTS nomenclature.

They serve as:

  • New geospatial information on the Semantic Web.
  • Bridges between already published spatial datasets.
  • Proof-of-concept platforms.
Integration Challenges
  • Vocabularies
    • Survey of several well-known Linked Data datasets (Ordnance Survey,,, GeoNames, DBpedia).
    • Identified properties and classes mapped to the NeoGeo vocabularies published at
  • Instances
    • Finding equivalences between regions across multiple datasets at the geometry level.
Finding Geometric Equivalences

Geometric shapes will not be vertex-by-vertex equivalent.

A sensible criterion for finding geometric equivalences is needed.

  • NUTS-RDF and GADM-RDF have different:
    • Sampling values
    • Scales
    • Starting points
    • Rounding effects
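
To illustrate why a vertex-by-vertex comparison fails, here is a minimal Python sketch. The two rings are hypothetical data describing the same square with a different starting point, an extra sampled vertex, and slightly different rounding; `nearest_vertex_gap` is an illustrative helper, not part of the presented algorithm.

```python
# Two hypothetical rings describing the same unit square (illustrative data only).
ring_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# Same square, but starting at another corner, with an extra sampled vertex
# and slightly different rounding.
ring_b = [(1.0, 0.0), (1.0, 0.5), (1.0001, 1.0), (0.0, 0.9999), (0.0, 0.0)]


def nearest_vertex_gap(src, dst):
    """Largest distance from a vertex of src to its closest vertex in dst."""
    return max(
        min(((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5 for x2, y2 in dst)
        for x1, y1 in src
    )


vertex_equal = ring_a == ring_b           # naive vertex-by-vertex comparison
gap = nearest_vertex_gap(ring_a, ring_b)  # how far apart the outlines really are
print(vertex_equal, gap)
```

The naive comparison reports the rings as different, although every vertex of the first ring lies very close to a vertex of the second.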
Algorithm Overview
  • WGS-84, Plate Carrée projection
  • Hausdorff distance




1. Retrieve sample data
  • The algorithm requires:
    • WGS-84 coordinate reference system.
    • Plate Carrée projection: X = longitude, Y = latitude.
  • Coordinates are treated as Cartesian.
  • The projection distorts area, shape, distance, and direction.
    • Geometric shapes are equally distorted in both datasets.
  • Local reprojections (e.g. UTM) are avoided.
  • Units are given in decimal degrees.
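
As a small sketch of what treating coordinates as Cartesian means, the Python below maps longitude and latitude straight to X and Y and measures a planar distance in degrees. The function names and the second point are hypothetical, chosen only for illustration.

```python
import math


def plate_carree(lon_deg, lat_deg):
    """Plate Carrée as described above: X = longitude, Y = latitude."""
    return (lon_deg, lat_deg)


def planar_distance(p, q):
    """Euclidean distance between two projected points, in degrees."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


a = plate_carree(13.383333, 52.516666)  # coordinates quoted later in the slides
b = plate_carree(14.383333, 52.516666)  # hypothetical point one degree to the east
d = planar_distance(a, b)               # about 1.0 degree
print(d)
```

The result is a distance in degrees, not metres. One degree of longitude at this latitude covers less ground than at the equator, which is the distortion the slide refers to. It affects both datasets in the same way.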
2. Similarity threshold function

The Hausdorff distance provides a measure of similarity between geometric shapes.

It can be intuitively defined as the largest of the distances from each point of one shape to the closest point of the other shape.
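
The definition can be sketched in a few lines of Python. This is a simplified version that only looks at the vertices of each shape (a discrete approximation), not at every point on the boundary; the function names and sample shapes are illustrative assumptions.

```python
import math


def directed_hausdorff(src, dst):
    """Largest distance from a point of src to its closest point in dst."""
    return max(min(math.dist(p, q) for q in dst) for p in src)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two vertex lists."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical shapes: a unit square and a copy shifted 0.25 to the east.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
shifted = [(x + 0.25, y) for x, y in square]
h = hausdorff(square, shifted)
print(h)
```

Taking the maximum of both directions matters: a small shape lying inside a large one has a small distance in one direction and a large one in the other.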


Smaller regions need a lower Hausdorff Distance threshold than larger regions.


We calculate the midpoint between the Hausdorff distance of the correct guess and the Hausdorff distance of the lowest wrong guess.
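
As a worked illustration of the midpoint step, the sketch below uses made-up Hausdorff distances for one sample region. The function name and the numbers are hypothetical.

```python
def threshold_midpoint(correct_distance, wrong_distances):
    """Midpoint between the correct match and the closest wrong candidate."""
    lowest_wrong = min(wrong_distances)
    return (correct_distance + lowest_wrong) / 2.0


# Hypothetical Hausdorff distances (in degrees) for one sample region.
m = threshold_midpoint(0.04, [0.30, 0.55, 0.90])
print(m)  # roughly 0.17
```

A threshold at this midpoint accepts the correct candidate and rejects the nearest wrong one with equal margin on both sides.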


We perform regression on the midpoint values to obtain the Hausdorff Distance threshold function.
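
The slides do not state which regression model or which size measure was used. The sketch below assumes an ordinary least-squares line fitted against region area, purely for illustration; the sample values, `fit_line`, and the use of area as the predictor are assumptions, and the name `max_hausdorff_dist` is borrowed from the column in the SQL query later in the slides.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical samples: region area (square degrees) and midpoint (degrees).
areas = [0.5, 1.0, 2.0, 4.0]
midpoints = [0.10, 0.15, 0.25, 0.45]
slope, intercept = fit_line(areas, midpoints)


def max_hausdorff_dist(area):
    """Threshold predicted for a region of the given area (assumed linear)."""
    return slope * area + intercept


print(max_hausdorff_dist(0.5), max_hausdorff_dist(3.0))
```

With these sample values the fitted function gives smaller regions a lower threshold than larger ones, in line with the observation on the earlier slide.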

Poor Geospatial Information

Sometimes a location is approximated as a single point.

This can lead to false assertions when calculating containment relations.

<> geo:lat 52.516666 ;
   geo:long 13.383333 .

<> rdf:type ngeo:Polygon .

Germany is not contained in Berlin.

Other properties must be considered to calculate containment relations (e.g. rdf:type).

Other spatial relations (e.g. spatial:EQ) cannot be calculated.
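
A minimal sketch of the problem and one possible guard follows. It assumes that the point stands for a country-level resource and the polygon for a city, as the remark about Germany and Berlin suggests. The rectangle, the class names `ex:City` and `ex:Country`, and the `RANK` table are hypothetical stand-ins for whatever type information the datasets actually carry.

```python
def point_in_polygon(point, ring):
    """Ray-casting test: True if point lies inside the closed ring."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical data: a rough rectangle standing in for a city polygon, and a
# country-level resource that is only described by a single point.
city = {
    "rdf_type": "ex:City",  # hypothetical class name
    "ring": [(13.1, 52.3), (13.8, 52.3), (13.8, 52.7), (13.1, 52.7)],
}
country = {
    "rdf_type": "ex:Country",  # hypothetical class name
    "point": (13.383333, 52.516666),  # (long, lat) from the slide
}

# Hypothetical ranking: a higher rank can never be contained in a lower one.
RANK = {"ex:City": 1, "ex:Country": 2}


def naive_contained(inner, outer):
    """Geometry-only test, which trusts the point approximation."""
    return point_in_polygon(inner["point"], outer["ring"])


def guarded_contained(inner, outer):
    """Consult the type information before trusting the geometry."""
    if RANK[inner["rdf_type"]] > RANK[outer["rdf_type"]]:
        return False
    return naive_contained(inner, outer)


print(naive_contained(country, city), guarded_contained(country, city))
```

The geometry-only test wrongly reports the country as contained in the city, while the type-aware version rejects the assertion.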


The cost of calculating the Hausdorff distance depends on the number of vertices.

The Ramer-Douglas-Peucker algorithm simplifies geometric shapes, using an arbitrary maximum separation.
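
For reference, a compact Python sketch of Ramer-Douglas-Peucker on an open polyline is given below. It uses the distance to the segment between the end points, a common variant; the sample line and the tolerance `epsilon` (the maximum separation) are hypothetical values.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def rdp(points, epsilon):
    """Ramer-Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        # Every intermediate vertex is within tolerance: keep the end points.
        return [first, last]
    # Otherwise keep the farthest vertex and simplify both halves.
    left = rdp(points[: index + 1], epsilon)
    right = rdp(points[index:], epsilon)
    return left[:-1] + right


line = [(0.0, 0.0), (1.0, 0.1), (2.0, -0.1), (3.0, 5.0), (4.0, 6.0), (5.0, 7.0)]
simplified = rdp(line, 0.5)
print(simplified)
```

A larger `epsilon` removes more vertices and makes the Hausdorff computation cheaper, at the price of a coarser outline.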

Spatial Databases
  • The algorithm also works well with spatial databases (e.g. PostgreSQL / PostGIS):

SELECT g.gadm_id, n.nuts_id
FROM nuts n
-- && keeps only pairs whose bounding boxes overlap
INNER JOIN gadm g ON (n.geometry && g.geometry)
WHERE
  -- areas must agree to within 10%
  n.shape_area BETWEEN (g.shape_area * 0.9)
                   AND (g.shape_area * 1.1)
  -- simplified shapes must be closer than the per-region threshold
  AND ST_HausdorffDistance(
        ST_SimplifyPreserveTopology(n.geometry, 0.5),
        ST_SimplifyPreserveTopology(g.geometry, 0.5)
      ) < g.max_hausdorff_dist;
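
The two cheap filters in the query above (bounding-box overlap and the 10% area window) can be sketched outside the database as follows. This is only an illustration under simplifying assumptions: regions are single rings of (x, y) tuples, and all function names and sample shapes are hypothetical.

```python
def bbox(ring):
    """Bounding box of a ring as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def bboxes_overlap(a, b):
    """Rough analogue of the && operator: do the bounding boxes intersect?"""
    ax0, ay0, ax1, ay1 = bbox(a)
    bx0, by0, bx1, by1 = bbox(b)
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1


def area(ring):
    """Planar polygon area via the shoelace formula."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def is_candidate(nuts_ring, gadm_ring, tolerance=0.1):
    """Cheap tests mirroring the join condition and the area BETWEEN clause."""
    if not bboxes_overlap(nuts_ring, gadm_ring):
        return False
    g = area(gadm_ring)
    return g * (1 - tolerance) <= area(nuts_ring) <= g * (1 + tolerance)


# Hypothetical rings in degrees.
gadm = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
nuts_close = [(0.1, 0.0), (2.1, 0.0), (2.1, 2.0), (0.1, 2.0)]
nuts_small = [(0.5, 0.5), (1.0, 0.5), (1.0, 1.0), (0.5, 1.0)]
print(is_candidate(nuts_close, gadm), is_candidate(nuts_small, gadm))
```

Only pairs that pass these filters would need the more expensive Hausdorff comparison.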


[Figure: GADM region 2_13988 and the NUTS region Leicestershire, Rutland and Northamptonshire]

  • Not every NUTS region matches a GADM region.
    • Many NUTS regions represent parts or aggregations of GADM administrative boundaries.
  • 1,671 NUTS regions: 965 matches and 13 false positives.
Conclusion
  • NeoGeo vocabularies:
    • Survey and mappings to other vocabularies.
  • NUTS-RDF and GADM-RDF datasets:
    • GADM-RDF links to DBpedia, UK Ordnance Survey and NUTS-RDF.
    • Linked Data Services for accessing/querying spatial indices (withinRegion, boundingBox).
  • Work on spatial similarity metrics:
    • Promising results
Future Work
  • NeoGeo vocabularies:
    • Temporal context.
  • Datasets:
    • More Earth and space science data.
    • Add more instance mappings.
  • Spatial similarity:
    • Improve precision.
    • Develop tools to support the mapping process.
  • More experiments:
    • Querying of integrated data and reasoning.

European Commission's Seventh Framework Programme FP7/2007-2013 (PlanetData, Grant 257641)