Finding Spatial Equivalences Across Multiple RDF Datasets

Juan Salas, Andreas Harth

Outline

Motivation

NeoGeo Vocabularies

Geospatial Datasets

Integration Challenges

Finding Geometric Equivalences

Conclusion

Motivation
  • Geodata is becoming increasingly relevant.
    • Location-based services
    • Mobile applications
    • Ever-increasing amount of sensor data (phones, satellites)
  • Different sources.
  • Many formats:
    • GML, KML, Shapefile, GPX, WKT, RDF?…

Applications require integrated access to geodata.

NeoGeo Vocabularies
  • Geometry Vocabulary – http://geovocab.org/geometry
    • Representation of georeferenced geometric shapes.
  • Spatial Ontology – http://geovocab.org/spatial
    • Representation and reasoning on topological relations based on the Region Connection Calculus (RCC).
Geospatial Datasets
  • GADM-RDF – http://gadm.geovocab.org
    • RDF representation of the administrative regions of the GADM project: http://gadm.org
  • NUTS-RDF – http://nuts.geovocab.org
    • RDF representation of Eurostat's NUTS nomenclature.

They serve as:

    • New geospatial information on the Semantic Web.
    • Bridges between already published spatial datasets.
    • Proof-of-concept platforms.
Integration Challenges
  • Vocabularies – http://geovocab.org/doc/survey.html
    • Survey of several well-known Linked Data datasets (Ordnance Survey, GeoLinkedData.es, LinkedGeoData.org, GeoNames, DBpedia).
    • Identified properties and classes mapped to the NeoGeo vocabularies published at GeoVocab.org
  • Instances
    • Finding equivalences between regions across multiple datasets at the geometry level.
Finding Geometric Equivalences

Geometric shapes will not be vertex-by-vertex equivalent.

A sensible criterion for finding geometric equivalences is needed.

  • NUTS-RDF and GADM-RDF have different:
    • Sampling values
    • Scales
    • Starting points
    • Rounding effects
Algorithm Overview

[Diagram: sample data in WGS-84 with Plate Carrée projection → Hausdorff distance → spatial:EQ]

1. Retrieve sample data
  • The algorithm requires:
    • WGS-84 coordinate reference system.
    • Plate Carrée projection: X = longitude, Y = latitude.

  • Coordinates are treated as Cartesian.
  • Distorts all parameters (area, shape, distance, direction).
    • Geometric shapes are equally distorted on both datasets.
  • Local reprojections are avoided (e.g. UTM).
  • Units will be presented in decimal degrees.
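
As a minimal sketch of what "coordinates are treated as Cartesian" means in practice, the snippet below maps longitude and latitude straight to X and Y and measures a planar distance in degrees. The function names and the two sample points are illustrative assumptions, not taken from the presentation.

```python
import math

def plate_carree(lon_deg, lat_deg):
    """Plate Carrée as described above: X = longitude, Y = latitude (degrees)."""
    return (lon_deg, lat_deg)

def planar_distance(p, q):
    """Euclidean distance between two projected points, expressed in degrees."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

# Two made-up points, chosen so the offsets form a 3-4-5 triangle.
a = plate_carree(13.0, 52.0)
b = plate_carree(16.0, 56.0)
d = planar_distance(a, b)
print(d)
```

The result is a distance in degrees, not metres. That is acceptable here only because both datasets are distorted in the same way.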
2. Similarity threshold function

The Hausdorff Distance provides a measure of similarity between geometric shapes.

It can be intuitively defined as the largest distance between the closest points of two geometric shapes.
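
The intuitive definition can be sketched in a few lines. This version is an approximation that only looks at the vertices of each shape (not at points along the edges), and the sample squares are made up for illustration.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any vertex of a to its closest vertex of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric (vertex-only) Hausdorff distance between two vertex lists."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# Same shape, but the vertex list starts at a different corner.
rotated_start = square[2:] + square[:2]
# Same shape, shifted 0.1 degrees east.
shifted = [(x + 0.1, y) for x, y in square]

print(hausdorff(square, rotated_start))
print(hausdorff(square, shifted))
```

A different starting point leaves the distance at zero, whereas a vertex-by-vertex comparison would report a mismatch. The small shift yields a distance of roughly 0.1.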


Smaller regions need a lower Hausdorff Distance threshold than larger regions.


We calculate the midpoint value between the Hausdorff Distances for a correct guess and the lowest wrong guess.
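
The midpoint calculation can be written out as follows. The function name and the sample distances are hypothetical and are used only to show the arithmetic.

```python
def midpoint_threshold(correct_distance, wrong_distances):
    """Midpoint between the correct match and the lowest wrong guess."""
    lowest_wrong = min(wrong_distances)
    return (correct_distance + lowest_wrong) / 2.0

# Made-up Hausdorff distances (in degrees) for one region.
m = midpoint_threshold(0.25, [1.5, 0.75, 2.0])
print(m)
```

Any threshold between the correct distance and the lowest wrong distance would separate the two for this region. The midpoint leaves equal slack on both sides.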


We perform regression on the midpoint values to obtain the Hausdorff Distance threshold function.
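
The slides do not state the regression model or the predictor variable. The sketch below assumes an ordinary least-squares line fitted against region area, which is consistent with smaller regions needing lower thresholds. All data values are invented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented (area, midpoint) samples; real values would come from matched regions.
areas = [1.0, 2.0, 3.0, 4.0]
midpoints = [0.5, 1.0, 1.5, 2.0]
slope, intercept = fit_line(areas, midpoints)

def threshold(area):
    """Assumed threshold function: predicted maximum Hausdorff distance."""
    return slope * area + intercept

print(threshold(2.5))
```

A different model (for example, a fit against the square root of the area) would slot into the same place. Only `fit_line` would change.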

Poor Geospatial Information

Sometimes location is approximated as a single point.

Can lead to false assertions while calculating containment relations.

<http://dbpedia.org/resource/Germany> geo:lat 52.516666 ;
    geo:long 13.383333 .

<http://nuts.geovocab.org/id/DE30_geometry> rdf:type ngeo:Polygon .

Germany is not contained in Berlin.

Other properties must be considered to calculate containment relations (e.g. rdf:type).

Other spatial relations (e.g. spatial:EQ) cannot be calculated.
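
The false assertion is easy to reproduce with a naive point-in-polygon test. The point uses the DBpedia coordinates quoted above. The rectangle is a hypothetical stand-in for the Berlin (DE30) polygon, not the real NUTS geometry.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# (long, lat) for dbpedia:Germany, as given in the slide.
germany_point = (13.383333, 52.516666)
# Hypothetical rectangle roughly around Berlin, for illustration only.
berlin_box = [(13.1, 52.3), (13.8, 52.3), (13.8, 52.7), (13.1, 52.7)]

print(point_in_polygon(germany_point, berlin_box))
```

The geometric test alone reports that "Germany" lies within "Berlin", which is why a purely geometric containment check needs additional evidence such as rdf:type.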

Optimizations

The cost of calculating the Hausdorff distance depends on the number of vertices.

The Ramer-Douglas-Peucker algorithm simplifies geometric shapes, dropping vertices that deviate from the simplified line by less than a chosen maximum separation.
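
A compact sketch of Ramer-Douglas-Peucker on an open polyline follows. The sample line and the 0.5 tolerance are illustrative. Production code would normally rely on a geometry library rather than this hand-written version.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection so it stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def rdp(points, epsilon):
    """Keep only vertices farther than epsilon from the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax > epsilon:
        # Split at the farthest vertex and simplify both halves.
        left = rdp(points[:index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right
    return [first, last]

# An L-shaped line with small wiggles along both arms.
line = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (3.0, 0.1),
        (4.0, 0.0), (4.0, 1.0), (4.1, 2.0), (4.0, 3.0)]
simplified = rdp(line, 0.5)
print(simplified)
```

The wiggles are removed while the corner survives, so the Hausdorff computation runs on three vertices instead of eight.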

Spatial Databases
  • The algorithm also works well with spatial databases (e.g. PostgreSQL / PostGIS):

    -- Candidate NUTS/GADM pairs: overlapping bounding boxes, similar areas,
    -- and a Hausdorff distance below the per-region threshold.
    SELECT g.gadm_id, n.nuts_id
    FROM nuts n
    INNER JOIN gadm g ON (n.geometry && g.geometry)
    WHERE n.shape_area BETWEEN (g.shape_area * 0.9) AND (g.shape_area * 1.1)
      AND ST_HausdorffDistance(
            ST_SimplifyPreserveTopology(n.geometry, 0.5),
            ST_SimplifyPreserveTopology(g.geometry, 0.5)
          ) < g.max_hausdorff_dist;
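
The two cheap filters in the query (bounding-box overlap and an area within ±10%) can be mirrored outside the database. In this sketch the area is computed with the shoelace formula instead of being read from a `shape_area` column, and all function names and polygons are assumptions for illustration.

```python
def bbox(polygon):
    """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def bboxes_overlap(a, b):
    """Rough equivalent of the && operator: do the bounding boxes intersect?"""
    ax0, ay0, ax1, ay1 = bbox(a)
    bx0, by0, bx1, by1 = bbox(b)
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1

def shoelace_area(polygon):
    """Planar area of a simple polygon given as a vertex list."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def is_candidate(n_poly, g_poly, tolerance=0.1):
    """Cheap pre-filter applied before the costly Hausdorff comparison."""
    if not bboxes_overlap(n_poly, g_poly):
        return False
    n_area, g_area = shoelace_area(n_poly), shoelace_area(g_poly)
    return g_area * (1 - tolerance) <= n_area <= g_area * (1 + tolerance)

unit = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
similar = [(0.0, 0.0), (1.02, 0.0), (1.02, 1.02), (0.0, 1.02)]
bigger = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
far_away = [(10.0, 10.0), (11.0, 10.0), (11.0, 11.0), (10.0, 11.0)]

print(is_candidate(similar, unit), is_candidate(bigger, unit), is_candidate(far_away, unit))
```

Only pairs that pass this filter would go on to simplification and the Hausdorff comparison, which keeps the expensive step off obviously mismatched regions.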

Evaluation

Example: GADM 2_13988 (Leicestershire) and NUTS UKF2 (Leicestershire, Rutland and Northamptonshire).

  • Not every NUTS region matches a GADM region.
    • Many NUTS regions represent parts or aggregations of GADM administrative boundaries.
  • 1,671 NUTS regions: 965 matches and 13 false positives.
Conclusion
  • NeoGeo vocabularies:
    • Survey and mappings to other vocabularies.
  • NUTS-RDF and GADM-RDF datasets:
    • GADM-RDF links to DBpedia, UK Ordnance Survey and NUTS-RDF.
    • Linked Data Services for accessing/querying spatial indices (withinRegion, boundingBox).
  • Work on spatial similarity metrics:
    • Promising results
Future Work
  • NeoGeo vocabularies.
    • Temporal context.
  • Datasets:
    • More Earth and space science data.
    • Add more instance mappings.
  • Spatial similarity:
    • Improve precision.
    • Develop tools to support the mapping process.
  • More experiments:
    • Querying of integrated data and reasoning.
Acknowledgements

European Commission's Seventh Framework Programme FP7/2007-2013 (PlanetData, Grant 257641)
