Finding Spatial Equivalences Across Multiple RDF Datasets

Juan Salas, Andreas Harth

Outline

Motivation

NeoGeo Vocabularies

Geospatial Datasets

Integration Challenges

Finding Geometric Equivalences

Conclusion

Motivation

- Geodata is becoming increasingly relevant.
  - Location-based services
  - Mobile applications
  - Ever-increasing amount of sensor data (phones, satellites)
- Different sources.
- Many formats:
  - GML, KML, Shapefile, GPX, WKT, RDF?…

Applications require integrated access to geodata.

NeoGeo Vocabularies

- Geometry Vocabulary – http://geovocab.org/geometry
  - Representation of georeferenced geometric shapes.
- Spatial Ontology – http://geovocab.org/spatial
  - Representation and reasoning on topological relations based on the Region Connection Calculus (RCC).

Geospatial Datasets

- GADM-RDF – http://gadm.geovocab.org
  - RDF representation of the administrative regions of the GADM project: http://gadm.org
- NUTS-RDF – http://nuts.geovocab.org
  - RDF representation of Eurostat's NUTS nomenclature.

They serve as:

- New geospatial information on the Semantic Web.
- Bridges between already published spatial datasets.
- Proof-of-concept platforms.

Integration Challenges

- Vocabularies – http://geovocab.org/doc/survey.html
  - Survey of several well-known Linked Data datasets (Ordnance Survey, GeoLinkedData.es, LinkedGeoData.org, GeoNames, DBpedia).
  - Identified properties and classes mapped to the NeoGeo vocabularies published at GeoVocab.org
- Instances
  - Finding equivalences between regions across multiple datasets at the geometry level.

Finding Geometric Equivalences

Geometric shapes describing the same region will not be vertex-by-vertex equivalent across datasets.

A sensible criterion for finding geometric equivalences is needed.

- NUTS-RDF and GADM-RDF have different:
  - Sampling values
  - Scales
  - Starting points
  - Rounding effects

1. Retrieve sample data

- The algorithm requires:
  - WGS-84 coordinate reference system.
  - Plate Carrée projection: X = longitude, Y = latitude.
- Coordinates are treated as Cartesian.
  - Distorts all parameters (area, shape, distance, direction).
  - Geometric shapes are equally distorted on both datasets.
  - Local reprojections are avoided (e.g. UTM).
- Units will be presented in decimal degrees.
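
As a rough illustration of what treating coordinates as Cartesian means in practice, the Python sketch below maps longitude/latitude straight to X/Y and computes a distance and an area in degree units. The function names and the sample cell are assumptions made for this example and do not come from the slides.

```python
import math

def plate_carree(lon, lat):
    """Plate Carrée: longitude becomes X and latitude becomes Y, unchanged."""
    return (lon, lat)

def planar_distance(p, q):
    """Euclidean distance between two projected points, in degrees."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def shoelace_area(ring):
    """Planar area of a ring given as a list of (x, y), in square degrees."""
    total = 0.0
    # Pair every vertex with its successor, wrapping around to close the ring.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical 1 degree by 1 degree cell (lon, lat pairs).
cell = [plate_carree(lon, lat)
        for lon, lat in [(13.0, 52.0), (14.0, 52.0), (14.0, 53.0), (13.0, 53.0)]]
```

Because both datasets go through the same mapping, the resulting values are distorted in the same way on both sides, which is why they remain comparable even though they are not true ground distances or areas.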

2. Similarity threshold function

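The Hausdorff Distance provides a measure of similarity between geometric shapes.

It can be intuitively defined as the largest distance between the closest points of two geometric shapes: for every point of one shape, take the distance to the closest point of the other shape, and keep the largest of these distances.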
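
The definition can be sketched in Python as follows. This is a simplified, vertex-only approximation written for illustration: it compares only the vertices of the two shapes, not every point on their boundaries, and the function names and sample squares are assumptions.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a vertex of `a` to its closest vertex in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two vertex lists."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical shapes: a unit square and the same square shifted by 0.1 in X.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
shifted = [(0.1, 0.0), (1.1, 0.0), (1.1, 1.0), (0.1, 1.0)]

distance = hausdorff(square, shifted)
```

The nested loops make the cost grow with the product of the two vertex counts, which is worth keeping in mind for detailed boundaries.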

Smaller regions need a lower Hausdorff Distance threshold than larger regions.

We calculate the midpoint value between the Hausdorff Distances for a correct guess and the lowest wrong guess.
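
A minimal sketch of this midpoint calculation for a single region, assuming the Hausdorff Distances to all candidate regions are already known. The candidate identifiers and distance values are made up for illustration.

```python
def midpoint_threshold(distances, correct_id):
    """Midpoint between the correct match and the closest wrong candidate.

    `distances` maps a candidate id to its Hausdorff Distance from the
    region under study; `correct_id` is the known correct candidate.
    """
    correct = distances[correct_id]
    lowest_wrong = min(d for cid, d in distances.items() if cid != correct_id)
    return (correct + lowest_wrong) / 2.0

# Hypothetical candidates for one region (distances in degrees).
candidates = {"gadm_a": 0.25, "gadm_b": 0.75, "gadm_c": 1.5}
midpoint = midpoint_threshold(candidates, "gadm_a")
```

Any threshold between the two distances would separate the correct candidate from the wrong ones for this region; the midpoint leaves an equal margin on both sides.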

We perform regression on the midpoint values to obtain the Hausdorff Distance threshold function.
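
The slides do not state which regression model or which explanatory variable is used. The sketch below assumes, purely for illustration, an ordinary least-squares line fitted against region area; the sample values and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def make_threshold(slope, intercept):
    """Return a function mapping a region area to a distance threshold."""
    def threshold(area):
        return slope * area + intercept
    return threshold

# Hypothetical training data: region areas and their midpoint values.
areas = [1.0, 2.0, 3.0, 4.0]
midpoints = [0.2, 0.3, 0.4, 0.5]

slope, intercept = fit_line(areas, midpoints)
max_hausdorff_dist = make_threshold(slope, intercept)
```

With a fitted function of this kind, smaller regions receive a lower threshold than larger ones, in line with the observation on the earlier slide.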

Poor Geospatial Information

Sometimes location is approximated as a single point.

This can lead to false assertions when calculating containment relations.

<http://dbpedia.org/resource/Germany> geo:lat 52.516666;
    geo:long 13.383333 .

<http://nuts.geovocab.org/id/DE30_geometry> rdf:type ngeo:Polygon .

Germany is not contained in Berlin.

Other properties must be considered to calculate containment relations (e.g. rdf:type).

Other spatial relations (e.g. spatial:EQ) cannot be calculated.
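
The problem can be reproduced with a small Python sketch. The rectangle standing in for Berlin, the type labels and the level ranking are all assumptions made for this example; they are not the real DE30 geometry or an actual vocabulary.

```python
def point_in_polygon(point, ring):
    """Ray-casting test: is `point` inside the ring given as (x, y) pairs?"""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical ranking of region types, coarsest first.
LEVEL = {"Country": 0, "State": 1}

def contained(point, point_type, ring, ring_type):
    """Accept containment only if the inner resource is of a finer type."""
    if LEVEL[point_type] <= LEVEL[ring_type]:
        return False
    return point_in_polygon(point, ring)

# Crude rectangle around Berlin in (lon, lat); not the real DE30 polygon.
berlin_box = [(13.0, 52.3), (13.8, 52.3), (13.8, 52.7), (13.0, 52.7)]
germany_point = (13.383333, 52.516666)  # DBpedia point for Germany

naive = point_in_polygon(germany_point, berlin_box)
guarded = contained(germany_point, "Country", berlin_box, "State")
```

The purely geometric test places Germany inside Berlin, while the type-aware check rejects the assertion.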

Optimizations

The cost of calculating the Hausdorff distance depends on the number of vertices.

The Ramer-Douglas-Peucker algorithm simplifies geometric shapes by dropping vertices that lie within a chosen maximum separation (tolerance) of the simplified outline.
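
A compact recursive sketch of the Ramer-Douglas-Peucker idea in Python, for illustration only. It measures the distance from each vertex to the segment joining the endpoints (a common variant of the perpendicular distance); the function names and the sample line are assumptions.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment, e.g. a closed ring whose endpoints coincide.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def rdp(points, epsilon):
    """Simplify a polyline, keeping vertices farther than epsilon."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax > epsilon:
        # Keep the farthest vertex and simplify both halves recursively.
        left = rdp(points[:index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right
    return [first, last]

# Hypothetical polyline with small wiggles and one sharp bend.
line = [(0.0, 0.0), (1.0, 0.1), (2.0, -0.1), (3.0, 5.0), (4.0, 6.0), (5.0, 7.0)]
simplified = rdp(line, 0.5)
```

A larger tolerance removes more vertices and speeds up the distance calculation, at the price of a coarser outline.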

Spatial Databases

- The algorithm also works well with spatial databases (e.g. PostgreSQL / PostGIS):

SELECT g.gadm_id, n.nuts_id
FROM nuts n
-- && keeps only pairs whose bounding boxes overlap
INNER JOIN gadm g ON (n.geometry && g.geometry)
WHERE
  -- areas must agree to within 10%
  n.shape_area BETWEEN (g.shape_area * 0.9) AND (g.shape_area * 1.1)
  -- compare simplified shapes against the per-region threshold
  AND ST_HausdorffDistance(
        ST_SimplifyPreserveTopology(n.geometry, 0.5),
        ST_SimplifyPreserveTopology(g.geometry, 0.5)
      ) < g.max_hausdorff_dist;
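
For readers without a spatial database at hand, the same filtering steps can be sketched in plain Python. This is an illustrative approximation under stated assumptions: it works on vertex lists, skips the simplification step, and all names and sample rings are hypothetical.

```python
import math

def bbox(ring):
    """Bounding box of a ring as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)

def bboxes_overlap(a, b):
    """True if two bounding boxes intersect (the role of && in the query)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def area(ring):
    """Planar (shoelace) area of a ring."""
    pairs = zip(ring, ring[1:] + ring[:1])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2.0

def hausdorff(a, b):
    """Vertex-only Hausdorff distance between two rings."""
    def directed(u, v):
        return max(min(math.dist(p, q) for q in v) for p in u)
    return max(directed(a, b), directed(b, a))

def is_match(nuts_ring, gadm_ring, max_hausdorff_dist):
    """Apply the cheap filters first, the Hausdorff test last."""
    if not bboxes_overlap(bbox(nuts_ring), bbox(gadm_ring)):
        return False
    gadm_area = area(gadm_ring)
    if not (0.9 * gadm_area <= area(nuts_ring) <= 1.1 * gadm_area):
        return False
    return hausdorff(nuts_ring, gadm_ring) < max_hausdorff_dist

# Hypothetical rings: a unit square, a slightly shifted copy, a larger square.
nuts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
gadm_close = [(0.05, 0.0), (1.05, 0.0), (1.05, 1.0), (0.05, 1.0)]
gadm_large = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
```

Ordering the checks from cheapest to most expensive means the Hausdorff Distance is only computed for pairs that already agree on position and size.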

Evaluation

GADM 2_13988: Leicestershire

NUTS UKF2: Leicestershire, Rutland and Northamptonshire

- Not every NUTS region matches a GADM region.
- Many NUTS regions represent parts or aggregations of GADM administrative boundaries.
- Out of 1,671 NUTS regions: 965 matches and 13 false positives.

Conclusion

- NeoGeo vocabularies:
  - Survey and mappings to other vocabularies.
- NUTS-RDF and GADM-RDF datasets:
  - GADM-RDF links to DBpedia, UK Ordnance Survey and NUTS-RDF.
  - Linked Data Services for accessing/querying spatial indices (withinRegion, boundingBox).
- Work on spatial similarity metrics:
  - Promising results

Future Work

- NeoGeo vocabularies.
  - Temporal context.
- Datasets:
  - More Earth and space science data.
  - Add more instance mappings.
- Spatial similarity:
  - Improve precision.
  - Develop tools to support the mapping process.
- More experiments:
  - Querying of integrated data and reasoning.

Acknowledgements

European Commission's Seventh Framework Programme FP7/2007-2013 (PlanetData, Grant 257641)
