Towards Linked Stream Data. Oscar Corcho. Contents. The concept of Linked Stream Data (LSD) Main challenges addressed so far W3C SSN Ontology URI definition Supporting technology Some examples Challenges being addressed currently. Linked Stream Data.
Towards Linked Stream Data Oscar Corcho
Contents • The concept of Linked Stream Data (LSD) • Main challenges addressed so far • W3C SSN Ontology • URI definition • Supporting technology • Some examples • Challenges being addressed currently
Linked Stream Data • A representation of stream data following the principles of Linked Data • Adding semantics allows the search and exploration of sensor data without any prior knowledge of the data source • Using the principles of Linked Data facilitates the integration of stream data into the increasing number of data collections that form the Linked Open Data cloud • Early references… • Amit Sheth, Cory Henson, and Satya Sahoo, "Semantic Sensor Web," IEEE Internet Computing, July/August 2008, p. 78-83 • Sequeda J, Corcho O. Linked Stream Data: A Position Paper. Proceedings of the 2nd International Workshop on Semantic Sensor Networks, SSN 09 • Le-Phuoc D, Parreira JX, Hauswirth M. Challenges in Linked Stream Data Processing: A Position Paper. Proceedings of the 3rd International Workshop on Semantic Sensor Networks, SSN 10
Motivation: Sensor Networks • Increasing availability of cheap, robust, deployable sensors as ubiquitous information sources Source: Antonis Deligiannakis
An example: Smart Cities • Environmental sensor nodes • Parking sensor nodes • Santander
Sensor Networks and Streaming Data • Streaming Data • Continuously appended data • Potentially infinite • Time-stamped tuples: (t1, a1, a2, ... , an), ... , (t7, a1, a2, ... , an), (t8, a1, a2, ... , an), (t9, a1, a2, ... , an) • Continuous queries • Latest used in queries • Time and tuple-based windows, e.g. Window [t7 - t9] • Sensor Networks • Cheap, Noisy, Unreliable (depends) • Low computational, power resources, storage • Distributed query execution • Routing, Optimization • Enabling Semantic Integration of Streaming Data Sources
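The two window types named on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: readings are hypothetical (timestamp, value) pairs, timestamps are integers that never decrease, and the function names `time_window` and `tuple_window` are made up for this sketch rather than taken from any stream engine.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List, Tuple

# Hypothetical reading: (timestamp, value). Real tuples carry a1..an.
Reading = Tuple[int, float]


def time_window(stream: Iterable[Reading], width: int) -> Iterator[List[Reading]]:
    """After each arrival at time t, yield the readings with timestamp in [t - width, t]."""
    window: Deque[Reading] = deque()
    for reading in stream:
        window.append(reading)
        now = reading[0]
        # Evict readings that fell out of the time range (assumes ordered timestamps).
        while window[0][0] < now - width:
            window.popleft()
        yield list(window)


def tuple_window(stream: Iterable[Reading], size: int) -> Iterator[List[Reading]]:
    """After each arrival, yield the latest `size` readings."""
    window: Deque[Reading] = deque(maxlen=size)
    for reading in stream:
        window.append(reading)
        yield list(window)


# Nine readings t1..t9; the last time window of width 2 covers [t7 - t9].
readings = [(t, float(t)) for t in range(1, 10)]
last_time_window = list(time_window(readings, width=2))[-1]
last_tuple_window = list(tuple_window(readings, size=3))[-1]
```

Both generators consume the stream lazily, so they also work on a source that never ends; a continuous query would evaluate its body on each yielded window.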
Not only environmental sensors, but many others… Weather Sensors Sensor Dataset GPS Sensors Satellite Sensors Camera Sensors Source: H Patni, C Henson, A Sheth
A semantic perspective on the Sensor Web • Sensor data querying and (pre-)processing • Data heterogeneity • Data quality • New inference capabilities required to deal with sensor information • Sensor data model representation and management • For data publication, integration and discovery • Bridging between sensor data and ontological representations for data integration • Ontologies: Observations and measurements, time series, etc. • Event models • User interaction with sensor data
Contents • The concept of Linked Stream Data (LSD) • Main challenges addressed so far • W3C SSN Ontology • URI definition • Supporting technology • Some examples • Challenges being addressed currently
LSD: Challenges/Topics • A model/vocabulary/ontology according to which we can produce RDF data streams • URI definition • Based on sensors/devices or on observations? • How should we encode time in them? • Technology to support linked stream data • SPARQL extensions to handle time and tuple windows • In many cases, also spatio-temporal extensions • Tightly coupled with Data Stream Management Systems • e.g., MonetDB streaming extension • Transformation/Characterisation tools
SSN ontology • Several efforts since approx. 2005 • State of the art on sensor network ontologies in the report below • In 2009, a W3C incubator group was started, which has just finished • Lots of good people there • Final report: http://www.w3.org/2005/Incubator/ssn/XGR-ssn-20110628/ • Ontology: http://purl.oclc.org/NET/ssnx/ssn • A good number of internal and external references to SSN Ontology • http://www.w3.org/2005/Incubator/ssn/wiki/Tagged_Bibliography • SSN Ontology paper submitted to Journal of Web Semantics
Overview of the SSN ontology modules • Deployment • System • OperatingRestriction • Process • Device • PlatformSite • Data • Skeleton • MeasuringCapability • ConstraintBlock
Overview of the SSN ontologies deploymentProcessPart only Deployment System OperatingRestriction hasSubsystem only, some hasSurvivalRange only SurvivalRange DeploymentRelatedProcess hasDeployment only System OperatingRange Deployment hasOperatingRange only deployedSystem only deployedOnPlatform only Process hasInput only inDeployment only Device Input Device Process onPlatform only PlatformSite Output Platform hasOutput only, some attachedSystem only Data Skeleton implements some isProducedBy some Sensor Sensing hasValue some SensorOutput sensingMethodUsed only detects only SensingDevice observes only ObservationValue SensorInput isProxyFor only Property isPropertyOf some includesEvent some observedProperty only observationResult only hasProperty only, some observedBy only Observation FeatureOfInterest featureOfInterest only MeasuringCapability ConstraintBlock hasMeasurementCapability only forProperty only inCondition only inCondition only MeasurementCapability Condition
Sensor and environmental properties Skeleton Property MeasuringCapability Communication hasMeasurementProperty only MeasurementCapability MeasurementProperty Accuracy Resolution Selectivity Frequency Precision Latency DetectionLimit Drift ResponseTime Sensitivity MeasurementRange OperatingRestriction EnergyRestriction hasOperatingProperty only OperatingProperty OperatingRange EnvironmentalOperatingProperty MaintenanceSchedule OperatingPowerRange hasSurvivalProperty only SurvivalRange SurvivalProperty EnvironmentalSurvivalProperty SystemLifetime BatteryLifetime
URI Definition • Debate between being observation-centric or sensor-centric • Observation-centric seems to be the winner • Encoding of time
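One way to picture the two options is to mint both kinds of URI for the same reading. The sketch below is an assumption-laden illustration, not a scheme proposed in the talk: the namespace `http://example.org/data#`, the path layout, and the compact UTC timestamp format are all hypothetical choices.

```python
from datetime import datetime, timezone
from urllib.parse import quote

# Hypothetical namespace; any dereferenceable base would do.
BASE = "http://example.org/data#"


def sensor_uri(sensor_id: str) -> str:
    """Sensor-centric: one stable URI per device, no time component."""
    return f"{BASE}{quote(sensor_id, safe='')}"


def observation_uri(sensor_id: str, prop: str, observed_at: datetime) -> str:
    """Observation-centric: one URI per reading, with the time encoded in it.

    `observed_at` must be timezone-aware; it is normalised to UTC and written
    in a compact form without ':' so that no percent-encoding is needed.
    """
    stamp = observed_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return (
        f"{BASE}{quote(sensor_id, safe='')}/"
        f"{quote(prop, safe='')}/Observation/{stamp}"
    )


when = datetime(2011, 6, 28, 10, 0, 0, tzinfo=timezone.utc)
obs = observation_uri("wan7", "WindSpeed", when)
dev = sensor_uri("wan7")
```

With the observation-centric form every reading gets its own identifier, so readings can be linked to individually; the price is that the number of URIs grows with the stream.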
SPARQL-STR
• SPARQLStream query:
    SELECT ?waveheight
    FROM STREAM <www.ssg4env.eu/SensorReadings.srdf> [FROM NOW -10 MINUTES TO NOW STEP 1 MINUTE]
    WHERE { ?WaveObs a sea:WaveHeightObservation; sea:hasValue ?waveheight; }
• SNEEql query:
    SELECT measured FROM wavesamples [NOW -10 MIN]
• Stream-to-Ontology mappings:
    conceptmap-def WaveHeightMeasurement
      virtualStream <http://ssg4env.eu/Readings.srdf>
      uri-as concat('ssg4env:WaveSM_', wavesamples.sensorid, wavesamples.ts)
      attributemap-def hasValue
        operation constant
        has-column wavesamples.measured
      dbrelationmap-def isProducedBy
        toConcept Sensor
        joins-via condition equals
          has-column sensors.sensorid
          has-column wavesamples.sensorid
    conceptmap-def Sensor
      uri-as concat('ssg4env:Sensor_', sensors.sensorid)
      attributemap-def hasSensorid
        operation constant
        has-column sensors.sensorid
• Diagram labels: Client, SPARQLStream, Query translation, SNEEql, Query Processing, Sensor Network, Data translation ([tuples] to [triples]), Stream-to-Ontology mappings, R2RML Mappings
SPARQL-STR • Ontology-based Streaming Data Access Service (architecture diagram) • Client: SPARQLStream (Og) • Query translation: SPARQLStream algebra (S1 S2 Sm), using Stream-to-Ontology Mappings (R2RML) • Query Evaluator: q, SNEEql, GSN API • Sources: Sensor Network (S1), Relational DB (S2), Stream Engine (S3), RDF Store (Sm), GSN • Data translation: [tuples] to [triples]
Sensors, Mappings and Queries • Creating Mappings • Source table wan7: timed: datetime PK, sp_wind: float • ssn:Observation: http://swissex.ch/data#Wan7/WindSpeed/Observation{timed} • ssn:observedProperty: sweetSpeed:WindSpeed (ssn:Property) • ssn:observationResult: ssn:SensorOutput http://swissex.ch/data#Wan7/WindSpeed/ObsOutput{timed} • ssn:hasValue: ssn:ObservationValue http://swissex.ch/data#Wan7/WindSpeed/ObsValue{timed} • qudt:numericValue: sp_wind (xsd:decimal)
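The effect of such a mapping on a single row can be sketched by hand. The code below is not an R2RML processor; it is a hard-coded illustration that reuses the URI templates and property names from the slide, and the reading of which property links which node is an assumption drawn from the diagram. Triples are shown as plain strings with prefixed names rather than as objects of any RDF library.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# URI prefix shared by the three templates shown on the slide.
BASE = "http://swissex.ch/data#Wan7/WindSpeed/"


def row_to_triples(row: Dict[str, object]) -> List[Triple]:
    """Turn one row of the wan7 table (timed, sp_wind) into SSN triples."""
    timed = row["timed"]
    speed = row["sp_wind"]
    # Each template ends in {timed}, so every row yields three fresh URIs.
    observation = f"<{BASE}Observation{timed}>"
    output = f"<{BASE}ObsOutput{timed}>"
    value = f"<{BASE}ObsValue{timed}>"
    return [
        (observation, "rdf:type", "ssn:Observation"),
        (observation, "ssn:observedProperty", "sweetSpeed:WindSpeed"),
        (observation, "ssn:observationResult", output),
        (output, "rdf:type", "ssn:SensorOutput"),
        (output, "ssn:hasValue", value),
        (value, "rdf:type", "ssn:ObservationValue"),
        (value, "qudt:numericValue", f'"{speed}"^^xsd:decimal'),
    ]


# Hypothetical sample row; the timestamp format is an assumption.
triples = row_to_triples({"timed": "2011-06-28T10:00:00", "sp_wind": 3.5})
```

A mapping engine would generate the same shape of output from a declarative mapping document instead of from code, which is what lets the mappings be produced automatically for many sensors.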
Contents • The concept of Linked Stream Data (LSD) • Main challenges addressed so far • W3C SSN Ontology • URI definition • Supporting technology • Some examples • Challenges being addressed currently
Let's check some examples • Meteorological data in Spain: automatic weather stations • http://aemet.linkeddata.es/ • Paper under open review at the Semantic Web Journal • http://www.semantic-web-journal.net/content/transforming-meteorological-data-linked-data • Live sensors in Slovenia • http://sensors.ijs.si/
SwissEx • Global Sensor Networks, deployment for SwissEx. • Distributed environment: GSN Davos, GSN Zurich, etc. • In each site, a number of sensors available • Each one with a different schema • Metadata stored in wiki • Federated metadata management: • Jeung H., Sarni, S., Paparrizos, I., Sathe, S., Aberer, K., Dawes, N., Papaioannou, T., Lehning, M. Effective Metadata Management in Federated Sensor Networks. In SUTC, 2010 • Sensor observations • Sensor metadata
Getting things done • Transformed wiki metadata to SSN instances in RDF • Generated R2RML mappings for all sensors • Implementation of Ontology-based querying over GSN • Fronting GSN with SPARQL-Stream queries • Numbers: • 28 Deployments • Approx. 50 sensors in each deployment • More than 1500 sensors • Live updates. Low frequency • Access to all metadata/not all data
Sensor Metadata • station • location • sensors • model • properties
Sensor Data: Observations Heterogeneity Integration
Pachube2RDF - Architecture • Converter • Converts Pachube data into MySQL records • Pachube database • Schema that conforms to the Pachube result • D2R Mappings • Map the Pachube database to the SSN ontology
Current Situation • Core components have been developed • Converter • Database conforms to the Pachube result • D2R mappings • Pachube result is converted into RDF instances conforming to the SSN ontology • Difficulties in extracting from the Pachube result • feature-of-interest • properties
Contents • The concept of Linked Stream Data (LSD) • Main challenges addressed so far • W3C SSN Ontology • URI definition • Supporting technology • Some examples • Challenges being addressed currently
LSD over HTTP • Use HTTP as access protocol for streams, as HTTP supports streaming of data • Linked Data Streams use RDF as data encoding, and HTTP as access protocol • Open HTTP connection, and then serve RDF triples ad infinitum • Diagram: Client sends HTTP GET to Web Server; Web Server answers 200 OK followed by the STREAM
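The exchange on this slide can be tried out locally with the Python standard library alone. What follows is a minimal sketch under several assumptions: the triples and their `http://example.org/` IRIs are made up, the server sends one N-Triples statement per line, and it stops after three triples so that the demo terminates (a real stream would keep the connection open and keep writing).

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List

# Hypothetical triples, one N-Triples statement per line.
TRIPLES = [
    '<http://example.org/e1> <http://example.org/avgTemp> "25" .',
    '<http://example.org/e2> <http://example.org/avgTemp> "26" .',
    '<http://example.org/e3> <http://example.org/avgTemp> "24" .',
]


class StreamHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # 200 OK with no Content-Length: the body ends when the server closes.
        self.send_response(200)
        self.send_header("Content-Type", "application/n-triples")
        self.end_headers()
        for triple in TRIPLES:  # a live source would loop indefinitely here
            self.wfile.write((triple + "\n").encode("utf-8"))
            self.wfile.flush()  # push each triple to the client immediately

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep the demo quiet


def read_stream(host: str, port: int, limit: int) -> List[str]:
    """Issue a GET and read the body line by line, stopping after `limit` triples."""
    connection = HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", "/")
        response = connection.getresponse()
        received: List[str] = []
        for raw in response:  # the response object iterates over body lines
            received.append(raw.decode("utf-8").rstrip("\n"))
            if len(received) >= limit:
                break  # the client decides when to stop listening
        return received
    finally:
        connection.close()


def demo() -> List[str]:
    server = HTTPServer(("127.0.0.1", 0), StreamHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return read_stream("127.0.0.1", server.server_address[1], limit=3)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the triples in the order the server wrote them. Nothing beyond an ordinary HTTP client is needed on the consuming side, which is the point the slide makes about reusing standard tooling.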
Example • Source of stream http://events.play-project.eu/e1 • Data (stream of triples):
    e1:event a :avgTempEvent .
    e1:event :startTime "2011-01-29"^^xsd:date .
    e1:event :endTime "2011-01-31"^^xsd:date .
    loc:Nice :avgTemp [ rdf:value "25" ; :event e1:event ] .
    …
• Spec draft at http://km.aifb.kit.edu/sites/lodstream/
Pros and Cons • Use of standard HTTP servers and clients for streams • Simple access using a web browser or other HTTP client • Simple publication using a CGI script or Servlet • Fits with Linked Data principles, and allows for reuse of tools and best practices (e.g., provenance tracking) • Linking via use of URIs in the RDF stream • Potential overhead in using HTTP and RDF (lower-level protocols and data formats might be more efficient) • Javier D. Fernández, Miguel A. Martínez-Prieto, Claudio Gutierrez, and Axel Polleres. Binary RDF Representation for Publication and Exchange (HDT), W3C Member Submission 30 March 2011.