SEMAGROW: Using a POWDER Triple Store for boosting the real-time performance of global agricultural data infrastructures
Pythagoras Karampiperis
National Centre for Scientific Research “Demokritos”
KREAM 2013

Outline
• Introduction / Problem Statement
• The SemaGrow Solution
• The POWDER W3C Recommendation
• SemaGrow Architecture
• The SemaGrow Stack
• SemaGrow Maintenance Components

Moving Forward with “Old” Technologies
• How many?
• Is it feasible?
• Big Data problem!

What Semantic Web can bring into the picture
• One Data Access Point for the entire Data Cloud
  • Enabling Service-Data level agreements with Data providers
• Application-level Vocabularies / Thesauri / Ontologies
  • Enabling different application facets for different communities of users over the SAME data pool
• Going beyond existing Distributed Triple Store Implementations
  • Link Heterogeneous but Semantically Connected Data
  • Index Extremely Large Information Volumes (Peta Sizes)
  • Improve Information Retrieval response
• Data (+Metadata) physically stored in Data Provider
  • No need for harvesting
• Vocabularies / Thesauri / Ontologies of Data Provider choice
  • No need for aligning according to common schemas

The SemaGrow Solution
• Use POWDER to mass-annotate large subspaces of the URI space
• Exploit naming convention regularities to compress the indexes used by the system
• Partition triple patterns in the original query
• Annotate each fragment with an ordered list of data sources most likely to contain relevant data
• Distribute and transform the query fragments
• Collect and align the results

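The steps on this slide (partition, annotate with ranked sources, distribute, collect) can be illustrated with a minimal Python sketch. It is not the SemaGrow implementation: the in-memory `SOURCES` dictionary, the `ex:` identifiers, and the ranking by match count are all assumptions made for illustration, standing in for remote endpoints and for the compressed index.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical stand-in for remote data sources: name -> set of triples.
SOURCES: Dict[str, Set[Triple]] = {
    "src_a": {("ex:wheat", "ex:grownIn", "ex:Greece")},
    "src_b": {("ex:Greece", "ex:partOf", "ex:Europe"),
              ("ex:wheat", "ex:grownIn", "ex:Italy")},
}

def is_var(term: str) -> bool:
    return term.startswith("?")

def matches(pattern: Triple, triple: Triple) -> bool:
    """A triple matches when every non-variable term is identical."""
    return all(is_var(p) or p == t for p, t in zip(pattern, triple))

def rank_sources(pattern: Triple) -> List[str]:
    """Order sources by number of matching triples (assumed ranking criterion)."""
    counts = {name: sum(matches(pattern, t) for t in triples)
              for name, triples in SOURCES.items()}
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    return [name for name in ranked if counts[name] > 0]

def extend(binding: Binding, pattern: Triple, triple: Triple) -> Optional[Binding]:
    """Extend a partial solution with one triple, or return None on conflict."""
    out = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if out.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return out

def evaluate(query: List[Triple]) -> List[Binding]:
    """Treat each triple pattern as one fragment, fetch it from its ranked
    sources, and join the fetched triples with the solutions so far."""
    solutions: List[Binding] = [{}]
    for pattern in query:
        fetched = [t for s in rank_sources(pattern)
                   for t in SOURCES[s] if matches(pattern, t)]
        solutions = [b for sol in solutions for t in fetched
                     for b in [extend(sol, pattern, t)] if b is not None]
    return solutions

query = [("ex:wheat", "ex:grownIn", "?c"), ("?c", "ex:partOf", "?r")]
print(evaluate(query))
```

In this toy setting the first pattern is answered by both sources and the second only by `src_b`; the join keeps the single binding where the country is also known to be part of a region.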
The POWDER W3C Recommendation
• Exploits natural groupings of URIs to annotate all resources in a subset of the URI space
  • Regular expression based grouping
• Allows properties and their values to be associated with an arbitrary number of subjects within a fully-defined semantic framework
• POWDER Description Resources: http://www.w3.org/TR/powder-dr/
• POWDER Formal Semantics: http://www.w3.org/TR/powder-formal/

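The idea of regular-expression-based grouping can be sketched in a few lines of Python. This is only an illustration of the principle, not POWDER syntax or a POWDER processor: the `DESCRIPTION_RESOURCES` table, the `example.org` URIs, and the `ex:` property names are hypothetical.

```python
import re
from typing import Dict, List, Pattern, Tuple

# Hypothetical description resources: each pairs a regular expression over
# URIs with the properties asserted for every URI that matches it.
DESCRIPTION_RESOURCES: List[Tuple[Pattern[str], Dict[str, str]]] = [
    (re.compile(r"^http://example\.org/crops/.*$"),
     {"ex:topic": "ex:CropScience"}),
    (re.compile(r"^http://example\.org/soil/\d{4}/.*$"),
     {"ex:topic": "ex:SoilScience"}),
]

def describe(uri: str) -> Dict[str, str]:
    """Collect the properties of every description resource whose pattern matches."""
    properties: Dict[str, str] = {}
    for pattern, attrs in DESCRIPTION_RESOURCES:
        if pattern.match(uri):
            properties.update(attrs)
    return properties

print(describe("http://example.org/crops/wheat/123"))
print(describe("http://example.org/soil/2012/sample-7"))
print(describe("http://example.org/other"))
```

Two patterns annotate every URI under the two prefixes, however many there are, which is the sense in which the annotations are "mass" annotations and the index stays small.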
The SemaGrow Stack
• Integrates the components in order to offer a single SPARQL endpoint that federates a number of heterogeneous data sources
• Targets the federation of independently provided data sources

SemaGrow Architecture
• Resource Discovery
• Query Decomposition
• Federated Endpoint Wrapper
• Data Summaries Endpoint

Query Decomposition
• Analyses SPARQL queries
• Decides on the optimal way to create query fragments to be dispatched to sources’ endpoints
• Components
  • Query Decomposition: suggests possible decompositions
  • Selector: evaluates these suggestions based on information and predictions from the Resource Discovery Component

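The split between suggesting decompositions and selecting one can be sketched as follows. The two candidate strategies, the `sources_for` lookup, and the flat per-fragment cost are assumptions chosen for illustration; the slides do not specify which decompositions are proposed or how the Selector scores them.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]
Fragment = Tuple[str, Tuple[Triple, ...]]   # (source name, patterns sent to it)
Decomposition = List[Fragment]

def suggest(patterns: List[Triple],
            sources_for: Dict[Triple, List[str]]) -> List[Decomposition]:
    """Propose two candidate decompositions (hypothetical strategies)."""
    # Candidate 1: one fragment per (pattern, candidate source) pair.
    per_pattern = [(s, (p,)) for p in patterns for s in sources_for[p]]
    # Candidate 2: patterns that only one source can answer are grouped
    # into a single fragment for that source.
    grouped: Dict[str, List[Triple]] = {}
    rest: Decomposition = []
    for p in patterns:
        sources = sources_for[p]
        if len(sources) == 1:
            grouped.setdefault(sources[0], []).append(p)
        else:
            rest.extend((s, (p,)) for s in sources)
    exclusive = [(s, tuple(ps)) for s, ps in grouped.items()] + rest
    return [per_pattern, exclusive]

def select(candidates: List[Decomposition],
           cost: Callable[[Fragment], float]) -> Decomposition:
    """Pick the decomposition with the lowest total estimated cost."""
    return min(candidates, key=lambda d: sum(cost(f) for f in d))

p1 = ("?x", "ex:grownIn", "?c")
p2 = ("?x", "ex:yield", "?y")
p3 = ("?c", "ex:partOf", "?r")
sources_for = {p1: ["A"], p2: ["A"], p3: ["A", "B"]}

# Assumed cost model: every dispatched fragment costs one request.
best = select(suggest([p1, p2, p3], sources_for), cost=lambda fragment: 1.0)
print(best)
```

With a per-request cost, the grouped candidate wins because it sends three fragments instead of four. A cost function fed by Resource Discovery predictions would slot into the same `cost` parameter.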
Resource Discovery
• Provides an annotated list of candidate data sources that (possibly) hold triples matching a query pattern
• Sources are annotated with additional information
  • Schema-level metadata
  • Instance-level metadata
  • Predicted Response Volume
  • Run-time information about current source load
  • Semantic proximity of source and query schemas

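One way to picture the annotated candidate list is as a set of records that can be ordered by a score. The sketch below is an assumption throughout: the `Candidate` fields mirror three of the annotations named on the slide, but the scoring formula (normalised volume plus proximity minus load) is invented for illustration and is not taken from SemaGrow.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Candidate:
    source: str
    predicted_volume: int   # predicted number of matching triples
    load: float             # current source load, assumed in [0, 1]
    proximity: float        # semantic proximity to the query schema, assumed in [0, 1]

def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Drop sources predicted to return nothing, then order the rest by score."""
    viable = [c for c in candidates if c.predicted_volume > 0]
    if not viable:
        return []
    max_volume = max(c.predicted_volume for c in viable)

    def score(c: Candidate) -> float:
        # Hypothetical formula: reward volume and proximity, penalise load.
        return c.predicted_volume / max_volume + c.proximity - c.load

    return sorted(viable, key=score, reverse=True)

candidates = [
    Candidate("A", predicted_volume=1000, load=0.9, proximity=0.8),
    Candidate("B", predicted_volume=500, load=0.1, proximity=0.7),
    Candidate("C", predicted_volume=0, load=0.0, proximity=1.0),
]
print([c.source for c in rank(candidates)])
```

Here the lightly loaded source `B` is ranked ahead of the larger but busy source `A`, and `C` is dropped because it is predicted to hold no matching triples.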
Data Summaries Endpoint
• Serves metadata about the schema and instances of the various federated data stores
• Receives entity URIs
• Returns the repositories where these entities are located (either at the schema or instance level)
• Returns ontology alignment knowledge regarding entity equivalence between different sources

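The two lookups described here (where an entity lives, and what it is equivalent to) can be sketched with plain dictionaries. The `LOCATIONS` and `ALIGNMENTS` tables, the repository names, and the function names are hypothetical; the actual endpoint interface is not described in the slides.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical summary data: entity URI -> repositories holding it,
# separated into schema-level and instance-level occurrences.
LOCATIONS: Dict[str, Dict[str, Set[str]]] = {
    "http://example.org/a/Crop": {"schema": {"repo_a"}, "instance": set()},
    "http://example.org/a/wheat": {"schema": set(), "instance": {"repo_a", "repo_b"}},
}

# Hypothetical alignment knowledge: pairs of equivalent entities.
ALIGNMENTS: List[Tuple[str, str]] = [
    ("http://example.org/a/Crop", "http://example.org/b/Plant"),
]

def repositories(uri: str) -> Dict[str, List[str]]:
    """Return the repositories holding the entity, per level."""
    entry = LOCATIONS.get(uri, {})
    return {level: sorted(entry.get(level, set()))
            for level in ("schema", "instance")}

def equivalents(uri: str) -> Set[str]:
    """Return entities directly aligned with the given one (symmetric lookup)."""
    found: Set[str] = set()
    for left, right in ALIGNMENTS:
        if left == uri:
            found.add(right)
        elif right == uri:
            found.add(left)
    return found

print(repositories("http://example.org/a/wheat"))
print(equivalents("http://example.org/b/Plant"))
```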
Federated Endpoint Wrapper
• Manages the communication with external data sources federated by the SemaGrow Stack
• Query Manager
  • Calls the Query Transformation Service when necessary
  • Forwards query fragments to the Query Results Merger
  • Collects and forwards run-time statistics to the Resource Discovery Component
• Query Results Merger
  • Pay-as-you-go behaviour
  • Provides first approximations and iteratively refines them if more computational resources are warranted by the reactivity parameters
• Query Transformation Service
  • Accesses the Schema Mappings Repository
  • Rewrites query fragments from the original query schema to that of the data source that will be used for the fragment
  • Rewrites query results from the source schema to the query schema

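The two rewriting directions of the Query Transformation Service can be sketched as a term-by-term substitution. The `MAPPINGS` dictionary is a hypothetical stand-in for the Schema Mappings Repository, the `q:` and `a:` prefixes are invented, and the sketch assumes one-to-one term mappings so that results can be translated back by inverting the table.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Row = Dict[str, str]

# Hypothetical schema mappings: per source, query-schema term -> source-schema term.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "src_a": {"q:cultivatedIn": "a:grownIn", "q:Crop": "a:Plant"},
}

def rewrite_fragment(fragment: List[Triple], source: str) -> List[Triple]:
    """Rewrite a query fragment from the query schema to the source schema."""
    mapping = MAPPINGS.get(source, {})
    return [(mapping.get(s, s), mapping.get(p, p), mapping.get(o, o))
            for s, p, o in fragment]

def rewrite_results(rows: List[Row], source: str) -> List[Row]:
    """Rewrite result bindings from the source schema back to the query schema.
    Assumes the mapping is one-to-one, so it can simply be inverted."""
    inverse = {target: term for term, target in MAPPINGS.get(source, {}).items()}
    return [{var: inverse.get(value, value) for var, value in row.items()}
            for row in rows]

fragment = [("?x", "rdf:type", "q:Crop"), ("?x", "q:cultivatedIn", "?c")]
print(rewrite_fragment(fragment, "src_a"))
print(rewrite_results([{"?t": "a:Plant", "?c": "ex:Greece"}], "src_a"))
```

Terms without a mapping, including variables, pass through unchanged, so a source that already uses the query schema needs no entry in the table.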
Maintenance Components
• Authoring Tool
  • Visual tool for assisting data providers
  • Construction of POWDER statements
  • Provenance and cataloguing metadata
• Ontology Alignment Tool
  • Semi-automatic (with human intervention) alignment of Semantic Vocabularies used by data providers and consumers
• Content Classification and Ontology Evolution
  • Refine coarsely annotated data to a level of detail where they can be more accurately aligned with other schemas within the federation

Project info
• SemaGrow: Data intensive techniques to boost the real-time performance of global agricultural data infrastructures
• FP7-ICT-2011.4.4 (Intelligent Information Management)

Thank You!
Dr. Pythagoras P. Karampiperis (pythk@iit.demokritos.gr)
Institute of Informatics & Telecommunications (IIT), NCSR “Demokritos” (NCSR)
www.semagrow.eu