
Toward Scalable Reasoning over Annotated RDF Data Using MapReduce

Toward Scalable Reasoning over Annotated RDF Data Using MapReduce. Chang Liu (1), Guilin Qi (2). (1) Shanghai Jiao Tong University; (2) Southeast University, China. Motivation: growing interest in representing additional information on top of RDF, such as time, uncertainty, trust, and provenance.


Presentation Transcript


  1. Toward Scalable Reasoning over Annotated RDF Data Using MapReduce • Chang Liu (1), Guilin Qi (2) • (1) Shanghai Jiao Tong University • (2) Southeast University, China

  2. Motivation • Growing interest in representing additional information on top of RDF • Time, uncertainty, trust, and provenance • => Annotated RDF • Large amounts of data • YAGO2 • Problem: large-scale reasoning

  3. Motivation (cont’d) • Recent work on scalable reasoning using MapReduce • WebPIE (ISWC ’09, ESWC ’10) • Fuzzy pD* (ISWC ’11) • Our idea • Large-scale annotated RDF reasoner using MapReduce

  4. Background: Annotated RDF • Syntax: • Deductive rules: • Subproperty, Subclass, Domain, Range, Generalization • Example: • Subproperty (a) • Zimmermann et al.: A general framework for representing, reasoning and querying with annotated Semantic Web data. Journal of Web Semantics 11, 72-95 (2012)

  5. Background: MapReduce

  6. Naïve Implementation • Subproperty (a) • [Figure: MapReduce dataflow. The annotated input triples (P, sp, Q) and (X, P, Y) pass through three mappers and three reducers, which output the annotated triple (X, Q, Y).]
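
The dataflow on this slide can be sketched in plain Python as a single simulated map, shuffle, and reduce pass. This is only an illustration, not the authors' implementation: the function names (`map_triple`, `reduce_key`, `run_job`) are hypothetical, and it assumes a fuzzy annotation domain in which annotations are numbers in [0, 1] and the two premises are combined with `min`.

```python
from collections import defaultdict

def map_triple(triple, annotation):
    """Key both kinds of triple on the property P so they meet in one reducer."""
    s, p, o = triple
    if p == "sp":
        # Schema triple (P, sp, Q): key on P, carry the super-property Q.
        yield s, ("schema", o, annotation)
    else:
        # Data triple (X, P, Y): key on P, carry subject and object.
        yield p, ("data", s, o, annotation)

def reduce_key(key, values):
    """Join schema and data values that share a key and derive (X, Q, Y)."""
    supers = [(v[1], v[2]) for v in values if v[0] == "schema"]
    facts = [(v[1], v[2], v[3]) for v in values if v[0] == "data"]
    for q, a in supers:
        for x, y, b in facts:
            # Assumed fuzzy domain: combine the two annotations with min.
            yield (x, q, y), min(a, b)

def run_job(annotated_triples):
    """Simulate map, shuffle (group by key), and reduce in one process."""
    groups = defaultdict(list)
    for triple, annotation in annotated_triples:
        for key, value in map_triple(triple, annotation):
            groups[key].append(value)
    derived = []
    for key, values in groups.items():
        derived.extend(reduce_key(key, values))
    return derived
```

Calling `run_job` on one schema triple and one matching data triple yields a single derived triple whose annotation is the smaller of the two inputs.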

  7. Challenges and solutions • Generalization Rule • Delete triples from the data set • Large data reconstruction cost • Solution • Only perform at the beginning and at the end • Combine Generalization Rule with other rules • E.g. when a reducer generates and , it generates instead.
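
A rough sketch of folding generalization into the reducer follows. The formulas on the slide did not survive extraction, so this assumes they describe the same triple derived twice with different annotations, replaced by one triple carrying their join. The function name is hypothetical, and the join is taken to be `max`, as in a fuzzy annotation domain.

```python
def reduce_with_generalization(derived):
    """Keep one annotation per triple instead of emitting every derivation.

    `derived` is an iterable of (triple, annotation) pairs produced inside
    one reducer. Assumed fuzzy domain: the join of two annotations is max.
    """
    best = {}
    for triple, annotation in derived:
        # Merge on the fly, so no later pass has to delete dominated triples.
        best[triple] = max(annotation, best.get(triple, 0.0))
    return best
```

Because the merge happens before anything is written out, the data set never contains two copies of a triple from the same reducer, which avoids the deletion and reconstruction cost the slide mentions.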

  8. Challenges and solutions (cont’d) • Unnecessary Derivation • E.g. • Wastes a lot of computation time • Solution • Incorporate the annotation into the mapped key • E.g. • Map to ((t1, p), (1, s, o, [1,2])) • Map to ((t3, p), (2, q, [3,4])) • They will not be grouped together!
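
One possible reading of the key design on this slide, sketched in Python. The slide does not say how t1 and t3 are obtained, so this assumes a temporal annotation domain with integer intervals and emits one key per time point the interval covers. The function names are hypothetical; the value tags 1 (data triple) and 2 (schema triple) mirror the slide's example.

```python
from collections import defaultdict

def map_annotated(triple, interval):
    """Emit ((time point, property), value) for each point in the interval."""
    s, p, o = triple
    lo, hi = interval
    for t in range(lo, hi + 1):  # assumption: closed integer intervals
        if p == "sp":
            # Schema triple (P, sp, Q): tag 2, keyed on the sub-property P.
            yield (t, s), (2, o, interval)
        else:
            # Data triple (S, P, O): tag 1, keyed on its property P.
            yield (t, p), (1, s, o, interval)

def shuffle(records):
    """Group mapped values by key, as the MapReduce shuffle phase would."""
    groups = defaultdict(list)
    for triple, interval in records:
        for key, value in map_annotated(triple, interval):
            groups[key].append(value)
    return dict(groups)
```

With the slide's intervals [1,2] and [3,4], no key receives both a data value and a schema value, so no reducer attempts the join. Intervals that overlap share at least one time point and therefore still meet; the same pair may then meet under several keys, which the generalization step would have to merge.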

  9. Challenges and solutions (cont’d) • Fixpoint Calculation • Subproperty/subclass rules require fixpoint iteration • Solution • Load subproperty/subclass schema triples into memory • Calculate the closure • Shortest path calculation • Floyd-Warshall style algorithm • … • “Shortest” path
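
The in-memory closure can be sketched as a Floyd-Warshall style loop in which "path length" is replaced by annotation operations. This is a minimal illustration under an assumed fuzzy domain: annotations along a chain are combined with `min`, and alternative chains are merged with `max`, so the "shortest" path is the one with the strongest weakest link. The function name and the edge representation are hypothetical.

```python
def annotated_closure(edges):
    """Close a subproperty (or subclass) hierarchy under transitivity.

    `edges` maps (sub, super) pairs to annotations in [0, 1].
    Returns a new dict containing the best annotation for every derivable pair.
    """
    nodes = sorted({n for pair in edges for n in pair})
    best = dict(edges)
    for k in nodes:              # intermediate node, as in Floyd-Warshall
        for i in nodes:
            if (i, k) not in best:
                continue
            for j in nodes:
                if (k, j) not in best:
                    continue
                # Annotation of the chain i -> k -> j (assumed: min).
                via = min(best[(i, k)], best[(k, j)])
                # Keep it only if it improves on what is known (assumed: max).
                if via > best.get((i, j), 0.0):
                    best[(i, j)] = via
    return best
```

Since the schema usually fits in memory, this runs once on each node before the data triples are processed, which replaces the repeated MapReduce jobs that a fixpoint iteration would otherwise need.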

  10. Experiment setup • Dataset • Fuzzified DBPedia core ontology • fpdLUBM 1000, 2000, 4000, 8000 • Cluster • 25 machines with 75 mapper/reducer slots • Liu et al.: Reasoning with Large Scale Ontologies in Fuzzy pD* Using MapReduce. Computational Intelligence Magazine, IEEE 7(2), 54-66 (2012)

  11. Experiment result - fuzzy DBPedia Dataset: fuzzifiedDBPedia core ontology Results:

  12. Experiment result – fpdLUBM Experimental results of FuzzyPD and WebPIE

  13. Experiment result– fpdLUBM (cont’d) Scalability over number of units

  14. Experiment result– fpdLUBM (cont’d) Scalability over number of units

  15. Experiment result– fpdLUBM (cont’d) Scalability over data volume

  16. Conclusion and Future work • We show how to design MapReduce algorithms to achieve scalable annotated RDFS reasoning • Several challenges along with solutions • Future work • More experiments on annotated RDFS ontologies • Annotated OWL 2 RL

  17. Q&A
