
Research Meeting

Research Meeting. 2009-07-30, Jaeseok Myung. SPARQL Processing with MR Framework. SPARQL Query: SELECT ?name ?mbox WHERE { ?x foaf:name ?name . ?x foaf:mbox ?mbox . FILTER (regex(?name, "^Tim") && regex(?mbox, "w3c")) } ORDER BY ?name LIMIT 5. MR Framework. Result.


Presentation Transcript


  1. Research Meeting 2009-07-30 Jaeseok Myung

  2. SPARQL Processing with MR Framework
     Flow: SPARQL Query => MR Framework => Result
     SPARQL Query:
       SELECT ?name ?mbox
       WHERE {
         ?x foaf:name ?name .
         ?x foaf:mbox ?mbox .
         FILTER (regex(?name, "^Tim") && regex(?mbox, "w3c"))
       }
       ORDER BY ?name
       LIMIT 5
     SPARQL Algebra:
       (slice _ 5
         (project (?name ?mbox)
           (order (?name)
             (filter (&& (regex ?name "^Tim") (regex ?mbox "w3c"))
               (bgp
                 (triple ?x <http://xmlns.com/foaf/0.1/name> ?name)
                 (triple ?x <http://xmlns.com/foaf/0.1/mbox> ?mbox))))))
     MR jobs shown next to the algebra operators: MR_SLI (slice), MR_PRJ (project), MR_ORD (order), MR_FIL (filter), MR_BGP (bgp)
     Center for E-Business Technology

  3. SPARQL Algebra
     Query, annotated with the algebra operator each clause maps to:
       SELECT ?name ?mbox                                      (Projection)
       WHERE {
         ?x foaf:name ?name .                                  (BGP)
         ?x foaf:mbox ?mbox .                                  (BGP)
         FILTER (regex(?name, "^Tim") && regex(?mbox, "w3c"))  (Filter)
       }
       ORDER BY ?name                                          (OrderBy)
       LIMIT 5                                                 (Slice)
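The operator chain on this slide can be illustrated on a single machine before bringing MapReduce into it. The following is a minimal sketch, not the presenter's implementation: it assumes the BGP solutions are already available as in-memory maps from variable name to value, and the class name `AlgebraPipelineSketch` and method `evaluate` are hypothetical. It applies Filter, OrderBy, Projection, and Slice in the same inside-out order as the algebra expression.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class AlgebraPipelineSketch {
    // Hypothetical helper: takes BGP solutions (variable name -> value) and
    // applies filter, order, project, and slice, in that order.
    static List<Map<String, String>> evaluate(List<Map<String, String>> bgpSolutions) {
        Pattern nameRegex = Pattern.compile("^Tim");
        Pattern mboxRegex = Pattern.compile("w3c");
        return bgpSolutions.stream()
            // Filter: both regex conditions must hold (find() matches anywhere).
            .filter(s -> nameRegex.matcher(s.get("name")).find()
                      && mboxRegex.matcher(s.get("mbox")).find())
            // OrderBy: ascending by ?name.
            .sorted(Comparator.comparing((Map<String, String> s) -> s.get("name")))
            // Projection: keep only ?name and ?mbox.
            .map(s -> {
                Map<String, String> row = new LinkedHashMap<>();
                row.put("name", s.get("name"));
                row.put("mbox", s.get("mbox"));
                return row;
            })
            // Slice: LIMIT 5.
            .limit(5)
            .collect(Collectors.toList());
    }
}
```

Each stage consumes only the output of the previous one, which is the property that makes a one-job-per-operator mapping plausible.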

  4. Basic Graph Pattern
     Query:
       SELECT ?x ?y ?z
       WHERE {
         ?x type GraduateStudent .
         ?y type University .
         ?z type Department .
         ?x memberOf ?z .
         ?z subOrganizationOf ?y .
         ?x undergraduateDegreeFrom ?y .
       }
     Algebra:
       (project (?x ?y ?z)
         (bgp
           (triple ?x <type> <GraduateStudent>)
           (triple ?y <type> <University>)
           (triple ?z <type> <Department>)
           (triple ?x <memberOf> ?z)
           (triple ?z <subOrganizationOf> ?y)
           (triple ?x <undergraduateDegreeFrom> ?y)))
     Dependency between TPs
     [Figure: dependency graph whose nodes are the triple patterns 1-6 (numbered in the order listed) and whose edges are labeled with the shared variable ?x, ?y, or ?z.]
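The dependency between triple patterns can be derived mechanically: two patterns depend on each other when they share a variable. The sketch below is one possible way to compute those edges, assuming patterns are given as three-element string arrays and that terms starting with `?` are variables. The class `TpDependencySketch` and its methods are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TpDependencySketch {
    // Collects the variables (terms starting with '?') of one triple pattern.
    static Set<String> variables(String[] pattern) {
        Set<String> vars = new TreeSet<>();
        for (String term : pattern) {
            if (term.startsWith("?")) vars.add(term);
        }
        return vars;
    }

    // Returns edges as "i-j:?v" for every pair of patterns (1-based) sharing ?v.
    static List<String> edges(List<String[]> patterns) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < patterns.size(); i++) {
            for (int j = i + 1; j < patterns.size(); j++) {
                Set<String> shared = variables(patterns.get(i));
                shared.retainAll(variables(patterns.get(j)));
                for (String v : shared) result.add((i + 1) + "-" + (j + 1) + ":" + v);
            }
        }
        return result;
    }

    // The six triple patterns from the slide, in query order.
    static List<String[]> slidePatterns() {
        return List.of(
            new String[] {"?x", "type", "GraduateStudent"},
            new String[] {"?y", "type", "University"},
            new String[] {"?z", "type", "Department"},
            new String[] {"?x", "memberOf", "?z"},
            new String[] {"?z", "subOrganizationOf", "?y"},
            new String[] {"?x", "undergraduateDegreeFrom", "?y"});
    }
}
```

Calling `TpDependencySketch.edges(TpDependencySketch.slidePatterns())` yields one edge per pair of patterns that share a variable, three each for ?x, ?y, and ?z.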

  5. Basic Graph Pattern
     (Query, algebra expression, and dependency graph as on slide 4.)
     Optimization => Dependency => Selectivity Estimation => Ordering

  6. Basic Graph Pattern - MR
     (Query, algebra expression, and dependency graph as on slide 4.)
     MapReduce => Parallel & Distributed Processing

  7. MapReduce
     • Distributed Processing Framework
       • Proposed for parallel processing of large data sets
     • Processing Flow
       • Map(k, v) -> list(k', v')
       • Reduce(k', list(v')) -> list(v'')
     • WordCount MapReduce
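The WordCount example named on this slide can be sketched without a cluster. The code below is a single-process stand-in, not Hadoop code: `map` and `reduce` follow the two signatures from the processing flow, while the hypothetical `run` method plays the role of the framework by grouping map output by key before calling `reduce`.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {
    // Map(k, v) -> list(k', v'): k is the line offset (unused), v the line text.
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) out.add(new SimpleEntry<>(word, 1));
        }
        return out;
    }

    // Reduce(k', list(v')) -> list(v''): sums the counts for one word.
    static List<Integer> reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) sum += c;
        return List.of(sum);
    }

    // Stand-in for the framework: runs map, groups by key (shuffle), runs reduce.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        long offset = 0;
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(offset, line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
            offset += line.length() + 1;
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> g : grouped.entrySet()) {
            result.put(g.getKey(), reduce(g.getKey(), g.getValue()).get(0));
        }
        return result;
    }
}
```

In a real deployment the grouping step is done by the framework across machines; only `map` and `reduce` are written by the user.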

  8. MR for BGP
     // Output types are Text/Text because the map emits (variable, triple) pairs.
     // Triple and matchingVariables(...) are hypothetical helpers, not Hadoop classes.
     public void map(LongWritable key, Text val, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
         Triple triple = Triple.parse(val.toString());
         // If the triple satisfies the s, p, o of a triple pattern, store it as (var, triple).
         for (String var : matchingVariables(triple)) {
             output.collect(new Text(var), val);
         }
     }
     public void reduce(Text key, Iterator<Text> vals, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
         // ???? (left open on the slide)
     }
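The matching test inside the map step can be made concrete in plain Java. This is a sketch under stated assumptions rather than the presenter's code: triples and patterns are three-element string arrays, terms starting with `?` are variables, and the class `BgpMapSketch` and its methods are hypothetical. As on the slide, each matching triple is emitted once per variable of the pattern it satisfies, as a (variable, triple) pair.

```java
import java.util.ArrayList;
import java.util.List;

class BgpMapSketch {
    // A triple satisfies a pattern when every constant term is equal;
    // variable terms (starting with '?') match anything.
    static boolean matches(String[] pattern, String[] triple) {
        for (int i = 0; i < 3; i++) {
            if (!pattern[i].startsWith("?") && !pattern[i].equals(triple[i])) {
                return false;
            }
        }
        return true;
    }

    // Emits one {variable, triple} pair per variable of each matching pattern.
    static List<String[]> map(String[] triple, List<String[]> patterns) {
        List<String[]> out = new ArrayList<>();
        for (String[] pattern : patterns) {
            if (!matches(pattern, triple)) continue;
            for (String term : pattern) {
                if (term.startsWith("?")) {
                    out.add(new String[] {term, String.join(" ", triple)});
                }
            }
        }
        return out;
    }
}
```

For example, the triple `alice memberOf dept1` checked against the pattern `?x memberOf ?z` would be emitted twice, once under `?x` and once under `?z`.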

  9. Basic Graph Pattern - MR
     [Figure: Query => Algebra => Result; MR_BGP appears twice, each time drawn with the dependency graph of triple patterns 1-6, on top of a Distributed File System.]

  10. SPARQL Processing with MR Framework
     • MR processing of SPARQL Algebra looks very promising, as the BGP example shows
     • ToDo
       • Implementation
       • Experiment plan
     • How to implement?
       • There are open issues to consider when implementing this with MR
       • In M (map), split the triples by variable
       • In R (reduce), perform the join
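The last two bullets describe what is usually called a reduce-side join. The slides leave the details open, so the following is only one possible reading, sketched in a single process: it joins two of the earlier patterns, `?x type GraduateStudent` (TP1) and `?x memberOf ?z` (TP4), on ?x. Using the value bound to ?x as the grouping key is an assumption made here, and the class `ReduceSideJoinSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ReduceSideJoinSketch {
    static List<String> join(List<String[]> triples) {
        // Map phase: key each matching triple by its ?x value, tagged by pattern.
        Map<String, List<String[]>> groups = new TreeMap<>();
        for (String[] t : triples) {
            if (t[1].equals("type") && t[2].equals("GraduateStudent")) {
                groups.computeIfAbsent(t[0], k -> new ArrayList<>())
                      .add(new String[] {"TP1", ""});
            } else if (t[1].equals("memberOf")) {
                groups.computeIfAbsent(t[0], k -> new ArrayList<>())
                      .add(new String[] {"TP4", t[2]});
            }
        }
        // Reduce phase: within one ?x group, combine TP1 records with TP4 records.
        List<String> results = new ArrayList<>();
        for (Map.Entry<String, List<String[]>> group : groups.entrySet()) {
            boolean hasTp1 = false;
            for (String[] record : group.getValue()) {
                if (record[0].equals("TP1")) hasTp1 = true;
            }
            if (!hasTp1) continue;
            for (String[] record : group.getValue()) {
                if (record[0].equals("TP4")) {
                    results.add("?x=" + group.getKey() + " ?z=" + record[1]);
                }
            }
        }
        return results;
    }
}
```

Joining all six patterns would need either several such rounds or a composite key, which is presumably among the issues the slide says still need thought.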
