
Presented by Jiwen Sun, Lihui Zhao 24/3/2004

Schema Mediation in Peer Data Management Systems (Alon Y. Halevy, Zachary G. Ives, Dan Suciu, Igor Tatarinov)


Presentation Transcript


  1. Schema Mediation in Peer Data Management Systems (Alon Y. Halevy, Zachary G. Ives, Dan Suciu, Igor Tatarinov) Presented by Jiwen Sun, Lihui Zhao 24/3/2004

  2. Introduction • Why Peer to Peer • Integration with Semantics • Flexibility, extensibility • The paper’s Contribution • Piazza Project • A peer mapping language • Algorithm for query answering

  3. Introduction • Traditional Integration Formalisms • Global as View (GAV): the mediated schema is defined as views over the data sources • Local as View (LAV): the data sources are described as views over the mediated schema • GAV: T :- S1, S2, S3 • LAV: S1 ⊆ T • [Figure: mediated schema T over sources S1, S2, S3]
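
The GAV/LAV contrast on this slide can be illustrated with a minimal Python sketch. The toy tuples, the two-column shape of the relations, and the helper names `gav_T` and `lav_consistent` are assumptions made for illustration; they do not come from the paper. Under GAV the mediated relation is computed from the sources, while under LAV a source is only constrained to be contained in a view over the mediated relation.

```python
# Toy source relations (hypothetical data, not from the paper).
S1 = {("alice", "e1"), ("bob", "e1")}
S2 = {("e1", "station9")}


def gav_T(s1, s2):
    """GAV: T(x, y) :- S1(x, z), S2(z, y). T is computed from the sources."""
    return {(x, y) for (x, z) in s1 for (z2, y) in s2 if z == z2}


def lav_consistent(source, view_over_T, T):
    """LAV: source ⊆ view(T). Check a candidate T against a source description."""
    return source <= view_over_T(T)


def first_col(t):
    """A view over T: projection onto the first column."""
    return {(x,) for (x, _) in t}


T = gav_T(S1, S2)
S3 = {("alice",)}  # hypothetical source described as S3 ⊆ first_col(T)
print(sorted(T))
print(lav_consistent(S3, first_col, T))
```

In the GAV direction adding a source means editing the definition of T; in the LAV direction a new source only adds one more containment to check.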

  4. Introduction • GAV and LAV in Piazza (P2P) environment • Define semantic relations locally • Answer queries globally

  5. Introduction • Properties of a peer • [Figure: a peer with its peer relations and stored relations; labels: peer description, storage description]

  6. Introduction • Emergency Response Example

  7. Problem Definition • PPL – Peer-Programming Language • Storage Description: mappings between stored relations and peer relations, A : R ⊆ Q (or A : R = Q) • Peer Mapping: mappings between peer relations • Inclusion: Q1(A1) ⊆ Q2(A2) • Definitional: P(x) :- P1(x)

  8. Problem Definition • GAV-like Definition – Definition in Datalog • 9DC:SkilledPerson(PID, “Doctor”) :- H:Doctor(SID, h, l, s, e) • 9DC:SkilledPerson(PID, “EMT”) :- H:EMT(SID, h, vid, s, e) • 9DC:SkilledPerson(PID, “EMT”) :- FS:Schedule(PID, vid), FS:FirstResponse(vid, s, l, d), FS:Skills(PID, “medical”)
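
As a rough illustration of how these three definitional rules populate 9DC:SkilledPerson, the sketch below evaluates them over toy tuples in Python. All data values are invented, and the sketch assumes that the staff ID (SID) in the hospital relations can serve as the person ID (PID) in the rule head, which the slide does not state.

```python
# Hypothetical stored tuples for peers H (hospital) and FS (fire services).
H_Doctor = [("d1", "h1", "loc", "08:00", "16:00")]   # (SID, h, l, s, e)
H_EMT = [("m1", "h1", "v7", "08:00", "16:00")]       # (SID, h, vid, s, e)
FS_Schedule = [("p9", "v3"), ("p8", "v4")]           # (PID, vid)
FS_FirstResponse = [("v3", "st1", "loc", "dest")]    # (vid, s, l, d)
FS_Skills = [("p9", "medical"), ("p8", "driving")]   # (PID, skill)


def skilled_person():
    """Union of the three Datalog rules defining 9DC:SkilledPerson."""
    out = set()
    # Rule 1: every hospital doctor is a skilled person with skill "Doctor".
    for (sid, *_rest) in H_Doctor:
        out.add((sid, "Doctor"))  # assumption: SID doubles as PID
    # Rule 2: every hospital EMT is a skilled person with skill "EMT".
    for (sid, *_rest) in H_EMT:
        out.add((sid, "EMT"))     # assumption: SID doubles as PID
    # Rule 3: join Schedule, FirstResponse and Skills on vid and PID.
    first_response_vids = {vid for (vid, *_rest) in FS_FirstResponse}
    medical = {pid for (pid, skill) in FS_Skills if skill == "medical"}
    for (pid, vid) in FS_Schedule:
        if vid in first_response_vids and pid in medical:
            out.add((pid, "EMT"))
    return out


print(sorted(skilled_person()))
```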

  9. Problem Definition • LAV-like Definition – Inclusion Definition • LH:CritBed(bed, hosp, room, PID, status) ⊆ H:CritBed(bed, hosp, room), H:Patient(PID, bed, status) • LH:EmergBed(bed, hosp, room, PID, status) ⊆ H:EmergBed(bed, hosp, room), H:Patient(PID, bed, status)
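
One way to read such an inclusion is through inverse rules: every tuple in LH:CritBed implies that matching tuples exist in H:CritBed and H:Patient. The Python sketch below shows that reading on invented data. It is a swapped-in illustration of the inverse-rules idea, not the reformulation algorithm presented in the paper, and the function name `invert_critbed` is hypothetical.

```python
def invert_critbed(lh_critbed):
    """Derive the H tuples implied by
    LH:CritBed(bed, hosp, room, PID, status)
        ⊆ H:CritBed(bed, hosp, room), H:Patient(PID, bed, status).

    Every variable on the right-hand side also appears on the left,
    so no placeholder (Skolem) values are needed in this case.
    """
    h_critbed, h_patient = set(), set()
    for bed, hosp, room, pid, status in lh_critbed:
        h_critbed.add((bed, hosp, room))
        h_patient.add((pid, bed, status))
    return h_critbed, h_patient


# Hypothetical tuple stored at the local hospital peer LH.
lh = {("b1", "h1", "r12", "p7", "stable")}
print(invert_critbed(lh))
```

Because the mapping is an inclusion, the derived tuples are a lower bound on what H contains; H may hold further beds and patients that LH knows nothing about.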

  10. Problem Definition • Query Answering in a PDMS • A peer answers a query with locally stored data • The query is reformulated and forwarded to neighbouring peers • The query is answered by chaining through mapped peers • Mappings are expanded using a rule-goal tree

  11. Complexity of Query Answering • Restrictions on peer mappings determine complexity • In general, finding all certain answers is undecidable • Acyclic Peer Mappings • Only inclusion mappings are used => polynomial time • Cyclic Peer Mappings • Replication-type cycles only => polynomial time • Comparison predicates also affect complexity

  12. Query Reformulation Algorithm • Algorithm Overview • Build a rule-goal tree • Expand the tree by combining and interleaving GAV and LAV expansions • Leaves that are stored relations, reached through storage descriptions, form the reformulated query • Tree size may be huge

  13. Building a rule-goal tree • 1. Make the query the root of the tree • 2. Find views that cover the query and expand the query using the views • Q(f1,f2) :- SameEngine(f1,f2,e), Skill(f1,s), Skill(f2,s) • [Tree: Q(f1,f2) → q → SameEngine(f1,f2,e), Skill(f1,s), Skill(f2,s)]

  14. Building a rule-goal tree • 3. Expand using the mappings between peer schemas • r0: SameEngine(f1, f2, e) :- AssignedTo(f1, e), AssignedTo(f2, e) • r1: SameSkill(f1, f2) ⊆ Skill(f1, s), Skill(f2, s) • [Tree: Q(f1,f2) → q → SameEngine(f1,f2,e), Skill(f1,s), Skill(f2,s); r0 expands SameEngine(f1,f2,e) into AssignedTo(f1,e), AssignedTo(f2,e); r1 covers the Skill goals with SameSkill(f1,f2) and SameSkill(f2,f1)]

  15. Building a rule-goal tree • 4. Repeat until all leaves are stored relations • Reformulated query: Q’(f1,f2) :- S1(f1,e,_), S1(f2,e,_), S2(f1,f2) ∪ S1(f1,e,_), S1(f2,e,_), S2(f2,f1) • [Tree: the tree from the previous step extended by r2 and r3, with leaves S1(f1,e,_), S1(f2,e,_), S2(f1,f2), S2(f2,f1)]
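
The four steps above can be approximated by a heavily simplified Python sketch. It treats atoms as literal strings (no unification), lists each applicable expansion by hand, and assumes the shape of the storage steps that lead to S1 and S2, since their bodies are not given on the slides. The names `EXPANSIONS`, `is_stored` and `reformulate` are hypothetical. It is meant only to show how GAV-style steps (one goal replaced by a body) and LAV-style steps (several goals covered by one atom) interleave until only stored relations remain.

```python
# Each entry: (goals covered by the step, atoms that replace them).
EXPANSIONS = [
    # q: the query itself.
    (("Q(f1,f2)",), ("SameEngine(f1,f2,e)", "Skill(f1,s)", "Skill(f2,s)")),
    # r0 (definitional, GAV-style): one goal is replaced by the rule body.
    (("SameEngine(f1,f2,e)",), ("AssignedTo(f1,e)", "AssignedTo(f2,e)")),
    # r1 (inclusion, LAV-style): one atom covers both Skill goals,
    # once for each argument order.
    (("Skill(f1,s)", "Skill(f2,s)"), ("SameSkill(f1,f2)",)),
    (("Skill(f1,s)", "Skill(f2,s)"), ("SameSkill(f2,f1)",)),
    # Storage steps (assumed shape): peer relations mapped to stored ones.
    (("AssignedTo(f1,e)",), ("S1(f1,e,_)",)),
    (("AssignedTo(f2,e)",), ("S1(f2,e,_)",)),
    (("SameSkill(f1,f2)",), ("S2(f1,f2)",)),
    (("SameSkill(f2,f1)",), ("S2(f2,f1)",)),
]


def is_stored(atom):
    """Stored relations in this toy example are S1 and S2."""
    return atom.startswith("S1(") or atom.startswith("S2(")


def reformulate(goals):
    """Return a set of conjunctions (frozensets of stored atoms).

    The set as a whole is read as a union of conjunctive queries.
    Terminates only because the toy expansions are acyclic.
    """
    goals = frozenset(goals)
    pending = sorted(a for a in goals if not is_stored(a))
    if not pending:
        return {goals}
    target = pending[0]
    results = set()
    for covered, replacement in EXPANSIONS:
        if target in covered and set(covered) <= goals:
            results |= reformulate((goals - set(covered)) | set(replacement))
    return results


for conjunction in sorted(sorted(c) for c in reformulate({"Q(f1,f2)"})):
    print(conjunction)
```

On this input the sketch yields the two conjunctions of the reformulated query Q’ shown on the slide. A goal with no applicable expansion simply produces no result for that branch.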

  16. Query Reformulation Algorithm • Optimizations • Techniques for Pruning Rule-goal Tree Branches • Memoization of nodes • Pruning nodes whose constraints contradict the query • Redundancy detection • Maximizing the techniques • The order in which the tree is built is important • Nodes are prioritized in the Piazza system
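
To give a feel for why memoizing nodes helps, here is a small Python sketch that caches the expansion of each goal with `functools.lru_cache`. The rule set is invented, and the sketch does not reproduce how Piazza actually memoizes nodes; it only shows that a goal reachable along several paths is expanded once.

```python
from functools import lru_cache

# Hypothetical acyclic definitional mappings: goal -> alternative bodies.
RULES = {
    "A": [("B", "C")],
    "B": [("D",)],
    "C": [("D",)],
    "D": [("S",)],  # "S" has no rule, so it is treated as a stored relation
}

calls = {"count": 0}  # counts real (non-cached) expansions


@lru_cache(maxsize=None)
def expand(goal):
    """Return the conjunctions of stored relations that answer `goal`."""
    calls["count"] += 1
    if goal not in RULES:
        return frozenset({frozenset({goal})})
    out = set()
    for body in RULES[goal]:
        partial = {frozenset()}
        for sub in body:
            # Combine every partial conjunction with every way to answer `sub`.
            partial = {p | q for p in partial for q in expand(sub)}
        out |= partial
    return frozenset(out)


print(expand("A"), calls["count"])
```

Here "D" is needed by both "B" and "C" but is expanded only once; without the cache the same subtree would be rebuilt for every path that reaches it.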

  17. Experiment • Bottleneck is finding rewritings from tree • Tree depth matters, not number of nodes

  18. Related Work • Answering Queries Using Views (AQUV) • Answering queries using views [Halevy] • MiniCon: A scalable algorithm for answering queries using views [Pottinger & Halevy] • PDMS vs. Database Federation • DB federation: mappings between stored relations • Loose relationship => scales better • Peers can play different roles • Chaining through peer mappings to locate data

  19. Summary • PDMS is superior to data integration systems • Ad-hoc, scalable • Decentralized • PPL describes mappings using GAV/LAV • A query reformulation algorithm produces practical results

  20. Thank You!
