
D. Beneventano, S. Bergamaschi, F. Mandreoli Università degli Studi di Modena e Reggio Emilia

MOMIS Query Manager: prototype of a query manager for handling global queries. D. Beneventano, S. Bergamaschi, F. Mandreoli, Università degli Studi di Modena e Reggio Emilia. D2I: Integration, Warehousing and Mining of heterogeneous sources


Presentation Transcript


  1. MOMIS Query Manager: prototype of a query manager for handling global queries
     D. Beneventano, S. Bergamaschi, F. Mandreoli, Università degli Studi di Modena e Reggio Emilia
     D2I: Integration, Warehousing and Mining of heterogeneous sources
     Theme 1: Integration of data from heterogeneous sources
     Rome, 11 October 2002

  2. Example
     Local classes (relational):
     • L1(firstn, lastn, year, e_mail)
     • L2(name, e_mail, dept_code, s_code)
     INTEGRATION
     Global Class: G
     Global Class Schema: S(G) = (Name, E_mail, Year, Dept, Section)
     Local Class Schemata w.r.t. Global Class:
     • S(L1) = (Name, E_mail, Year)
     • S(L2) = (Name, E_mail, Dept, Section)

  3. Data cleaning and reconciliation
     • Integration at the extensional level
     • The data returned by the various sources need to be converted and reconciled
     • Interpretation and merging of the data provided by the sources
     Schema translation
     • (example: firstn and lastn to Name)
     Data conversion
     • (example: 'Rita' + 'Verde' to 'Rita Verde')
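The schema translation and data conversion steps on this slide can be sketched as two small mapping functions. This is a minimal illustration, not the MOMIS implementation: the sample rows are hypothetical, and the correspondences dept_code to Dept and s_code to Section are assumptions read off the schemata S(L1) and S(L2) on slide 2.

```python
# Hypothetical sample rows; only the attribute names come from the slides.
L1_ROWS = [{"firstn": "Rita", "lastn": "Verde", "year": 2,
            "e_mail": "rita.verde@example.it"}]
L2_ROWS = [{"name": "Rita Verde", "e_mail": "rita.verde@example.it",
            "dept_code": "Dept1", "s_code": "A"}]


def l1_to_global(row):
    """Translate an L1 row to S(L1) = (Name, E_mail, Year)."""
    return {
        # Data conversion: 'Rita' + 'Verde' becomes 'Rita Verde'.
        "Name": f"{row['firstn']} {row['lastn']}",
        "E_mail": row["e_mail"],
        "Year": row["year"],
    }


def l2_to_global(row):
    """Translate an L2 row to S(L2) = (Name, E_mail, Dept, Section)."""
    return {
        "Name": row["name"],
        "E_mail": row["e_mail"],
        "Dept": row["dept_code"],    # assumed correspondence
        "Section": row["s_code"],    # assumed correspondence
    }


translated_l1 = [l1_to_global(r) for r in L1_ROWS]
translated_l2 = [l2_to_global(r) for r in L2_ROWS]
```

After translation both sources expose the same Name value for the same person, which is what a join criterion such as L1.Name = L2.Name relies on.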

  4. Redundancy and reconciliation
     Hypothesis: instances of the same object in different local classes must have the same value for a common attribute
     [Figure: local classes L1 and L2 with objects O1, O, O2]

  5. Object fusion
     To identify instances of the same object and fuse them: JoinMap, the join criteria among classes
     [Figure: local classes L1 and L2 with objects O1, O, O2]
     JoinMap JM(L1,L2): L1.Name = L2.Name

  6. Object fusion: indirect map
     [Figure: local classes L1 and L2 with objects O1, O2, O3]
     JoinMap JM(CS.S, UNI.RS)

  7. Global Class Instance
     • GAV with “Single database property” (Lenzerini, Data Integration: A Theoretical Perspective, PODS 2002)
     • The computation is based on “FULL DISJUNCTION” (Rajaraman, Ullman, Integrating Information by Outerjoins and Full Disjunctions, PODS 1996)
     • “Computing the natural outerjoin of many relations in a way that preserves all possible connections among facts”
     G: select S(G) from L1 outer join L2 on JM(L1,L2)
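For two local classes, the outer join on the JoinMap can be sketched in plain Python. This is a minimal sketch under stated assumptions: the rows are hypothetical and already translated to global attribute names, JM(L1,L2) is L1.Name = L2.Name as on slide 5, Name is unique within each local class, and common attributes agree across sources (the hypothesis of slide 4). The function name `global_class_instance` is illustrative only.

```python
S_G = ("Name", "E_mail", "Year", "Dept", "Section")

# Hypothetical rows, already expressed in global attribute names.
L1 = [
    {"Name": "Rita Verde", "E_mail": "rita@example.it", "Year": 2},
    {"Name": "Ada Neri", "E_mail": "ada@example.it", "Year": 1},
]
L2 = [
    {"Name": "Ada Neri", "E_mail": "ada@example.it", "Dept": "Dept1", "Section": "A"},
    {"Name": "Leo Blu", "E_mail": "leo@example.it", "Dept": "Dept2", "Section": "B"},
]


def global_class_instance(l1_rows, l2_rows):
    """Full outer join of L1 and L2 on Name, projected onto S(G)."""
    l2_by_name = {row["Name"]: row for row in l2_rows}
    seen = set()
    result = []
    for r1 in l1_rows:
        seen.add(r1["Name"])
        # Fuse with the matching L2 row, if any; missing attributes stay None.
        merged = {**l2_by_name.get(r1["Name"], {}), **r1}
        result.append({attr: merged.get(attr) for attr in S_G})
    for r2 in l2_rows:
        if r2["Name"] not in seen:
            # Objects known only to L2 are kept, padded with None.
            result.append({attr: r2.get(attr) for attr in S_G})
    return result


G = global_class_instance(L1, L2)
```

Objects present in only one source survive with None in the attributes that source does not provide, while objects present in both are fused into a single global instance.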

  8. Full disjunction computation
     • Question: when can a full disjunction be computed by some sequence of natural outerjoins?
     • Answer: there is a natural outerjoin sequence producing the full disjunction if and only if the set of relation schemes forms a connected, γ-acyclic hypergraph (Fagin, 1983)
     • A global class with n local classes, n > 2: γ-cyclic hypergraph → New Method
     [Figure: L1, L2, L3 connected pairwise by JM(L1,L2), JM(L1,L3), JM(L2,L3)]
     Example, n = 3:
     G: select S(G) from (L1 outer join L2 on JM(L1,L2)) outer join (L1 outer join L3 on JM(L1,L3)) on JM(L2,L3)

  9. Query rewriting method
     • Global query (in DNF): Q1
     • Local query for the class L: Q1_L
     • where-condition of Q1_L: all factors of the DNF which can be solved in L
     • residual factors of Q1: factors not included in all local where-conditions
     • select-list of Q1_L: attributes of the select-list of Q1 + residual factors + JoinMap
     • Global query reformulation: full disjunction based on the JoinMap + residual factors

  10. Query rewriting example
      Global query
      Q1: select E_mail from G where (E_mail like '*.it' and Dept='Dept1') or (E_mail like '*.it' and Year=2)
      Local queries
      Q1_L1: select Name, Year, E_mail from L1 where (E_mail like '*.it' or Year=2)
      Q1_L2: select Name, Dept, E_mail from L2 where (E_mail like '*.it' or Dept='Dept1')
      Global query reformulation
      Q1: select E_mail from Q1_L1 outer join Q1_L2 on JM where (Dept='Dept1' or Year=2)
      Residual factor: (Dept='Dept1' or Year=2)
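The three steps of this example (local queries, outer join on the JoinMap, residual factor) can be traced on a few rows in Python. This is a sketch under stated assumptions, not the prototype's code: the data is hypothetical and already uses global attribute names, `like '*.it'` is read as a suffix match, JM is equality on Name, and the local where-conditions are copied exactly as written on the slide.

```python
# Hypothetical data, already translated to the global attribute names.
L1 = [
    {"Name": "Rita Verde", "Year": 2, "E_mail": "rita@example.it"},
    {"Name": "Ada Neri", "Year": 1, "E_mail": "ada@example.it"},
    {"Name": "Bob Gray", "Year": 3, "E_mail": "bob@example.com"},
]
L2 = [
    {"Name": "Ada Neri", "Dept": "Dept1", "E_mail": "ada@example.it"},
    {"Name": "Leo Blu", "Dept": "Dept2", "E_mail": "leo@example.it"},
]


def is_it(row):
    # Assumption: `like '*.it'` means "ends with .it".
    return row["E_mail"].endswith(".it")


def project(row, attrs):
    return {a: row[a] for a in attrs}


# Local queries Q1_L1 and Q1_L2, where-conditions as on the slide.
q1_l1 = [project(r, ("Name", "Year", "E_mail"))
         for r in L1 if is_it(r) or r["Year"] == 2]
q1_l2 = [project(r, ("Name", "Dept", "E_mail"))
         for r in L2 if is_it(r) or r["Dept"] == "Dept1"]


def outer_join_on_name(left, right):
    """Full outer join on Name; assumes Name is unique on each side."""
    right_by_name = {r["Name"]: r for r in right}
    left_names = {r["Name"] for r in left}
    fused = [{**right_by_name.get(r["Name"], {}), **r} for r in left]
    fused += [r for r in right if r["Name"] not in left_names]
    return fused


# Global reformulation: join the local answers, then apply the residual factor.
fused = outer_join_on_name(q1_l1, q1_l2)
answer = [r["E_mail"] for r in fused
          if r.get("Dept") == "Dept1" or r.get("Year") == 2]
print(answer)
```

The select-lists of the local queries carry Name, Year and Dept even though Q1 only asks for E_mail, because the join and the residual factor still need those attributes after the local answers come back.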
