1 / 6

Projects Overview

Projects Overview. 1. (T+P) Data Integration/Mediation Implement the Global-as-View (GAV)/ view unfolding approach (intro to it: TODAY) Examples, documentation, presentation (needed for all projects) 2. (T: deep) Data Integration/Mediation

gari
Download Presentation

Projects Overview

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Projects Overview • 1. (T+P) Data Integration/Mediation • Implement the Global-as-View (GAV)/ view unfolding approach (intro to it: TODAY) • Examples, documentation, presentation (needed for all projects) • 2. (T: deep) Data Integration/Mediation • Read/learn about other DI algorithms: Local-as-View (LAV), mixed approaches • Summarize, report, present • 3. (T: broad) Intro/Overview on Schema Matching approaches • 4. (T) Knowledge Representation & Ontologies of Biological Information • Gene Ontology, BioCyc, Pathway databases, …

  2. Projects Overview • 5. (P) Scientific Workflows/ Kepler • (5A) Analysis-intensive workflows (focus on data analysis, some data transformations, some visualization) • (5B) Data-intensive workflows / SDSC Storage Resource Broker (collection management on the “Data Grid”) • (5C) Compute-intensive workflows / Condor, NIMROD, APST, … (job scheduling on cluster computers, simulations, “Compute Grid”) • (5D) Database-intensive workflows (TBD) • (5E) Real-time data access workflows / ROADNet (accessing real-time data streams, some analysis and visualization)

  3. 1. Query Q ( G (S1,..., Sk) ) 6. {answers(Q)} Integrated Global (XML) View G Integrated View Definition G(..) S1(..)…Sk(..) MEDIATOR 2. Query rewriting 5. Post processing (XML) View (XML) View (XML) View web services as wrapper APIs Wrapper Wrapper Wrapper S1 S2 Sk Data Integration (Mediator System) USER/Client 3. Q1 Q2 Q3 4. {answers(Q1)} {answers(Q2)} {answers(Q3)}

  4. Global-as-View (GAV) & View Unfolding • In GAV data integration, rules have the form • R_G(…)  … R_L(…) … • The user query is against the global (=integrated) views (relations) R_G: • ?- R_G( …) • The system must come up with a query plan that eliminates references to R_G relations, and only keeps R_L relations (and maybe new aux-relations)  GAV query rewriting • Essentially “view unfolding”

  5. Global-as-View (GAV) & View Unfolding • Simplistic example: • User query: ?- ans(X) • View definitions • ans(X)  p(X,X), u(X,Y). • p(X,Y)  q(X,Z), r(Z,Y) • Algorithm: • Given a goal (rhs) G, unify it with a matching head (lhs) H of a rule H(ead)  B(ody) • Replace G by B, taking into account the substitution  needed for unifying G and H • Repeat until a all relations are given (=EDB) relations

  6. Querying vs Reasoning Q1: answer1(S,C)  student(S, N), takes(S, C), course(C, X), inCS(C), course(C, “DB”) Q2: answer2(S,C)  student(S, N), takes(S, C), course(C, X) • We can “run” both queries Q on a given database instance D  Query evaluation (answer = eval(Q,D)) • Note: The answers to Q1 are always contained in the answers to Q2 (why?) • Determining whether a query Q is contained (for all database instances D!) in another query Q’ is a reasoning problem.

More Related