1 / 21

Answering Queries Using Views

Answering Queries Using Views. Presented by: Mahmoud ELIAS. Plan. Introduction Motivations Views Problem definition When is a view usable for a query ? Answering queries using views for Data Integration Conclusions Bibliography. Introduction.

orli
Download Presentation

Answering Queries Using Views

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Answering Queries Using Views Presented by: Mahmoud ELIAS

  2. Plan • Introduction • Motivations • Views • Problem definition • When is a view usable for a query? • Answering queries using views for Data Integration • Conclusions • Bibliography

  3. Introduction • Relevance to a wide variety of data management problems • Example: University schema: • Prof(name, area) • Course(c-number, title) • Teaches(prof, c-number, quarter, evaluation) • Registered(student, c-number, quarter) • Major(student, dept) • WorksIn(prof, dept) • Advises(prof, student)

  4. Motivations • Query optimization: • Speed up the computation of the query. • Be attention of indexes. • Maintaining physical Data independenceVs • Data Integration: • uniform query interface to a multitude of autonomous heterogeneous data sources

  5. Motivation (cont.) • Data warehouse design: • we must be able to answer all the required queries over the warehouse using only these views • Semantic data caching • Check whether the cached results of a previously computed query can be used for a new query

  6. Building views System analysis The user will never be satisfied feedback statistics Choosing and creating views

  7. Problem definition • A view is a derived relation defined in terms of stored base relations. • A query Q1 is said to be containedin a query Q2 (Q1 Q2), if for all databases D, the set of tuples computed for Q1 is a subset of those computed for Q2 • Two queries are said to be equivalentif Q1 Q2and Q2 Q1 • Given a query Q and a set of view definitions V1,V2,…,Vm, arewritingof the query using the views is a query expression Q’ that refers only to the given views. • Equivalent rewriting vs. maximally-contained rewriting

  8. When is a View usable for a Query? • A view can be useful for a query if the set of relations it mentions overlaps with that of the query, and it selects some of the attributes selected by the query. • When the view contains grouping and aggregation but the query does not, then unless the query removes duplicates in the select clause, the view cannot be used to answer a query.

  9. Example1: • select Advises.prof, Advises.student, Registered.quarter from Registered, Teaches, Advises where Registered.c-number=Teaches.c-number and Registered.quarter=Teaches. quarter and Advises.prof=Teaches.prof and Advises.student= Registered.student and Registered.quarter ≥ « winter98 » • create view V1 as select Registered.student, Teaches.prof, Registered.quarter from Registered, Teaches where Registered.c-number=Teaches.c-number and Registered.quarter=Teaches. quarter and Registered.quarter ≥ « winter97 »

  10. Example2: • create view V2 as select c-number, year, max(evaluation) as maxeval, count(*) asofferings from Teaches where c-number ≥ 400 groupBy c-number, year • select year, count(*), max(evaluation) from Teaches where c-number ≥ 500 groupBy year • select year, sum(offerings), Max(maxeval) from V2 where c-number ≥ 500 groupBy year

  11. Conjunctive queries head body • Safe - each variable in the head appears in the body. • No arithmetic comparisons in predicates. subgoal

  12. The Bucket algorithm • Nb of query rewritings that need to be considered can be drastically reduced if we first consider each subgoal in the query in isolation, and determine which views may be relevant to each subgoal. • 2 steps: • Create a bucket for each subgoal in Q • Queryrewritings that are conjunctive queries, each consisting of one conjunct from every bucket.

  13. Example V1(st,c-n,qu,ti) :- Registered(st,c-n,qu), Course(c-n,ti), c-n ≥ 500, qu ≥ Aur98 V2(st,pr,c-n,qu) :- Registered(st,c-n,qu), Teaches(pr,c-n,qu) V3(st,c-n) :- Registered(st,c-n,qu), year ≤ Aut94 V4(pr,c-n,ti,qu) :- Registered(st,c-n,qu), Course(c-n,ti), Teaches(pr,c-n,qu), qu ≤ Aut97 Q(S,C,P) :- Teaches(P,C,Q), Registered(S,C,Q), Course(C,T), C ≥300, Q ≥ Aut95

  14. Example (cont.) All combinations: q’(S,C,P) :- V2(S’,P,C,Q), V1(S,C,Q,T’) , V1(S’,C,Q’,T)  q’(S,C,P) :- V2(S’,P,C,Q), V1(S,C,Q,T') q’(S,C,P) :- V4(P,C,T’,Q), V1(S,C,Q,T’), V4(P’,C,T,Q’) q’(S,C,P) :- V2(S,P,C,Q), V4(P,C,T’,Q) The algorithm produces a maximally-contained rewriting

  15. The Inverse-rules algorithm • Construct a set of rules that invert the view definitions V3(dept, c-number) :- Major(student, dept), Registered(student,c-number) Major(f1(dept,X), dept) :- V3(dept,X) Registered(f1(Y,c-number),c-number) :- V3(Y,c-number) Q(dept) :- Major(student,dept), Registered(student,444) V3 = {(CS, 444), (EE, 444), (CS,333)}  Registered = {(f1(CS,444), CS), (f1(EE,444), EE), (f1(CS,333), CS)} Major = {(f1(CS,444), 444), (f1(EE,444), 444), (f1(CS,333), 333)}  Q = {CS, EE} • more efficient rewriting  unfold the inverse rules and remove redundant subgoals from the unfolded rules.

  16. Bucket vs. Inverse-rules • Both algorithms produce a maximally-contained rewriting • Computing buckets is similar in spirit to that of computing the inverse rules: compute the views that are relevant to single atoms of the DB relations. • The Bucket algo. Computes the relevant views by taking into consideration the context in which th atom appears in the query. • The inverse rules can be computed once, and be applicable to any query.

  17. The MiniCon algorithm • It addresses the limitations of previous algorithms • instead of building rewritings by combining rewritings for each of the query subgoal or the DB relation, we consider how each of the variables in the query can interact with the available views (MiniCon Description MCD) Q(D) :- Major(S,D), Registered(S,444,Q), Advises(P,S) V1(dept) :- Major(student,dept), Registered(student,444,quarter) V2(prof,dept,area) :- Advises(prof,student), Prof(name,area) V3(dept,c-number) :- Major(student,dept), Registered(student,c- number,quarter), Advises(prof,student)

  18. Results • The key advantage of the MiniCon algorithm is that the second phase of the algorithm considers much fewer combinations of MCDs compared to the Cartesian product of the buckets or compared to the number of unfoldings of inverse rules.

  19. Conclusions • Using views to answer queries is an important problem. Especially for information integration on the web. • Query containment and containment mappings provide the key for solving the problem.

  20. Conclusions (cont.) • The variants of the problem are NP-complete. This is not too bad, since queries are usually short. • In many practical cases, there is an algorithm for solving the problem.

  21. Bibliography [Lev00]Alon Y. Levy. Answering Queries Using Views: A Survey. Department of Computer Science and Engineering, University ofWashington, pages 1-43, 2000 [Mit99]Prasenjit Mitra. An Algorithm for Answering Queries Efficiently Using Views. Infolab, Stanford University, pages 1-13, September,1999. [DG97b] Oliver M. Duschka and Michael R. Genesereth. Query Planning inInfoMaster. In Proceedings of the ACM Symposium on Applied Computing, San Jose, CA, 1997.

More Related