Answering Queries using Templates with Binding Patterns

Anand Rajaraman, Yehoshua Sagiv, Jeffrey D. Ullman. Presented by Sreekrishna Nallela.

Presentation Transcript


  1. Answering Queries using Templates with Binding Patterns • Anand Rajaraman, Yehoshua Sagiv, Jeffrey D. Ullman • Presented by Sreekrishna Nallela

  2. Motivation • Tsimmis translates information sources of arbitrary type into a common data model and language. • The source could be a collection of ASCII documents, a spreadsheet file, and so on. • In such cases the source must be integrated into the common language and model. • The approach taken by Tsimmis is to use a translator generator.

  3. Continued • The input to the generator is a list of rules, each consisting of: • i) A query template. • ii) A sequence of operations that must be carried out to answer any query obtained by instantiating the template. • Example 1: Suppose the information source is a genealogy that can answer only two query forms: • 1. Given any individual C, find C’s parents. • 2. Find all the individuals who have parents specified by the information source.

  4. Continued • We can also answer some queries that are not of the form (1) or (2). • First consider the query “Find the grandparents of individual a.” This query is answered as follows: • i) Find the set P of parents of a, using query (1). • ii) For each individual p in the set P, use query (1) to find the parents of p. • iii) The answer is the union of all the individuals found in step (ii). • This is depicted in Figure 1.
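
The three steps above can be sketched in Python. This is only an illustration: the `PARENT` facts and the function names `parents_of` and `grandparents_of` are hypothetical stand-ins for the genealogy source, which is assumed to expose nothing but query form (1).

```python
from typing import Set

# Hypothetical facts (child, parent); the real source is opaque to the caller.
PARENT = {("a", "p1"), ("a", "p2"), ("p1", "g1"), ("p1", "g2"), ("p2", "g3")}


def parents_of(child: str) -> Set[str]:
    """Query form (1): given an individual, return that individual's parents."""
    return {p for (c, p) in PARENT if c == child}


def grandparents_of(child: str) -> Set[str]:
    """Answer the grandparent query using only query form (1)."""
    # Step (i): the set P of parents of the individual.
    parents = parents_of(child)
    # Steps (ii) and (iii): take the union of the parents of every p in P.
    result: Set[str] = set()
    for p in parents:
        result |= parents_of(p)
    return result
```

Calling `grandparents_of("a")` on the sample facts issues one call to query (1) for a and one for each parent of a.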

  5. Query solving • The grandparent query is answered by composing parent with itself. • Now consider the query “Find the grandchildren of individual g.” • We cannot answer this query with query (1) alone, but with the help of query (2) we can answer it as follows.

  6. Grandchild query • i) Use (2) to find all individuals. • ii) Use (1) to find the parents of the individuals found in (i). • iii) Use (1) to find the parents of the individuals found in (ii). • iv) Keep those individuals from (i) for whom g is among the grandparents found in (iii).
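
A minimal Python sketch of the grandchild procedure, under the assumption that the source offers exactly the two query forms of Example 1. The `PARENT` facts and the names `parents_of`, `all_children` and `grandchildren_of` are hypothetical and chosen only for illustration.

```python
from typing import Set

# Hypothetical facts (child, parent) held by the genealogy source.
PARENT = {
    ("a", "p1"), ("a", "p2"), ("b", "p2"),
    ("p1", "g1"), ("p1", "g2"), ("p2", "g3"),
}


def parents_of(child: str) -> Set[str]:
    """Query form (1): parents of a given individual."""
    return {p for (c, p) in PARENT if c == child}


def all_children() -> Set[str]:
    """Query form (2): every individual whose parents the source records."""
    return {c for (c, _) in PARENT}


def grandchildren_of(g: str) -> Set[str]:
    """Find grandchildren of g without ever querying by parent."""
    result: Set[str] = set()
    for c in all_children():            # step (i)
        for p in parents_of(c):         # step (ii)
            if g in parents_of(p):      # steps (iii) and (iv)
                result.add(c)
    return result
```

The sketch makes the cost visible: because the source cannot be queried by parent, every individual must be enumerated before g is used as a filter.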

  7. A Formal Model • Example 1 suggests the following way to model a set of given query forms. • 1. We assume that there are certain predicates about which queries are posed. In Example 1, parent(C, P) would be such a predicate. • 2. We assume that there are certain query templates, which we call views. A view consists of a “head” and a “body”. • The head consists of: a predicate denoting the view, arguments for the predicate, and a binding pattern. • The body of the view is a program that produces a result to the query.

  8. Example 2: Consider Example 1. The two queries become two views; the parameters are the bound arguments and the results are the free arguments. • We assume parent is the predicate representing child-parent information. Now we can write query (1) as v1^bf(C, P) :- parent(C, P). • Here v1 expects a binding for C, which becomes the first argument of parent; the matching values of the second argument are returned. • Similarly, we can write query (2) as v2^f(C) :- parent(C, P).

  9. Queries • A query is denoted exactly as a view. • It has a head with a binding pattern, and a body. • The body is a program over the predicates that represent the information of the source. • Example 3: In Example 1, finding the grandparents of an individual C can be expressed as q^bf(C, G) :- parent(C, P) & parent(P, G). • Here a binding for C is given and we are asked to find the related values of G.

  10. Queries Continued • The query q^fb(C, G) :- parent(C, P) & parent(P, G) • looks similar, but it expresses a more difficult query. • Here we are given a grandparent and asked to find the grandchildren.

  11. Valid Solutions • A query is solved by a program that uses views to obtain answers. • In general the program can create its own data, corresponding to IDB predicates in Datalog programs. • Programs can use arithmetic comparisons or other predicates that have a meaning independent of any information source.

  12. Conjunctive Queries as Solutions • The simplest case is when the views and solutions are conjunctive queries. • Subgoals have view names as predicates, and we assume they are evaluated in left-to-right order. • For a solution to be valid it must satisfy two conditions: • 1. The binding patterns must be appropriate. • 2. The expansion of the solution must be equivalent to the given query.
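
Condition 1 can be sketched as a left-to-right scan. The Python below is an illustration under stated assumptions, not the authors' procedure: subgoals are `(view_name, args)` pairs, binding patterns are strings over `b`/`f`, every argument is a variable, and the function name `bindings_ok` is hypothetical. The final safety check (every head variable ends up bound) is also an assumption added for completeness.

```python
from typing import Dict, List, Sequence, Tuple

Subgoal = Tuple[str, Tuple[str, ...]]


def bindings_ok(head_args: Sequence[str], head_pattern: str,
                subgoals: List[Subgoal],
                view_patterns: Dict[str, str]) -> bool:
    """Check the binding-pattern condition for left-to-right evaluation."""
    # Variables bound before any subgoal runs: the 'b' positions of the head.
    bound = {a for a, mode in zip(head_args, head_pattern) if mode == "b"}
    for name, args in subgoals:
        # Every 'b' position of the view must already hold a bound variable.
        for arg, mode in zip(args, view_patterns[name]):
            if mode == "b" and arg not in bound:
                return False
        # After the view is evaluated, all of its arguments are bound.
        bound.update(args)
    # Assumed safety requirement: the head's variables are all bound at the end.
    return set(head_args) <= bound


# Assumed patterns for the genealogy views: v1 needs its first argument bound,
# v2 needs nothing bound.
PATTERNS = {"v1": "bf", "v2": "f"}
```

For instance, a grandparent-style chain `v1(C, P), v1(P, G)` passes when the head binds C and fails when the head binds only G, since nothing supplies C to the first subgoal.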

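  13. Continued • Example 4: Consider Example 2 with views v1 and v2 and the first query from Example 3. An appropriate solution is q^bf(C, G) :- v1(C, P) & v1(P, G). • This solution satisfies the binding pattern condition: C is bound by the head, and P is bound by the first subgoal before the second subgoal needs it. • We must also expand the solution to check that it is equivalent to the given query. Using the definitions from Example 2, replace each v1 subgoal by its body. • The solution becomes q^bf(C, G) :- parent(C, P) & parent(P, G). • This conjunctive query is identical to the query from Example 3, so it is equivalent to that query, and the solution is valid.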

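  14. Conjunctive queries continued • Now consider the second query from Example 3, where we are asked to find grandchildren. The following is not a valid solution, because nothing binds C for the first subgoal: q^fb(C, G) :- v1(C, P) & v1(P, G). • Using view v2 we can produce a valid solution: q^fb(C, G) :- v2(C) & v1(C, P) & v1(P, G). • When we expand, we get q^fb(C, G) :- parent(C, P') & parent(C, P) & parent(P, G), which is equivalent to the query because the first subgoal is redundant.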
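
Expanding a solution means replacing each view subgoal by the view's body, renaming the body's local variables so they do not collide. A minimal Python sketch follows; the `VIEWS` table encodes the two genealogy views as assumed here (v1 with head variables C, P and v2 with head variable C, both over parent), and the function name `expand` and the `_n` renaming scheme are hypothetical.

```python
from itertools import count
from typing import Dict, List, Tuple

Atom = Tuple[str, Tuple[str, ...]]

# Assumed view definitions: name -> (head variables, body atoms).
VIEWS: Dict[str, Tuple[Tuple[str, ...], List[Atom]]] = {
    "v1": (("C", "P"), [("parent", ("C", "P"))]),
    "v2": (("C",), [("parent", ("C", "P"))]),
}


def expand(subgoals: List[Atom],
           views: Dict[str, Tuple[Tuple[str, ...], List[Atom]]]) -> List[Atom]:
    """Replace each view subgoal by its body, renaming local variables."""
    numbering = count(1)
    expanded: List[Atom] = []
    for name, args in subgoals:
        head_vars, body = views[name]
        # Head variables of the view are replaced by the subgoal's arguments.
        subst = dict(zip(head_vars, args))
        suffix = next(numbering)
        for pred, body_args in body:
            # Local variables get a per-subgoal suffix; this assumes the
            # solution itself uses no variable names of the form X_n.
            new_args = tuple(subst.get(v, f"{v}_{suffix}") for v in body_args)
            expanded.append((pred, new_args))
    return expanded
```

On a solution that starts with `v2(C)`, the local variable of v2's body is renamed, so the expansion contains an extra parent subgoal with a fresh second argument.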

  15. Testing for Solutions to Conjunctive Queries • Lemma 1: If Q is a conjunctive query with n subgoals and m different variables, and we are given a collection of views that are conjunctive queries with binding patterns for the head, then if there is a conjunctive query solution to Q, there is a solution with at most m+n subgoals, using at most m different variables.

  16. Continued • The figure suggests the query Q, some solution S, and the expansion E of that solution. • We also show containment mappings h from Q to E and g from E to Q, which together demonstrate Q ≡ E. • First we group the variables of E into equivalence classes by defining X ≡ Y if g(X) = g(Y). • There are at most m classes, since that is the number of variables of Q. • We need to shorten S so that it has at most m+n subgoals.

  17. Continued • We retain a subgoal G of S if either: • 1. One of the subgoals of E that come from G is the target of a subgoal of Q under the mapping h. • 2. One of the variables X in G is the leader of its equivalence class of variables. • We construct S’ from S as follows: • Delete from S any subgoal that does not meet at least one of the above two conditions. We are left with at most m+n subgoals. • Replace all variables in an equivalence class by their leader. We now have at most m variables.

  18. Continued • Finally we must show that the expansion of S’, called E’, is equivalent to Q. • E’ differs from E in three ways: • 1. Some local variables may be different. • 2. Subgoals in E may be missing in E’. • 3. Variables that are distinct in E may be identical in E’. • The third change cannot prevent h from being a containment mapping, although the images of some variables of Q may change. • It could prevent g from being a containment mapping, but we equate only variables that had the same image under g, so g remains well defined.

  19. Theorem 1 • There is a nondeterministic polynomial-time algorithm that, given views defined by conjunctive queries with binding patterns and a query of the same type, decides whether there is a conjunctive query solution to the query for these views, and finds it if so. • Guess a solution that has no more subgoals than the sum of the number of subgoals and variables in the query, and that uses no more variables than the query does. • Check that the binding pattern condition is satisfied. • If it is, guess the containment mappings between the query and the expansion of the solution, and check their correctness.
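
The last step checks containment mappings between the query and the expansion. As a rough Python sketch, the nondeterministic guess can be replaced by exhaustive search, which takes exponential time and so is only an illustration of what is being checked. The atom representation and the function name `containment_mapping` are assumptions; all arguments are treated as variables.

```python
from itertools import product
from typing import Dict, List, Optional, Sequence, Tuple

Atom = Tuple[str, Tuple[str, ...]]


def containment_mapping(src_head: Sequence[str], src_body: List[Atom],
                        dst_head: Sequence[str], dst_body: List[Atom]
                        ) -> Optional[Dict[str, str]]:
    """Search for a mapping of src variables to dst variables that sends
    the src head to the dst head and every src subgoal to a dst subgoal."""
    variables = sorted(set(src_head) | {v for _, args in src_body for v in args})
    targets = sorted(set(dst_head) | {v for _, args in dst_body for v in args})
    dst_atoms = set(dst_body)
    for image in product(targets, repeat=len(variables)):
        m = dict(zip(variables, image))
        # The head must be mapped position by position.
        if tuple(m[v] for v in src_head) != tuple(dst_head):
            continue
        # Every mapped subgoal must appear in the destination body.
        if all((p, tuple(m[v] for v in args)) in dst_atoms for p, args in src_body):
            return m
    return None


def equivalent(head_a, body_a, head_b, body_b) -> bool:
    """Two conjunctive queries are equivalent when mappings exist both ways."""
    return (containment_mapping(head_a, body_a, head_b, body_b) is not None
            and containment_mapping(head_b, body_b, head_a, body_a) is not None)
```

Applied to a grandparent-style query and an expansion that carries one redundant parent subgoal, the search finds mappings in both directions, so the two are reported as equivalent.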

  20. Theorem 2 • If there is a valid Datalog program solution to the problem of Theorem 1, then there is a valid conjunctive-query solution, and it can be found in nondeterministic polynomial time. • We can expand a Datalog program into a union of conjunctive queries. • Each of these must satisfy the binding-pattern condition for conjunctive queries.

  21. Lemma 2 • There is a doubly exponential function f(s) such that if s is the size of the given query Q and set of views, and Q has a solution, then there is a solution S for Q in terms of the views of size at most f(s). • There are at most s variables in Q, so there are at most s! orders of these variables. • For each order there may or may not be equality between each pair of consecutive variables. • The number of orders with equality is therefore at most exponential in s. • For each of these orders there must be some containment mapping from E to Q.
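
One way to read the counting argument, offered here as an assumption rather than as the slide's stated bound, is that s! orders combined with an equal-or-strict choice at each of the s-1 consecutive pairs gives at most s!·2^(s-1) orders with equality. The Python sketch below counts the orders with equality exactly by brute force and compares the count with that bound for small s; the function names are hypothetical.

```python
from itertools import product
from math import factorial


def orders_with_equality(s: int) -> int:
    """Count the ways to order s variables when ties are allowed.

    Each such order corresponds to exactly one assignment of ranks to the
    variables in which the ranks used are 0, 1, ..., k-1 with none skipped.
    """
    total = 0
    for ranks in product(range(s), repeat=s):
        used = sorted(set(ranks))
        if used == list(range(len(used))):
            total += 1
    return total


def assumed_bound(s: int) -> int:
    """s! orders times 2^(s-1) equal-or-strict choices (assumed reading)."""
    return factorial(s) * 2 ** (s - 1)
```

The bound overcounts, because different permutations with ties describe the same order, but it is enough for the lemma: it stays singly exponential in s.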

  22. Continued • Now define two variables of E to be equivalent if they map to the same variable of Q for every order. • This yields a bounded number of equivalence classes; call the bound C(s). • Modify S by identifying variables that are equivalent. • Eliminate all but the first of any identical subgoals in S, to form S’. • The number of subgoals of S’ is then bounded: there are at most s arguments to any view, so the variables of a subgoal can be chosen in at most C(s)^s ways. • If C(s) is doubly exponential then C(s)^s is also doubly exponential. • f(s) is the length of a solution with that many subgoals. • f(s) is therefore also doubly exponential and is a bound on the length of S’.

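  23. Continued • The binding pattern requirements of S’ are satisfied. • Finally, E’ ⊆ E, because E’ is E with additional restrictions. • We assumed E ≡ Q, and the containment mappings from E to Q survive the identification of equivalent variables, so Q ⊆ E’. • Together these containments make all three of Q, E and E’ equivalent, proving that S’ is a solution.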

  24. Theorem 3 • If queries, views and solutions are conjunctive queries with comparisons and binding patterns, there is a nondeterministic, doubly exponential-time decision procedure to find a solution to a given query or determine that none exists. • Guess a solution S of size at most f(s). • f is the function described in Lemma 2. • s is the size of the query and views. • Expand S and guess containment mappings between the query and the expansion.

  25. Conclusions • The problem of answering a query using query templates is introduced. • Some preliminary results are presented, showing that the problem is decidable for conjunctive queries. • Bounds on the size of the solution are proved.

  26. Questions

  27. Thank you