
# Query languages II: equivalence & containment (Motivation: rewriting queries using views)



## PowerPoint Slideshow about ' Query languages II: equivalence & containment (Motivation: rewriting queries using views)' - cally-willis


Presentation Transcript

### Query languages II: equivalence & containment (Motivation: rewriting queries using views)

conjunctive queries – CQ’s

Extensions of CQ’s

conjunctive-ii

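For CQ’s q1, q2 with the same head predicate:

Decision problems:

• Containment: is q1 ⊆ q2, i.e., is q1(D) ⊆ q2(D) for every db D?

• Equivalence: is q1 ≡ q2, i.e., is q1(D) = q2(D) for every db D?

The two problems are equivalent: solve one, and you have solved the other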


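Solution for containment ⇒ solution for equivalence:

q1 ≡ q2 iff q1 ⊆ q2 and q2 ⊆ q1

Solution for equivalence ⇒ solution for containment:

q1 ⊆ q2 iff q1 ≡ q1 ∧ q2, where for q1: p(x) :- r1(..), …, rk(..) and q2: p(x) :- s1(..), …, sm(..), the query q1 ∧ q2 is p(x) :- r1(..), …, rk(..), s1(..), …, sm(..), with the non-head variables of q2 renamed apart from those of q1

(here, the ri and sj are db predicates, not necessarily different)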


Characterizations for containment: assume q1, q2 are given

A mapping h from the variables of q2 to the variables/constants of q1 (extended naturally to constants and atoms) is a homomorphism from q2 to q1 if

• It maps each atom of q2 to an atom of q1 (body atoms to body atoms, and head(q2) to head(q1))

• If there are constraints on the side, Ci in qi, then h(C2) is implied by C1

Notation:


Thm: For CQ’s w/o built-in preds, the following are equivalent:

(i) q1 is contained in q2

(ii) there is a homomorphism from q2 to q1

Proof: (ii) ⇒ (i) is easy (and holds even with b.i. preds):

Every valuation v from q1 into a db D can be composed with h to give a valuation from q2 into D that yields the same head fact. Hence, every answer of q1 on D is also an answer of q2 on D

(Diagram: h maps q2 to q1, and the valuation v maps q1 into D)


For (i) ⇒ (ii):

The body of a CQ (w/o b.i’s) can be viewed as a db:

• consider each variable as a constant, different from all constants in the CQ and the other variables

• or, replace each variable x by a distinct constant cx

Denote this db by db(q)

Obviously, q(db(q)) contains the head of q (or its image)

Example:

Q: q(d) :- movies(t,d,a), directory(‘Plaza’, t, 19:30)

db(Q): movies(ct,cd,ca), directory(‘Plaza’, ct,19:30)

Obviously, applying Q to this db, one obtains q(cd) (use the “identity” valuation)
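The construction of db(Q) can be sketched in Python. This is only an illustration, not part of the original slides. It assumes that atoms are tuples whose first component is the predicate name, that variables are strings starting with `?`, and that freezing `?x` yields the constant `c_x`. The helper names `is_var`, `freeze_atom` and `canonical_db` are hypothetical.

```python
def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")

def freeze_atom(atom):
    # Replace each variable ?x by the constant c_x; constants stay as they are.
    # Assumes no constant of the query already has the form c_x.
    return (atom[0],) + tuple("c_" + t[1:] if is_var(t) else t for t in atom[1:])

def canonical_db(body):
    # db(q): the body atoms, with every variable frozen into a constant.
    return {freeze_atom(atom) for atom in body}

# Q: q(d) :- movies(t,d,a), directory('Plaza', t, 19:30)
head = ("q", "?d")
body = [("movies", "?t", "?d", "?a"), ("directory", "Plaza", "?t", "19:30")]

db = canonical_db(body)          # the two frozen body atoms
frozen_head = freeze_atom(head)  # the fact q(c_d) obtained by the identity valuation
```

Here `frozen_head` corresponds to q(cd) in the example, and `db` to db(Q).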


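• (i) ⇒ (ii) (q2 contains q1 ⇒ there is a homomorphism from q2 to q1)

Since q1(db(q1)) contains head(q1) and q2 contains q1, q2(db(q1)) also contains head(q1)

The valuation from q2 to db(q1) that yields this answer is a homomorphism from q2 to q1 (after mapping each constant cx back to the variable x)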

Example:

q1: p(d) :- movies(t,d,’Jane’), directory(‘Plaza’, t, 19:30), location(‘Plaza’, a, 01-58776655)

q2: p(z) :- movies(t,z,a), directory(‘Plaza’, t, 19:30)

Obviously, q1 is contained in q2, with h: t ↦ t, z ↦ d, a ↦ ‘Jane’,

that maps the two atoms of body(q2) to the first two of body(q1), and head(q2) to head(q1)
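Checking that this particular h is a homomorphism is mechanical. The sketch below assumes the same encoding as before (atoms as tuples, variables as strings starting with `?`); the names `apply_mapping` and `is_homomorphism` are hypothetical, and side constraints (b.i. preds) are not handled.

```python
def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")

def apply_mapping(h, atom):
    # Extend h from variables to atoms; constants are mapped to themselves.
    return (atom[0],) + tuple(h.get(t, t) if is_var(t) else t for t in atom[1:])

def is_homomorphism(h, q_from, q_to):
    # h must send the head to the head and every body atom to some body atom.
    (head_from, body_from), (head_to, body_to) = q_from, q_to
    return (apply_mapping(h, head_from) == head_to
            and all(apply_mapping(h, a) in body_to for a in body_from))

q1 = (("p", "?d"), [("movies", "?t", "?d", "Jane"),
                    ("directory", "Plaza", "?t", "19:30"),
                    ("location", "Plaza", "?a", "01-58776655")])
q2 = (("p", "?z"), [("movies", "?t", "?z", "?a"),
                    ("directory", "Plaza", "?t", "19:30")])

h = {"?t": "?t", "?z": "?d", "?a": "Jane"}
ok = is_homomorphism(h, q2, q1)
```

With the h from the example, `ok` is `True`: both body atoms of q2 land on the first two atoms of q1, and p(z) is mapped to p(d).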


Because of this characterization, such a homomorphism is also called a containment mapping from q2 to q1

Intuition: q1 is contained in q2 iff

• It has ‘same or more atoms’

• It may have some constants where q2 has variables


Another characterization:

For a rule p(..) :- r1(..), …, rk(..)

a model is a set of facts over p, r1, .., rk that satisfies the rule as a logical formula (assuming all variables are universally quantified)

Thm: the following are equivalent:

The important & useful characterization: homomorphism, i.e., containment mapping


Algorithm and complexity:

• To decide if q1 is contained in q2, search for a containment mapping from the variables of q2 to the variables and constants of q1: easy & fast in many cases, exponential in worst case

• The containment is in NP:

given a mapping on the variables of q2 , it is easy to check it is a homomorphism to q1
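The search for a containment mapping can be sketched as a simple backtracking procedure. This is one possible implementation under assumed conventions (atoms as tuples, variables as strings starting with `?`, no b.i. preds), not the algorithm from the slides; `unify`, `find_containment_mapping` and `contained_in` are hypothetical names.

```python
def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")

def unify(atom, target, h):
    # Try to extend h so that h(atom) == target; return the new mapping or None.
    if atom[0] != target[0] or len(atom) != len(target):
        return None
    h = dict(h)
    for s, t in zip(atom[1:], target[1:]):
        if is_var(s):
            if h.setdefault(s, t) != t:   # s is already mapped elsewhere
                return None
        elif s != t:                      # constants must map to themselves
            return None
    return h

def find_containment_mapping(q_from, q_to):
    (head_from, body_from), (head_to, body_to) = q_from, q_to

    def search(i, h):
        if i == len(body_from):
            return h
        for target in body_to:            # try every atom of q_to as the image
            h2 = unify(body_from[i], target, h)
            if h2 is not None:
                result = search(i + 1, h2)
                if result is not None:
                    return result
        return None                       # dead end: backtrack

    start = unify(head_from, head_to, {})
    return None if start is None else search(0, start)

def contained_in(q1, q2):
    # q1 is contained in q2 iff there is a containment mapping from q2 to q1.
    return find_containment_mapping(q2, q1) is not None

q1 = (("p", "?d"), [("movies", "?t", "?d", "Jane"),
                    ("directory", "Plaza", "?t", "19:30")])
q2 = (("p", "?z"), [("movies", "?t", "?z", "?a"),
                    ("directory", "Plaza", "?t", "19:30")])
```

In the worst case the search tries every assignment of body atoms of `q_from` to body atoms of `q_to`, which is where the exponential bound comes from. Checking one complete mapping, as in the NP argument, takes a single pass over the atoms.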


• It is NP-hard:

given a graph G, it is 3-colorable iff there is a homomorphism from G (represented as an edge relation) to the 3-clique

one can represent G as the body of q2 (using distinct variables for distinct nodes), the 3-clique as the body of q1

for both, the head can be q( )

• Hence, containment & equivalence are NP-complete (even for queries with no head variables*)

Note: this is expression complexity, not data complexity (here there is no db actually)

*(when such a query is applied to a db, it returns either {()}, or {})
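The reduction from 3-colorability can be illustrated with a small sketch. It assumes the graph is given as a list of edges, encodes each node u as the variable `?u` in the body of q2, and uses three color names as the nodes of the 3-clique in the body of q1 (treating them as constants is harmless, since they are only targets of the mapping). The function names are hypothetical, and both heads are q( ), so only the bodies matter.

```python
from itertools import permutations

def extend(h, atom, fact):
    # Both are ("e", _, _) atoms; every argument of `atom` is a variable.
    h = dict(h)
    for var, value in zip(atom[1:], fact[1:]):
        if h.setdefault(var, value) != value:
            return None
    return h

def hom_exists(body_from, body_to, h=None, i=0):
    # Backtracking search for a homomorphism between two bodies.
    h = {} if h is None else h
    if i == len(body_from):
        return True
    for fact in body_to:
        h2 = extend(h, body_from[i], fact)
        if h2 is not None and hom_exists(body_from, body_to, h2, i + 1):
            return True
    return False

def three_colorable(edges):
    # body(q2): one atom e(u, v) per edge, distinct variables for distinct nodes.
    q2_body = [("e", f"?{u}", f"?{v}") for u, v in edges]
    # body(q1): the 3-clique, with both orientations of every edge.
    q1_body = [("e", a, b) for a, b in permutations(("red", "green", "blue"), 2)]
    # G is 3-colorable iff there is a homomorphism from q2 to q1, i.e. q1 is contained in q2.
    return hom_exists(q2_body, q1_body)

triangle = [(1, 2), (2, 3), (1, 3)]
k4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

A homomorphism found this way is exactly a coloring: the image of `?u` is the color of node u, and adjacent nodes cannot share a color because the clique has no self-loops.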



Minimization of CQ’s:

For q, define a minimal equivalent query as any equivalent q’ with a minimal number of body atoms

Thm: the minimal equivalent query of q

• is unique up to isomorphism,

• and can be obtained by removing some atoms from body(q)

Proof:


Thus, for every CQ Q, there is a subset of the body that gives a minimal equivalent query

Called a core of Q

It is not necessarily unique (different subsets may yield cores), but all cores are isomorphic
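A core can be computed by trying to drop body atoms one at a time. The sketch below is one straightforward way to do this, under the same assumed encoding (atoms as tuples, variables as strings starting with `?`, no b.i. preds); the names `unify`, `hom_exists` and `minimize` are hypothetical. An atom is dropped when the query still maps homomorphically into the remaining body with the head kept fixed. The other containment direction always holds, because the smaller body is a subset of the larger one.

```python
def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")

def unify(atom, target, h):
    # Try to extend h so that h(atom) == target; return the new mapping or None.
    if atom[0] != target[0] or len(atom) != len(target):
        return None
    h = dict(h)
    for s, t in zip(atom[1:], target[1:]):
        if is_var(s):
            if h.setdefault(s, t) != t:
                return None
        elif s != t:
            return None
    return h

def hom_exists(head, body_from, body_to):
    # Is there a homomorphism from (head, body_from) to (head, body_to)?
    def search(i, h):
        if i == len(body_from):
            return True
        return any(h2 is not None and search(i + 1, h2)
                   for h2 in (unify(body_from[i], t, h) for t in body_to))
    return search(0, unify(head, head, {}))   # head variables map to themselves

def minimize(head, body):
    body = list(body)
    for atom in list(body):
        rest = [a for a in body if a != atom]
        if hom_exists(head, body, rest):      # still equivalent without `atom`
            body = rest
    return body

# q(x) :- r(x,y), r(x,z): one of the two atoms is redundant
core = minimize(("q", "?x"), [("r", "?x", "?y"), ("r", "?x", "?z")])
# q(x) :- r(x,y), r(y,x): already minimal
same = minimize(("q", "?x"), [("r", "?x", "?y"), ("r", "?y", "?x")])
```

Which atom survives in `core` depends on the order in which atoms are tried, which matches the remark that cores are unique only up to isomorphism.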


Containment & equivalence for extensions of CQ’s

Extension to UCQ’s: let

q1: r1: p(x) :- body1,1

…

rk: p(x) :- body1,k

q2: s1: p(x) :- body2,1

…

sm: p(x) :- body2,m

Thm: q1 is contained in q2 iff for each ri there is some sj such that ri is contained in sj

Proof: ⇐ is obvious

⇒: if q1 is contained in q2, then each ri is contained in q2

• q2(db(ri)) contains p(x)

• for some sj, sj(db(ri)) contains p(x)

⇒ sj contains ri


Containment algorithm:

For each ri, loop over sj, and search for a containment mapping from sj to ri

Still exponential in size (of both queries)
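The loop over the ri and sj can be sketched as follows. A UCQ is assumed to be a list of rules, each a (head, body) pair over the same encoding as before (atoms as tuples, variables as strings starting with `?`); `mapping_exists` and `ucq_contained_in` are hypothetical names, and b.i. preds are not handled.

```python
def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")

def unify(atom, target, h):
    # Try to extend h so that h(atom) == target; return the new mapping or None.
    if atom[0] != target[0] or len(atom) != len(target):
        return None
    h = dict(h)
    for s, t in zip(atom[1:], target[1:]):
        if is_var(s):
            if h.setdefault(s, t) != t:
                return None
        elif s != t:
            return None
    return h

def mapping_exists(rule_from, rule_to):
    # Is there a containment mapping from rule_from to rule_to?
    (head_from, body_from), (head_to, body_to) = rule_from, rule_to

    def search(i, h):
        if i == len(body_from):
            return True
        return any(h2 is not None and search(i + 1, h2)
                   for h2 in (unify(body_from[i], t, h) for t in body_to))

    start = unify(head_from, head_to, {})
    return start is not None and search(0, start)

def ucq_contained_in(q1_rules, q2_rules):
    # For each ri there must be some sj with a containment mapping from sj to ri.
    return all(any(mapping_exists(s, r) for s in q2_rules) for r in q1_rules)

Q1 = [(("p", "?x"), [("a", "?x"), ("b", "?x")]),
      (("p", "?x"), [("c", "?x"), ("d", "?x")])]
Q2 = [(("p", "?x"), [("a", "?x")]),
      (("p", "?x"), [("c", "?x")])]
```

In this example every rule of `Q1` is contained in a rule of `Q2`, but not the other way around.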

Complexity :

The containment problem is now

Explanation:

A relation R(..) is ptime if membership can be verified in ptime


For a UCQ Q we can also consider the canonical db of Q, denoted db(Q), obtained by taking the bodies of all the rules together as a db (with different existential variables in different rules)

Here also:

Thm:

Q1 is contained in Q2 iff Q2(db(Q1)) contains head(Q1)

(this also gives an algorithm for checking containment, which boils down to finding containment mappings)


Another extension of CQ’s: b.i. preds in the body

Example:

Q1: p(x, y) :- q(x, y), r(u, v) , u <= v

Q2: p(x, y) :- q(x, y) , r(u,v), r(v, u)

Is Q2 contained in/equivalent to Q1?

Q2 is equivalent to the union of

Q2,1: p(x, y) :- q(x, y) , r(u,v), r(v, u), u<= v

Q2,2: p(x, y) :- q(x, y) , r(u,v), r(v, u), v< u

Clearly, Q2,1 and Q2,2 are both contained in Q1

This can be generalized to an algorithm that reduces containment to that of UCQ’s (omitted)


Containment of a UCQ Q and a (recursive) Datalog program P:

Still decidable, but double exponential time (upper & lower bound)

Here also:

Thm:

P contains Q iff P(db(Q)) contains head(Q)

this gives an algorithm for checking containment:

apply P to db(Q), see if you obtain head(Q)
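As an illustration of this test, the sketch below applies a small recursive program to db(Q) using naive bottom-up evaluation. Everything here is an assumption made for the example: the encoding (atoms as tuples, variables as strings starting with `?`), the transitive-closure program P, the single-rule queries, and the helper names. It is not meant as an efficient evaluator.

```python
def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")

def freeze_atom(atom):
    # Replace each variable ?x by the constant c_x.
    return (atom[0],) + tuple("c_" + t[1:] if is_var(t) else t for t in atom[1:])

def match(atom, fact, h):
    # Try to extend valuation h so that h(atom) == fact.
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    h = dict(h)
    for s, t in zip(atom[1:], fact[1:]):
        if is_var(s):
            if h.setdefault(s, t) != t:
                return None
        elif s != t:
            return None
    return h

def valuations(body, facts, h):
    # Enumerate all valuations of `body` into `facts` that extend h.
    if not body:
        yield h
        return
    for fact in facts:
        h2 = match(body[0], fact, h)
        if h2 is not None:
            yield from valuations(body[1:], facts, h2)

def evaluate(rules, facts):
    # Naive bottom-up evaluation: apply all rules until nothing new is derived.
    facts = set(facts)
    while True:
        derived = set()
        for head, body in rules:
            for h in valuations(body, facts, {}):
                derived.add((head[0],) + tuple(h.get(t, t) for t in head[1:]))
        if derived <= facts:
            return facts
        facts |= derived

def program_contains(rules, query):
    head, body = query
    return freeze_atom(head) in evaluate(rules, {freeze_atom(a) for a in body})

# P: tc(x,y) :- e(x,y).   tc(x,y) :- e(x,z), tc(z,y).
P = [(("tc", "?x", "?y"), [("e", "?x", "?y")]),
     (("tc", "?x", "?y"), [("e", "?x", "?z"), ("tc", "?z", "?y")])]
Q_path = (("tc", "?x", "?y"), [("e", "?x", "?u"), ("e", "?u", "?y")])
Q_fork = (("tc", "?x", "?y"), [("e", "?x", "?u"), ("e", "?y", "?u")])
```

For `Q_path`, the evaluation derives tc(c_x, c_y) in the second round, so P contains the query. For `Q_fork`, no path leads from c_x to c_y and the head is never derived.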

(do you see exponentials in this algorithm?)

Containment of Datalog programs :

undecidable
