On Answering Queries in the Presence of Limited Access Patterns

Chen Li

Stanford University

joint work with Edward Chang, UC Santa Barbara

ICDT'2001, London, UK

A movie database

r(Star, Movie)

    Harrison Ford    Air Force One
    Henry Fonda      On Golden Pond
    Kevin Spacey     American Beauty

s(Movie, Award)

    On Golden Pond     Oscar, Best Actor
    On Golden Pond     Oscar, Best Actress
    American Beauty    Oscar, Best Picture

Q(Award) :- r(henry fonda, Movie), s(Movie, Award)

Limited access patterns

  • r(Star, Movie): should provide a star.
  • s(Movie, Award): should provide a movie.

Answering Q given the restrictions

Q(Award) :- r(henry fonda, Movie), s(Movie, Award)

The answer is complete

  • We did not retrieve all the tuples from the relations.
  • Still, we computed all tuples in the answer to the query.
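
The evaluation of Q under the two restrictions can be illustrated with a minimal Python sketch. The `Source` class, its `access` method and the `seen` bookkeeping are hypothetical names introduced only for this illustration; the sketch assumes both relations accept a value for their first attribute and return the matching tuples.

```python
class Source:
    """A relation that can only be queried through its binding pattern."""

    def __init__(self, pattern, tuples):
        self.pattern = pattern        # e.g. "bf": first attribute bound, second free
        self._tuples = set(tuples)    # full contents, never exposed directly
        self.seen = set()             # tuples retrieved so far

    def access(self, *args):
        """Pass a value for every 'b' position; None means "not supplied"."""
        if len(args) != len(self.pattern):
            raise ValueError("wrong number of arguments")
        for adornment, arg in zip(self.pattern, args):
            if adornment == "b" and arg is None:
                raise ValueError("a bound attribute needs a value")
        hits = {t for t in self._tuples
                if all(a is None or a == v for a, v in zip(args, t))}
        self.seen |= hits
        return hits


r = Source("bf", [("Harrison Ford", "Air Force One"),
                  ("Henry Fonda", "On Golden Pond"),
                  ("Kevin Spacey", "American Beauty")])
s = Source("bf", [("On Golden Pond", "Oscar, Best Actor"),
                  ("On Golden Pond", "Oscar, Best Actress"),
                  ("American Beauty", "Oscar, Best Picture")])

# Q(Award) :- r(henry fonda, Movie), s(Movie, Award)
awards = set()
for _, movie in r.access("Henry Fonda", None):   # the star is supplied
    for _, award in s.access(movie, None):       # the movie comes from r
        awards.add(award)

print(sorted(awards))   # ['Oscar, Best Actor', 'Oscar, Best Actress']
```

Only one tuple of r and two tuples of s are ever retrieved (`r.seen`, `s.seen`), yet no other tuple could contribute to Q, which is why the answer is complete.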

Change the restriction

  • We cannot compute the complete answer to Q.
  • There can always be some tuples that are not retrievable.

General questions
  • Given a query on relations with limited access patterns, can we compute its complete answer by accessing the relations with legal patterns?
    • Stable queries
  • Different classes of queries
  • Another problem studied: testing query containment in the presence of binding patterns.
Rest of the talk
  • Binding patterns, query stability
  • Testing stability of queries:
    • Conjunctive queries
    • Unions of conjunctive queries
    • Conjunctive queries with arithmetic comparisons
    • Datalog queries
  • Dynamic computability of complete answer to conjunctive queries
  • Conclusion and related work
(I) Binding patterns
  • Attributes with adornments:
    • b: bound
    • f: free
  • Example:

r(Star^b, Movie^f), s(Movie^b, Award^f)

  • A relation can have multiple binding patterns.
  • Reasons for the restrictions:
    • Web search forms
    • Legacy databases
    • Security concerns
  • Observations:

If a relation does not have an “all-free” binding pattern, then after certain queries are sent to this relation, there can always be some tuples that have not been retrieved.

Query stability
  • A query Q on relations with binding patterns is stable if for any database, we can compute Q’s complete answer by accessing the relations with legal patterns.
  • The complete answer is the answer we could compute if we could retrieve all the tuples from the relations.
  • Deriving the complete answer from only the retrievable tuples requires reasoning.
Assumptions about bindings
  • Use values from Q and results from the relations as bindings:
    • The definition says “for any database”
    • Relations not in the query can be assumed to be empty
  • Not allowed: try arbitrary strings as bindings to access the relations
    • Does not terminate
    • Impractical
(II) Testing stability of queries

Conjunctive query:

q(X) :- g_1(X_1), …, g_n(X_n)

  • Feasible order of some subgoals of a CQ Q.
    • Each subgoal in the order is executable
    • That is, we have enough bound variables to satisfy one binding pattern of the relation
  • Example:

Q(Award) :- r(henry fonda, Movie), s(Movie, Award)

Feasible CQs
  • A CQ is feasible if it has a feasible order of all its subgoals.
  • Lemma: A feasible CQ is stable.
  • Testing feasibility of a CQ
    • A greedy algorithm: Inflationary
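
A greedy feasibility test in the spirit of Inflationary can be sketched as follows. This is an illustration under assumptions, not the authors' code: subgoals are `(relation, args)` pairs, a term is treated as a variable when it starts with an uppercase letter, and `PATTERNS` maps each relation to its list of legal binding patterns. All of these conventions are hypothetical.

```python
def is_var(term):
    # Assumed convention: variables start with an uppercase letter.
    return term[:1].isupper()


def executable(subgoal, bound, patterns):
    """True if some binding pattern of the relation is satisfied by `bound`."""
    name, args = subgoal
    return any(
        all(ad == "f" or not is_var(a) or a in bound for ad, a in zip(p, args))
        for p in patterns[name]
    )


def inflationary(subgoals, patterns):
    """Greedily pick executable subgoals; return (feasible order, leftovers)."""
    bound, order, remaining = set(), [], list(subgoals)
    progress = True
    while progress:
        progress = False
        for g in list(remaining):
            if executable(g, bound, patterns):
                order.append(g)
                remaining.remove(g)
                bound.update(a for a in g[1] if is_var(a))  # its variables are now bound
                progress = True
    return order, remaining


PATTERNS = {"r": ["bf"], "s": ["bf"]}
Q = [("s", ("Movie", "Award")), ("r", ("henry fonda", "Movie"))]

order, rest = inflationary(Q, PATTERNS)
print(order)  # [('r', ('henry fonda', 'Movie')), ('s', ('Movie', 'Award'))]
print(rest)   # []
```

The query is feasible exactly when the leftover list is empty. Because executing a subgoal only ever adds bound variables, the greedy choice never rules out a later subgoal.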
What if Q is not feasible?

Q’(Award) :- r(henry fonda, Movie), s(Movie, Award), r(Star, Movie)

  • Not feasible: variable Star cannot be bound
  • Equivalent to the old query:

Q(Award) :- r(henry fonda, Movie), s(Movie, Award)

  • The new query Q’ is stable!
Testing stability of a CQ

Theorem:

A CQ Q is stable iff its minimal equivalent Q_m is feasible.

  • Minimal equivalent query Q_m
  • Q_m is unique
main idea of the proof
Database D1

Database D2

Main idea of the proof
  • Construct two databases of the relations
  • They have the same observable tuples, but yield different answers to the query
  • Thus, we cannot tell whether the computed answer is complete or not

Same

observable

tuples

Different answers

to Q

Two algorithms for CQs
  • Algorithm CQStable
    • Minimize Q to get its minimal equivalent Q_m
    • Test feasibility of Q_m by calling Inflationary
  • Algorithm CQStable*
    • Compute all executable subgoals of Q
    • If all subgoals become executable, then Q is stable
    • Otherwise, test equivalence between Q and the new query with the executable subgoals
  • CQStable* is more efficient than CQStable
  • Testing stability of a CQ is NP-complete.
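
The three steps of CQStable* can be sketched as below. This is a rough illustration under assumptions: the query encoding (`head` as a tuple of terms, `body` as `(relation, args)` pairs, uppercase-initial variables) and all function names are hypothetical. The equivalence test uses the standard containment-mapping check for conjunctive queries. Since the executable subgoals are a subset of Q's body, only one containment direction has to be searched. The search is a plain backtracking search and can take exponential time, consistent with the NP-completeness noted above.

```python
def is_var(term):
    return term[:1].isupper()   # assumed naming convention for variables


def executable_part(body, patterns):
    """Split the body into (executable subgoals, non-executable subgoals)."""
    bound, done, todo = set(), [], list(body)
    progress = True
    while progress:
        progress = False
        for name, args in list(todo):
            if any(all(ad == "f" or not is_var(a) or a in bound
                       for ad, a in zip(p, args)) for p in patterns[name]):
                done.append((name, args))
                todo.remove((name, args))
                bound.update(a for a in args if is_var(a))
                progress = True
    return done, todo


def unify(src_args, dst_args, mapping):
    """Extend `mapping` so that src_args maps onto dst_args, or return None."""
    mapping = dict(mapping)
    for a, b in zip(src_args, dst_args):
        if is_var(a):
            if mapping.setdefault(a, b) != b:
                return None
        elif a != b:            # constants must map to themselves
            return None
    return mapping


def containment_mapping(head, src_body, dst_body):
    """Is there a mapping fixing the head and sending src_body into dst_body?"""
    def extend(i, mapping):
        if i == len(src_body):
            return True
        name, args = src_body[i]
        for dname, dargs in dst_body:
            if dname == name and len(dargs) == len(args):
                m = unify(args, dargs, mapping)
                if m is not None and extend(i + 1, m):
                    return True
        return False

    start = unify(head, head, {})   # head variables map to themselves
    return extend(0, start)


def cq_stable_star(head, body, patterns):
    done, todo = executable_part(body, patterns)
    if not todo:                    # every subgoal is executable
        return True
    return containment_mapping(head, body, done)


q_prime = [("r", ("henry fonda", "Movie")), ("s", ("Movie", "Award")),
           ("r", ("Star", "Movie"))]
print(cq_stable_star(("Award",), q_prime, {"r": ["bf"], "s": ["bf"]}))  # True
```

For Q’, the non-executable subgoal r(Star, Movie) maps onto r(henry fonda, Movie), so Q’ is equivalent to its executable part and is reported stable.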
Other classes of queries
  • Unions of CQs: two algorithms
  • CQs with arithmetic comparisons:
    • An algorithm for testing stability
  • Datalog queries:
    • Undecidable
    • We give a sufficient condition for the stability of Datalog queries
(III) Dynamic computability of complete answer to CQs

Even if a CQ Q is not stable, its complete answer might still be computable for certain databases.

An example

Q1: ans(B) :- r(a,B,C),s(C,D)

  • Not stable
  • For the following database, we can still compute Q1’s complete answer: {b1,b2}.

r(A^b, B^f, C^f)

    a    b1    c1
    a    b2    c2
    a    b2    c3

s(C^f, D^b)

    c1    d1
    c2    d2

p(D^f)

    d1
    d2
Change the head argument

Q2: ans(D) :- r(a,B,C),s(C,D)

  • Still not stable
  • For the same relations r(A^b, B^f, C^f), s(C^f, D^b), p(D^f) and the same database as in the Q1 example, we cannot compute Q2’s complete answer.
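
Why Q2 fails on this database can be illustrated with the two-database argument from the proof idea. The sketch below is an assumption-laden illustration, not the construction from the talk: `D2` adds a hypothetical tuple s(c3, d3), and `observable` assumes that any value already retrieved may be tried as a binding.

```python
def observable(db):
    """Tuples reachable through r^bff, s^fb, p^f, starting from the constant a."""
    known = {"a"}                     # constants that appear in the query
    seen = set()
    while True:
        new = {("p", t) for t in db["p"]}                      # p needs no binding
        new |= {("r", t) for t in db["r"] if t[0] in known}    # A must be bound
        new |= {("s", t) for t in db["s"] if t[1] in known}    # D must be bound
        if new <= seen:               # fixpoint: nothing new was retrieved
            return seen
        seen |= new
        known |= {v for _, t in seen for v in t}


def full_answer_q2(db):
    """ans(D) :- r(a,B,C), s(C,D), evaluated with unrestricted access."""
    return {d for (a, _, c) in db["r"] if a == "a"
              for (c2, d) in db["s"] if c2 == c}


D1 = {"r": {("a", "b1", "c1"), ("a", "b2", "c2"), ("a", "b2", "c3")},
      "s": {("c1", "d1"), ("c2", "d2")},
      "p": {("d1",), ("d2",)}}
D2 = {**D1, "s": D1["s"] | {("c3", "d3")}}   # hypothetical extra tuple

print(observable(D1) == observable(D2))      # True
print(sorted(full_answer_q2(D1)))            # ['d1', 'd2']
print(sorted(full_answer_q2(D2)))            # ['d1', 'd2', 'd3']
```

The value d3 is never returned by any legal access, so the two databases look identical from the outside while Q2's answer differs.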

Difference between Q1 and Q2

Binding patterns: r(A^b, B^f, C^f), s(C^f, D^b)

Q1: ans(B) :- r(a,B,C),s(C,D)

Q2: ans(D) :- r(a,B,C),s(C,D)

  • Q1’s head argument B is bound by the executable subgoal r(a,B,C).
  • Q2’s head argument D is not bound by the executable subgoal r(a,B,C).
Generalization

q(X) :- g_1(X_1), …, g_k(X_k), g_{k+1}(X_{k+1}), …, g_n(X_n)

  • Executable subgoals: E = g_1(X_1), …, g_k(X_k)
  • If all arguments in X are bound in E:
    • we might compute its complete answer.
    • The computability is database dependent.
  • If some arguments in X are not bound in E:
    • we can never compute its complete answer.
    • Unless the relation obtained after evaluating the subgoals in E is empty.
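
The database-dependent case can be made concrete for Q1, whose head argument B is bound by the executable subgoal r(a,B,C). The sketch below is a simple sufficient check, not the planning procedure of the talk: every answer must be a B value of some retrieved r tuple, so once all such B values are confirmed the answer is known to be complete. The function name and the use of p's values as bindings for s are assumptions made for this illustration.

```python
def answer_q1(r, s, p):
    """ans(B) :- r(a,B,C), s(C,D). Returns (answers found, known to be complete)."""
    r_seen = {t for t in r if t[0] == "a"}        # r(A^b, B^f, C^f) with A = a
    d_values = {d for (d,) in p}                  # p(D^f): no binding required
    s_seen = {t for t in s if t[1] in d_values}   # s(C^f, D^b): one call per known D
    joining_c = {c for c, _ in s_seen}
    answers = {b for _, b, c in r_seen if c in joining_c}
    candidates = {b for _, b, _ in r_seen}        # upper bound on the answer
    return answers, answers == candidates


R = {("a", "b1", "c1"), ("a", "b2", "c2"), ("a", "b2", "c3")}
S = {("c1", "d1"), ("c2", "d2")}
P = {("d1",), ("d2",)}

answers, complete = answer_q1(R, S, P)
print(sorted(answers), complete)    # ['b1', 'b2'] True
```

If the tuple (c2, d2) is dropped from s, only b1 is confirmed and the check reports False: b2 might still be an answer through an s tuple that cannot be retrieved.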
A decision tree
  • It guides the planning process of computing the complete answer to a query.
  • Two approaches while traversing the tree:
    • optimistic
    • pessimistic
Conclusion
  • Stability of queries with binding patterns
  • Various classes of queries:
    • CQs (two algorithms)
    • Unions of CQs (two algorithms)
    • CQs with arithmetic comparisons (one algorithm)
    • Datalog (undecidable)
  • Dynamic computability of a CQ’s complete answer
  • Another contribution: decidability result of testing relative query containment with binding restrictions
Related work
  • Answering queries using views with binding patterns [RSU95]
  • Query optimization [YLUGM99,FLMS99]
  • Computing maximal answer to queries [DL97,LC00]

Our work considers whether the complete answer to a query is computable.
