Lecture 25 query optimization
This presentation is the property of its rightful owner.
Sponsored Links
1 / 28

Lecture 25: Query Optimization PowerPoint PPT Presentation


  • 73 Views
  • Uploaded on
  • Presentation posted in: General

Lecture 25: Query Optimization. Wednesday, November 26, 2003. Outline. Cost-based optimization 16.5, 16.6. Cost-based Optimizations. Main idea: apply algebraic laws, until estimated cost is minimal Practically: start from partial plans, introduce operators one by one

Download Presentation

Lecture 25: Query Optimization

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Lecture 25 query optimization

Lecture 25:Query Optimization

Wednesday, November 26, 2003


Outline

Outline

  • Cost-based optimization 16.5, 16.6


Cost based optimizations

Cost-based Optimizations

  • Main idea: apply algebraic laws, until estimated cost is minimal

  • Practically: start from partial plans, introduce operators one by one

    • Will see in a few slides

  • Problem: there are too many ways to apply the laws, hence too many (partial) plans


Cost based optimizations1

Cost-based Optimizations

Approaches:

  • Top-down: the partial plan is a top fragment of the logical plan

  • Bottom up: the partial plan is a bottom fragment of the logical plan


Search strategies

Search Strategies

  • Branch-and-bound:

    • Remember the cheapest complete plan P seen so far and its cost C

    • Stop generating partial plans whose cost is > C

    • If a cheaper complete plan is found, replace P, C

  • Hill climbing:

    • Remember only the cheapest partial plan seen so far

  • Dynamic programming:

    • Remember the all cheapest partial plans


Dynamic programming

Dynamic Programming

Unit of Optimization: select-project-join

  • Push selections down, pull projections up


Join trees

Join Trees

  • R1 ⋈ R2 ⋈ …. ⋈ Rn

  • Join tree:

  • A plan = a join tree

  • A partial plan = a subtree of a join tree

R3

R1

R2

R4


Types of join trees

Types of Join Trees

  • Left deep:

R4

R2

R5

R3

R1


Types of join trees1

Types of Join Trees

  • Bushy:

R3

R2

R4

R5

R1


Types of join trees2

Types of Join Trees

  • Right deep:

R3

R1

R5

R2

R4


Problem

Problem

  • Given: a query R1 ⋈ R2 ⋈ … ⋈ Rn

  • Assume we have a function cost() that gives us the cost of every join tree

  • Find the best join tree for the query


Dynamic programming1

Dynamic Programming

  • Idea: for each subset of {R1, …, Rn}, compute the best plan for that subset

  • In increasing order of set cardinality:

    • Step 1: for {R1}, {R2}, …, {Rn}

    • Step 2: for {R1,R2}, {R1,R3}, …, {Rn-1, Rn}

    • Step n: for {R1, …, Rn}

  • It is a bottom-up strategy

  • A subset of {R1, …, Rn} is also called a subquery


Dynamic programming2

Dynamic Programming

  • For each subquery Q ⊆ {R1, …, Rn} compute the following:

    • Size(Q)

    • A best plan for Q: Plan(Q)

    • The cost of that plan: Cost(Q)


Dynamic programming3

Dynamic Programming

  • Step 1: For each {Ri} do:

    • Size({Ri}) = B(Ri)

    • Plan({Ri}) = Ri

    • Cost({Ri}) = (cost of scanning Ri)


Dynamic programming4

Dynamic Programming

  • Step i: For each Q ⊆ {R1, …, Rn} of cardinality i do:

    • Compute Size(Q) (later…)

    • For every pair of subqueries Q’, Q’’ s.t. Q = Q’  Q’’compute cost(Plan(Q’) ⋈ Plan(Q’’))

    • Cost(Q) = the smallest such cost

    • Plan(Q) = the corresponding plan


Dynamic programming5

Dynamic Programming

  • Return Plan({R1, …, Rn})


Dynamic programming6

Dynamic Programming

To illustrate, we will make the following simplifications:

  • Cost(P1 ⋈ P2) = Cost(P1) + Cost(P2) + size(intermediate result(s))

  • Intermediate results:

    • If P1 = a join, then the size of the intermediate result is size(P1), otherwise the size is 0

    • Similarly for P2

  • Cost of a scan = 0


Dynamic programming7

Dynamic Programming

  • Example:

  • Cost(R5 ⋈ R7) = 0 (no intermediate results)

  • Cost((R2 ⋈ R1) ⋈ R7) = Cost(R2 ⋈ R1) + Cost(R7) + size(R2 ⋈ R1) = size(R2 ⋈ R1)


Dynamic programming8

Dynamic Programming

  • Relations: R, S, T, U

  • Number of tuples: 2000, 5000, 3000, 1000

  • Size estimation: T(A ⋈ B) = 0.01*T(A)*T(B)


Dynamic programming9

Dynamic Programming

  • Summary: computes optimal plans for subqueries:

    • Step 1: {R1}, {R2}, …, {Rn}

    • Step 2: {R1, R2}, {R1, R3}, …, {Rn-1, Rn}

    • Step n: {R1, …, Rn}

  • We used naïve size/cost estimations

  • In practice:

    • more realistic size/cost estimations (next time)

    • heuristics for Reducing the Search Space

      • Restrict to left linear trees

      • Restrict to trees “without cartesian product”

    • need more than just one plan for each subquery:

      • “interesting orders”


Reducing the search space

Reducing the Search Space

  • Left-linear trees v.s. Bushy trees

  • Trees without cartesian product

    Example: R(A,B) ⋈ S(B,C) ⋈ T(C,D)

    Plan: (R(A,B) ⋈ T(C,D)) ⋈ S(B,C) has a cartesian product – most query optimizers will not consider it


Counting the number of join orders

Counting the Number of Join Orders

  • The mathematics we need to know:

    Given x1, x2, ..., xn, how many permutations can we construct ? Answer: n!

  • Example: permutations of x1x2x3x4 are: x1x2x3x4x2x4x3x1.. . . (there are 4! = 4*3*2*1 = 24 permutations)


Counting the number of join orders1

Counting the Number of Join Orders

  • The mathematics we need to know:

    Given the product x0x1x2...xn, in how many ways can we place n pairs of parenthesis around them ? Answer: 1/(n+1)*C2nn = (2n)!/((n+1)*(n!)2)

  • Example: for n=3 ((x0x1)(x2x3))((x0(x1x2))x3)(x0((x1x2)x3)) ((x0x1)x2)x3)(x0(x1(x2x3)))

  • There are 6!/(4*3!*3!) = 5 ways


Counting the number of join orders exercise

Counting the Number of Join Orders (Exercise)

R0(A0,A1) ⋈ R1(A1,A2) ⋈ . . . ⋈ Rn(An,An+1)

  • The number of left linear join trees is:

  • The number of left linear join trees without cartesian products is:

  • The number of bushy join trees is:

  • The number of bushy join trees without cartesian product is:


Number of subplans inspected by dynamic programming

Number of Subplans Inspected by Dynamic Programming

R0(A0,A1) ⋈ R1(A1,A2) ⋈ . . . ⋈ Rn(An,An+1)

  • The number of left linear subplans inspected is:

  • The number of left linear subplans without cartesian products inspected is:

  • The number of bushy join subplans inspected is:

  • The number of bushy join subplans without cartesian product:


Lecture 25 query optimization

Happy Thanksgivings !


  • Login