Query Optimization
This presentation is the property of its rightful owner.
Sponsored Links
1 / 10

Query Optimization Advanced Databases By Dr. Akhtar Ali PowerPoint PPT Presentation


  • 60 Views
  • Uploaded on
  • Presentation posted in: General

Query Optimization Advanced Databases By Dr. Akhtar Ali. Overview of Query Optimization. Logical Optimization: Using some transformation rules and algebraic equivalences To choose different join orders R ⋈ S  S ⋈ R (Commutative) (R ⋈ S) ⋈ T  R ⋈ ( S ⋈ T) (Associative)

Download Presentation

Query Optimization Advanced Databases By Dr. Akhtar Ali

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Query optimization advanced databases by dr akhtar ali

Query Optimization

Advanced Databases

By

Dr. Akhtar Ali


Overview of query optimization

Overview of Query Optimization

  • Logical Optimization: Using some transformation rules and algebraic equivalences

    • To choose different join orders

      • R ⋈S  S ⋈R (Commutative)

      • (R ⋈S) ⋈T  R ⋈ (S ⋈T) (Associative)

    • To push selections and projections bellow of joins

      • (attr1 op value (R) ) ⋈S  attr1 op value(R ⋈S )

      • attr1, attr2, attr3 ((attr1, attr2 (R) ) ⋈S )  attr1, attr2,attr3 (R ⋈S )

  • Physical Optimization: For a given R.A expression

    • There could be several different plans possible using different implementation of the R.A operators

    • Calculate the cost of these different plans and choose the best

    • Ideally, we want the best plan, but

    • Practically, we should avoid worst plans


Pushing selection projection below join

Pushing Selection & Projection below Join

  • A join is quite expensive operation.

    • The cost can be reduced by reducing the sizes of the input relations.

  • The sizes of the Input Relations can be reduced by applying:

    • Selection: restricting the input relation.

    • Projection: reducing the number of columns.

  • Usually Selection reduces the size of the input relation more than Projection.

  • Projection before the Join should be done quite carefully as the cost of Projection could increase the overall cost if it does not reduce by a good factor the size of the input.


Estimating the evaluation cost of a plan

Estimating the Evaluation Cost of a Plan

  • For each plan, we should be able to estimate the overall cost.

  • For each node in the query tree, we estimate the cost of performing the corresponding operation;

  • For each node in the query tree, we estimate the size of the result, which is used by the operation in the parent node;

  • In order to correctly estimate the cost of each operation and the size of its result, the optimizer uses certain statistical information maintained by the DBMS.


Statistics maintained by a dbms

Statistics Maintained by a DBMS

  • Cardinality

    • The number of tuples for each relation.

  • Size

    • The number of pages for each relation.

  • Index Cardinality

    • The number of distinct key values for each index.

  • Index Size

    • The number of pages for each index.

  • Index Height

    • The number of non-leaf levels for each tree index.

  • Index Range

    • The minimum present key value and the maximum present key value for each index.


A motivating example

A motivating example

SELECT S.sname

FROM Reserves R, Sailors S

Where R.sid = S.sid

AND R.bid = 100

AND S.rating > 5

  • Cost = 500 + 500 * 1000 I/Os (Using nested loops – page-oriented)

    • 500,500 I/Os

    • This is not an efficient plan.

  • We could have pushed selections down before join, no index is used.

  • Goal of Optimization:

    • To find more efficient plans that compute the same answer.


Optimization alternative 1 no index

Optimization: Alternative 1 (no index)

  • Push Selects before join

  • Assuming that there are 1000 tuples in Reserves with bid=100 and 20000 tuples in Sailors with rating > 5. So Cost of Selections:

    • Scan Reserves (1000 pages) and write the selected 1000 tuples to temp relation T1 (10 pages), so in total 1010 I/Os

    • Scan Sailors (500 pages) and write the selected 20000 tuples to temp relation T2 (250 pages), so in total 750 I/Os

  • Total cost = 1010 + 750 = 1760 I/Os so far

  • Using SNL the cost = 10 + 10 * 250 = 2510 I/Os

  • Total cost = 1760 + 2510 = 4270 I/Os about 117 times less than the cost of the initial plan i.e. 500,500 I/Os


  • Optimization alternative 2 uses indexes

    Optimization: Alternative 2 (uses indexes)

    • The same RA as alternative 1

    • Using a clustered Hash index on Reserves:

      • A hash index will take 1 plus 10 to retrieve the 1000 qualifying tuples

      • Cost of Selection = 1 + 10 = 11 I/Os

      • Cost of Writing to T1 = 10 I/Os

      • Sub-Total = 11 +10 = 21

      • The size of T1 = 10 pages

    • Using a clustered B+ tree index on Sailors:

      • Cost of Selection = 2 (the constant cost) + 250 = 252 I/Os, Cost of Writing to T2 = 250 I/Os, Sub-Total = 252 + 250 = 502, and the size of T2 = 250 pages

    • Using SNL the cost is:

      • Cost = 10 + 10 * 250 = 2510 I/Os

    • Total cost = 21 + 502 + 2510 = 3033 I/Os, which is 165 times less than the cost of the initial plan i.e. 500,500 I/Os


    Optimization alternative 3 using pipelining i e intermediate results are not written to disk

    Optimization: Alternative 3(using pipelining i.e. intermediate results are not written to disk)

    • The same RA as alternative 1

    • Using a clustered Hash index on Reserves:

      • A hash index will take 1 plus 10 to retrieve the 1000 qualifying tuples

      • Cost of Selection = 1 + 10 = 11 I/Os

      • The size of T1 = 10 pages

    • Using a clustered B+ tree index on Sailors:

      • Cost of Selection = 2 (the constant cost) + 250 = 252 I/Os

      • The size of T2 = 250 pages

    • Using Block-Nested Loops join with 7 buffer pages:

      • Cost =

    • Total cost = 11 + 252 + 510 = 773 I/Os, which is 647 times faster than the initial plan and about 4 times faster than the previous one.


    Summary

    Summary

    • Query optimization is very important component of a DBMS.

    • We must understand the principals of optimization to know how it influences the performance of a database system.

    • Two ways of optimization:

      • Logical: Push projection/selection below the join

      • Physical: Using indexes and better implementation of relational operators.


  • Login