Query optimization
This presentation is the property of its rightful owner.
Sponsored Links
1 / 16

Query Optimization PowerPoint PPT Presentation


  • 55 Views
  • Uploaded on
  • Presentation posted in: General

Query Optimization. Allison Griffin. Importance of Optimization. Time is money Queries are faster Helps everyone who uses the server Solution to speed lies in the algorithm Different performance improvements with different database engines and schemas. Brief History.

Download Presentation

Query Optimization

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Query optimization

Query Optimization

Allison Griffin


Importance of optimization

Importance of Optimization

  • Time is money

  • Queries are faster

  • Helps everyone who uses the server

  • Solution to speed lies in the algorithm

  • Different performance improvements with different database engines and schemas


Brief history

Brief History

  • Before 1970’s: Dark days, manual optimization

  • Late 70’s to mid 80’s:

    • Birth of relational data model and declarative SQL

    • Optimization is job of system

    • System R-beginning work on join order optimization

    • Dynamic Programming: Heuristic Optimizers

  • Mid 80’s to early 90’s:

    • Extensible query optimization (Exodus)

  • Mid 90’s to late 90’s:

    • Materialized Views


Volcano extensible query optimizer generator

Volcano Extensible Query Optimizer Generator

  • General purpose cost based query optimizer, based on equivalence rules in algebra

    • Equivalences: join associativity, select push down, aggregate push down

    • Extensible: new operations and equivalences can be easily added

    • Developed by Graefe and McKenna 1993


Materialized views

Materialized Views

  • Can materialize (pre-compute and store) views to speed up queries

    • Incremental maintenance

      • when database is updated, propagate updates to materialized view without complete re-computation

    • Deciding when to use materialized views

      • even if query does not refer to materialized view, optimizer can figure out it can be used


Deciding what to materialize

Deciding What to Materialize

  • Maintenance cost and query cost

    • Workload depends on what is materialized:

      • queries and update transactions

      • weights for each component of workload

  • Goal: find set of views that gives minimum cost if materialized, subject to space constraints


What we already know

What we already know…

  • Query optimizer analyzes set of query execution plans and gives optimal (least cost)

    • Heavily dependent on optimizer’s estimate for number of rows that will result at each step of QEP

    • Estimates rely on statistics typically stored in histograms


Recent approaches to improve statistics

Recent Approaches to Improve Statistics

  • Paper “Distinct-Value Synopses for Multiset Operations” by Kevin Beyer, Rainer Gemulla, Peter J. Haas, Berthold Reinwalk, and Yannis Sismanis, 2007

  • IBM’s LEO (Learning Empirical Results in Query Optimization), 2001


Summary of paper results

Summary of Paper Results

  • Addresses the problem of efficient estimate of number of distinct values of an attribute

  • Builds on leveraging of randomized algorithms

  • Claim to have unbiased estimator for distinct values with lower mean squared error

    • Past attempts tend to by higher than the actual number so they have come up with way to cut that number down to be more reasonable


Distinct value estimation

Distinct-Value Estimation

  • Propose summary structure (synopsis) for a relation

    • Synopsis can be used to estimate number of DVs in the partition

    • Synopses can be combined to create synopses for compound partitions created from base partitions using multiset union, intersection or difference operations

    • Updates can be performed on compound partitions by using synopses from base relations


Leo learning emperical results in query optimization

LEO - Learning Emperical Results in Query Optimization

  • Autonomic feedback loops that create a self-tuning database query optimizer

  • Self-validates and adjusts to improve query optimization and execution without requiring user interaction to repair incorrect statistics or cardinality estimates

  • Reduces the total cost of owning database management systems by simplifying database administration


How it works

How it works

  • Monitors queries as they execute

  • Compares the optimizer’s estimates with actuals at each step in a QEP

  • Then computes adjustments to its estimates that may be used during future optimizations of similar queries

  • Moreover, estimation errors can also trigger re-optimization of a query in mid-execution.


Challenges in research of leo

Challenges in Research of LEO

  • (1) ensuring stability and convergence of the autonomic system

  • (2) guaranteeing consistency of the overall optimizer's model upon refinements


Results

Results

  • Reduction of query execution time by orders of magnitude at negligible additional run-time cost

  • Reduced administration time

  • Fewer problem queries

  • Overall improved query performance with increased robustness and predictability of query response times


Bibliography

Bibliography

  • “LEO-Learning Empirical Results in Query Optimization.” IBM. <http://domino.watson.ibm.com/comm/research.nsf/pages/r.datamgmt.innovation.html>.

  • “Optimizing for Query Speed”. SQL. <http://www.devshed.com/c/a/MySQL/Optimizing-for-Query-Speed/1/

  • “Optimizing Database Queries”. IBM. <http://www.stevengould.org/portfolio/developerWorks/efficientPHP/wa-effphp/wa-effphp-4-1.html>.

  • “Optimize Queries Theory in Practice”. <http://www.serverwatch.com/tutorials/article.php/2175621/How-to-Optimize-Queries-Theory-an-Practice.htm>.

  • Beyer, Kevin, Gemulla, Rainer, Haas, Peter J., Reinwald, Berthold, Sismani, Yannis. “Distinct-Value Synopses for Multiset Operations”. Communications of the ACM. Vol. 52. October 2009.

  • Chaudhuri, Surajit. “Technical Perspective: Relational Query Optimization-Data Management Meets Statistical Estimation”. Communications of the ACM. Vol. 52. October 2009.


  • Login