Query optimization
1 / 16

Query Optimization - PowerPoint PPT Presentation

  • Uploaded on

Query Optimization. Allison Griffin. Importance of Optimization. Time is money Queries are faster Helps everyone who uses the server Solution to speed lies in the algorithm Different performance improvements with different database engines and schemas. Brief History.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Query Optimization' - pisces

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Query optimization

Query Optimization

Allison Griffin

Importance of optimization
Importance of Optimization

  • Time is money

  • Queries are faster

  • Helps everyone who uses the server

  • Solution to speed lies in the algorithm

  • Different performance improvements with different database engines and schemas

Brief history
Brief History

  • Before 1970’s: Dark days, manual optimization

  • Late 70’s to mid 80’s:

    • Birth of relational data model and declarative SQL

    • Optimization is job of system

    • System R-beginning work on join order optimization

    • Dynamic Programming: Heuristic Optimizers

  • Mid 80’s to early 90’s:

    • Extensible query optimization (Exodus)

  • Mid 90’s to late 90’s:

    • Materialized Views

Volcano extensible query optimizer generator
Volcano Extensible Query Optimizer Generator

  • General purpose cost based query optimizer, based on equivalence rules in algebra

    • Equivalences: join associativity, select push down, aggregate push down

    • Extensible: new operations and equivalences can be easily added

    • Developed by Graefe and McKenna 1993

Materialized views
Materialized Views

  • Can materialize (pre-compute and store) views to speed up queries

    • Incremental maintenance

      • when database is updated, propagate updates to materialized view without complete re-computation

    • Deciding when to use materialized views

      • even if query does not refer to materialized view, optimizer can figure out it can be used

Deciding what to materialize
Deciding What to Materialize

  • Maintenance cost and query cost

    • Workload depends on what is materialized:

      • queries and update transactions

      • weights for each component of workload

  • Goal: find set of views that gives minimum cost if materialized, subject to space constraints

What we already know
What we already know…

  • Query optimizer analyzes set of query execution plans and gives optimal (least cost)

    • Heavily dependent on optimizer’s estimate for number of rows that will result at each step of QEP

    • Estimates rely on statistics typically stored in histograms

Recent approaches to improve statistics
Recent Approaches to Improve Statistics

  • Paper “Distinct-Value Synopses for Multiset Operations” by Kevin Beyer, Rainer Gemulla, Peter J. Haas, Berthold Reinwalk, and Yannis Sismanis, 2007

  • IBM’s LEO (Learning Empirical Results in Query Optimization), 2001

Summary of paper results
Summary of Paper Results

  • Addresses the problem of efficient estimate of number of distinct values of an attribute

  • Builds on leveraging of randomized algorithms

  • Claim to have unbiased estimator for distinct values with lower mean squared error

    • Past attempts tend to by higher than the actual number so they have come up with way to cut that number down to be more reasonable

Distinct value estimation
Distinct-Value Estimation

  • Propose summary structure (synopsis) for a relation

    • Synopsis can be used to estimate number of DVs in the partition

    • Synopses can be combined to create synopses for compound partitions created from base partitions using multiset union, intersection or difference operations

    • Updates can be performed on compound partitions by using synopses from base relations

Leo learning emperical results in query optimization
LEO - Learning Emperical Results in Query Optimization

  • Autonomic feedback loops that create a self-tuning database query optimizer

  • Self-validates and adjusts to improve query optimization and execution without requiring user interaction to repair incorrect statistics or cardinality estimates

  • Reduces the total cost of owning database management systems by simplifying database administration

How it works
How it works

  • Monitors queries as they execute

  • Compares the optimizer’s estimates with actuals at each step in a QEP

  • Then computes adjustments to its estimates that may be used during future optimizations of similar queries

  • Moreover, estimation errors can also trigger re-optimization of a query in mid-execution.

Challenges in research of leo
Challenges in Research of LEO

  • (1) ensuring stability and convergence of the autonomic system

  • (2) guaranteeing consistency of the overall optimizer's model upon refinements


  • Reduction of query execution time by orders of magnitude at negligible additional run-time cost

  • Reduced administration time

  • Fewer problem queries

  • Overall improved query performance with increased robustness and predictability of query response times


  • “LEO-Learning Empirical Results in Query Optimization.” IBM. <http://domino.watson.ibm.com/comm/research.nsf/pages/r.datamgmt.innovation.html>.

  • “Optimizing for Query Speed”. SQL. <http://www.devshed.com/c/a/MySQL/Optimizing-for-Query-Speed/1/

  • “Optimizing Database Queries”. IBM. <http://www.stevengould.org/portfolio/developerWorks/efficientPHP/wa-effphp/wa-effphp-4-1.html>.

  • “Optimize Queries Theory in Practice”. <http://www.serverwatch.com/tutorials/article.php/2175621/How-to-Optimize-Queries-Theory-an-Practice.htm>.

  • Beyer, Kevin, Gemulla, Rainer, Haas, Peter J., Reinwald, Berthold, Sismani, Yannis. “Distinct-Value Synopses for Multiset Operations”. Communications of the ACM. Vol. 52. October 2009.

  • Chaudhuri, Surajit. “Technical Perspective: Relational Query Optimization-Data Management Meets Statistical Estimation”. Communications of the ACM. Vol. 52. October 2009.