1 / 34

An Overview of Query Optimization

An Overview of Query Optimization. Group Members: Shamila Amarasekera Kwame Peprah Betsy Tremaine Course: CS 485. History of Query Optimization. Databases were widely used by the early 1980’s Difficulty in using queries effectively

alexavier
Download Presentation

An Overview of Query Optimization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. An Overview of Query Optimization Group Members: Shamila Amarasekera Kwame Peprah Betsy Tremaine Course: CS 485

  2. History of Query Optimization • Databases were widely used by the early 1980’s • Difficulty in using queries effectively • Search for consistent queries led to first experiments with query optimization

  3. Query Processing in the DBMS • Four separate phases • Parsing • Optimization • Code Generation • Execution

  4. Process of Query Evaluation in a DBMS

  5. What is Query Optimization? Definition: • The process of selecting the most efficient selection plan from among many possible strategies.

  6. Issues Of Query Optimization • What plans to consider • How the cost of a plan is estimated Goal: We want to find the best plan possible

  7. Techniques Algorithm for evaluating relational operator: • Indexing • Iteration • Partitioning

  8. Contd... System Catalogs Provides information about the indexes and relations Of a given query. Catalog mostly consists of: • # distinct key value for each index • # of tuples for each relation Catalogs must be updated periodically

  9. Qualities of a good Query Optimizer • Search space must be able to include plans that have a low cost. • The costing technique must be accurate. • The enumeration algorithm that searches through the execution space must be efficient.

  10. What is System R? • First major relational system • First test for query optimization • Designed to show that effective query selection could be automated

  11. How System R Worked • Cost estimation • Estimated time to execute possible queries • Based on I/O access and projected CPU time • Maintained current list of table dependencies

  12. Query Optimization in System R • Produced left-deep plans • Only binary joints are considered • For a join of k relations, optimizer only considered plans that join kth relation with a subset of k-1 • Result: left-to-right-associative statements in relational algebra, where only one relationship at a time is considered Source: Chu et al. Least expected cost query optimization: an exercise in utility. From Symposium on Principles of Database Systems, May 1999.

  13. Features of System R • RSS: Research Storage System • RDS: Relational Data System • Goal: to select the query that uses the fewest possible I/O and RSSaccesses.

  14. How Does the Optimizer Select A Table? • Selectivity: Ratio of key values to items in table • Clustering: Order of items in physical storage • Taken together, these indicate how many page fetches and how muchbranching the query needs

  15. Joining Tables 2 options: • Scan first table for values, join with second based on the second's index • Sort rows of both tables by join fields, then search for matching values • Depends on whether or not the tables have indices

  16. Drawbacks of System R • Optimized individual statements: created a large amount of processing overheads. • Individual systems that used System R could hide too many details of the underlying database, making difficult work for database programmers.

  17. Query Optimization in other systems Although the relational model is still the most widely-used model for databases, other types of databases exist for both research and business purposes. Some to mention: • Deductive DBS • Object Oriented DBS • Multiple and distributed DBS

  18. Deductive Databases • Similar to relational databases • Query language has additional capability • Can express recursive queries • Use logic programming to support complex reasoning • Integrate rules and facts of traditional DBS • Used in supporting advanced applications in the areas of: • Computer aided design • Manufacturing • Office systems

  19. Deductive Databases, Glue-Nail • Ex: Glue-Nail DB have two complementary languages: • Declarative and procedural • Glue: Procedural language having query capabilities of • Updating, aggregation, input-output libraries, and procedural control • Nail: logic-based query language and uses • Function symbols to represent complex objects • Express powerful recursive queries • IGlue is a lower level language compiled in Glue-Nail • The query optimization occurs in evaluation of the IGlue code

  20. Deductive Databases, Glue-Nail Glue-Nail architecture Glue-Nail Modules Glue Compiler Nail Nail Compiler IGlue IGlue Static Optimizer IGlue Linker IGlue IGlue Interpreter Source: M. A. Derr. Adaptive Query Optimization in a Deductive Database Systems. EDB

  21. Deductive Databases, Glue-Nail • Optimizers used in Glue-Nail system: • Runtime query optimizer: • Improves system performance by reoptimizing query evaluation plan automatically • Creates and drops indexes automatically • IGlue adaptive optimizer: • Effective alternative reoptimization • Has a large performance gain

  22. Object-Oriented Databases (OODB) • OODB combine the data abstraction and computation models of object-oriented programming languages • C++/Java • Contain the performance and consistency features of databases • Developed to support powerful application domains • Engineering databases • Office information systems

  23. Object-Oriented Databases (OODB) • Currently much attention is given experimentally and theoretically • Some of the many questions to be answered include • lack of a common data model • lack of formal foundations • strong experimental activity

  24. Object-Oriented Databases person class person{ string name; string address; string get-name(); string getAddress(); int set-address(string new-address); }; class customer isa person{ int creditRating; }; class employee isa person { ……. }; class officer isa employee { ……… }; class teller isa employee { ……… }; class secretary isa employee { ……… }; person customer employee IS A officer teller secretary employee customer IS A officer teller secretary Specialization hierarchy Class hierarchy Definition of class hierarchy in pseudo code Source: Class text book

  25. Object-Oriented Databases person customer employee temporary permanent officer teller secretary permanent-officer permanent-teller permanent-secretary temporary-teller temporary-secretary Class hierarchy Multiple inheritance usually happen Source: Class text book

  26. Object-Oriented Databases (OODB) • Implementation of query optimization is a delicate process • Need to find an execution plan that minimizes the cost • Logical and physical levels are traditional in optimization • Logical: semantic properties (query writing) • Physical: find the best algorithm using the cost model

  27. Multiple and Distributed Databases • Multidatabase system (MDBS): An additional software layer on a heterogeneous database system that allows access and manipulation of information. • Distributed database systems (DDBS): A collection of logically-interrelated databases distributed over a computer network. • Multidatabase system (MDBS): A distributed system that acts as a front end to multiple local database management systems (DBMS).

  28. Distributed DatabaseUser view network Distributed Database Communication via network • The global system provides full functionality and interacts with local DBMS having an external use interface. • Since the queries involve more than one database, to have an efficient system performance, query optimization should be global

  29. Multiple and Distributed Databases • Several levels of (request and data) transactions are needed, • different local capabilities are assumed, • more dynamic query optimization techniques are needed, • local query optimization information may not be known at the global level, • response time for local request may not be predictable, • more constraints about where data can be transferred to and where an operation can be performed, need to be considered during global query optimization.

  30. Multiple and Distributed Databases • Many distributed query optimization tools are not suitable for MDBS. • Two-phase optimization approach, along with several global query optimization techniques, are used in MDBS. • Global query optimizations: • semantic, parametic, and adaptive query optimization.

  31. Other Query Optimization • Memory is an issue when determining execution plans. • Databases involve multimedia, and web uses Optimizers in fuzzy queries. • Fuzzy queries are flexible, easy-to-use design • Have information beyond the algebra and relational operators.

  32. Other Query Optimization Issues Contd.. • SQL extensions for decision support systems are widely used. • Query optimizers are used in decision support systems • uses cube as an extended language. • Cube/data cube: operator that generalizes the histograms, cross-tabulation, roll-up, drill-down, and sub-total constructs found in most report writers.

  33. Conclusion Query optimization • More than transformation and query equivalence. • Database designers must develop a robust cost metric, • but consider cost and search space. • Building extensible enumeration architecture is important • But very complex. • New advancements have improved • But brought problems • Understanding the existing engineering framework make effective query optimizations.

  34. Thank you Questions ?

More Related