1 / 44

An Insiders View of How the Optimizer Works

An Insiders View of How the Optimizer Works. Tang Tao Oracle University Principal Instructor .

yestin
Download Presentation

An Insiders View of How the Optimizer Works

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. An Insiders View of How the Optimizer Works Tang Tao Oracle University Principal Instructor

  2. The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

  3. Optimizer evolution CBO Databases became more feature rich As environment changes Potential for plan changes RULES Rule are not enough • In the beginningthere were rules • Optimizer evolved to • be cost based Databases become more real-time, ad-hoc environments CBO Reactive tuning with the use of advisors and auto jobs Reactive tuning not enough • Optimizer proactively adapts to become self-learning

  4. Adaptive Query Optimization Overview • Adaptive Query Optimization Adaptive PlansJoin methods Parallel distribution methods Adaptive StatisticsDiscovered at compile time Discovered at run time Adaptive Plans Adaptive Statistics Parallel distribution Methods Join Methods At compile time At run time

  5. Adaptive Execution Plans Query: Find all of the products with a unit price of 15 that we have sold more that 1 of Two possible join methods for this query Nested Loops Hash Join Adapt join methods

  6. Adaptive Execution Plans Rows coming out of order_items table are buffered up to a point If row count is less than the threshold use nested Loops otherwise switch to hash join Alternative sub-plans are pre-computed Sub-plans stored in the cursor Stats collect inserted before join Rows buffered until final decision is made Adapt join methods Table scan Prod_info NESTED LOOPS HASH JOIN Default Plan is a nested loops join Table scan Order _items Index Scan Prod_info_ind Statistics Collector

  7. Adaptive Execution Plans Statistics collector disabled after decision is made and becomes a pass through operation Number of rows seen in statistics collector exceeds threshold Plan switches to hash join Statistics collect disabled Plan resolved on first execution & remains the same for subsequent executions Adapt join methods Statistics Collector Table scan Prod_info NESTED LOOPS HASH JOIN Final Plan is a hash join Table scan Order _items Index Scan Prod_info_ind

  8. Adaptive Execution Plans Explain plan command always shows default plan Example shows a nested loops join as default plan No statistics collector shown in plan Displaying the default plan

  9. Adaptive Execution Plans After the statement has completed use DBMS_XPLAN.DISPLAY_CURSOR tosee the final plan selected Example shows that hash join picked at execution time Again the statistics collector is not visible in the plan Displaying the final plan

  10. Adaptive Execution Plans Full adaptive plan displayed when format parameter ‘+adaptive’ is set Example shows both the nested loops and hash join in the plan Displaying the full adaptive plan

  11. Adaptive Execution Plans Additional information displayed on why operations are inactive can be seen with format parameter ‘+report’ Displaying plan with +adaptive & +report formats

  12. Adaptive Execution Plans New column in V$SQL IS_RESOLVED_ADAPTIVE_PLAN Indicates statement had an adaptive plan which was resolved on first execution Resolved plan is used for subsequent executions Statistics collectors and buffering is disabled Indicator in V$SQL

  13. Adaptive Execution Plans Adaptive plans are enabled by default Can be put in reporting mode OPTIMIZER_ADAPTIVE_REPORTING_ONLY Reporting mode shows what would have happened during execution in the plan Reporting mode

  14. Adaptive Execution Plans Adaptive join methods • Join method decision deferred until runtime • Default plan is computed using available statistics • Alternate sub-plans are pre-computed and stored in the cursor • Statistic collectors are inserted at key points in the plan • Final decision is based on statistics collected during execution • Default plan and sub-plans have a valid range for statistics collected • If statistics prove to be out of range, the sub-plans will be swapped • Requires buffering near the swap point to avoid returning rows to user • Possible sub-plans are nested loop joins or hash joins and vice versa

  15. When user issues a parallel SQL statement theshadowprocess becomes the Query Coordinator User Parallel servers communicate among themselves & the QC using messages that are passed via memory buffers in the shared pool How the data is distributed among the processes is based on the operation being done & number of rows expected QC gets parallel servers from global pool and distributes the work to them Parallel servers - individual sessions that perform work in parallel Adaptive Execution Plans Adaptive parallel data distribution Shadowprocess is spawned User connects to the database

  16. Adaptive Distribution Methods Hybrid-HASH Join • Cardinality based distribution skew common scenario • Crucial for parallel join of very small data sets with very large data sets • Distribution method decision based on expected number of rows • New adaptive distribution method HYBRID-HASH • Statistic collectors inserted in front of PX process on the left hand side of the join • If actual number of rows less than threshold, switch from HASH to Broadcast • Threshold number of total rows < 2x DOP • Enabled by default

  17. Adaptive Execution Plans # rows returned less than threshold so rows distributed via Broadcast Querycoordinator Hybrid hash join between EMP and DEPT Distribution method based on runtime stats Statistics collector inserted in front of PX processes scanning DEPT Adaptive parallel data distribution Statistics Collector threshold 2X DOP P1 P2 P3 P4 DEPT EMP P5 P6 P7 P8

  18. Adaptive Distribution Methods Broadcast/Round Robin Hybrid hash join between EMP and DEPT Distribution method based on runtime stats If DEPT uses Broadcast- EMP uses Round-Robin DOP used is 6 Number of rows returned is 4 Broadcast used because 24 rows distributed (6 X 4)

  19. Adaptive Query Optimization • Adaptive Query Optimization Adaptive PlansJoin methods Parallel distribution methods Adaptive StatisticsDiscovered at compile time Discovered at run time Overview Adaptive Plans Adaptive Statistics Parallel distribution Methods Join Methods At compile time At run time

  20. Dynamic Statistics • During compilation optimizer decides if statistics are sufficient to generate a good plan or not • Dynamic statistics are used to compensate for missing, stale, or incomplete statistics • They can be used for table scans, index access, joins and group bys • One type of dynamic statistics is dynamic sampling

  21. Dynamic Statistics Dynamic sampling has a new level 11(AUTO) Decision to use dynamic sampling depends on the complexity of predicate, existing statistics and total execution time Dynamic statistics shared among queries Dynamic Sampling

  22. Dynamic Statistics SGA SHARED DYNAMIC STATISTICS Select * Fromproduct_information Wherelist_price-min_price=29 And category_id not in (11,22) And prod_name like ‘Smart%’; Table cardinality estimate for prod_info 5 Optimizer determines plan using dynamic statistics 2 SHARED_POOL Cursor 0: Select * from product_information … SQL statement is submitted 1 4 Dynamic sampling occurs on small number of blocks from table Resulting dynamic statistics are stored in cache Optimizer checks for existing statistics in data dictionary DATA DICTIONARY 3 Statistics found but need to be augmented due to complex predicates

  23. Dynamic Statistics SGA SHARED DYNAMIC STATISTICS • Select supplier_id, prod_name Fromproduct_information Wherelist_price-min_price=29 And category_id not in (11,22) And prod_name like ‘Smart%’; Table cardinality estimate for prod_info 10 Optimizer determines plan using dynamic statistics 7 SHARED_POOL Cursor 0: Select supplier_id, prod_name … Different SQL statement is submitted with same predicates 6 9 Necessary Dynamic statistics found in shared cache Optimizer checks for existing statistics in data dictionary DATA DICTIONARY 8 Statistics found but need to be augmented due to complex predicates

  24. Adaptive Statistics Re-optimization • During execution optimizer estimates are compared to execution statistics • If statistics vary significantly then a new plan will be chosen for subsequent executions based on execution statistics • Re-optimization uses statistics gathered from previous executions • First introduced as Cardinality Feedback in 11.2

  25. Adaptive Statistics Initial execution of a query is monitored Example shows initial plan is hash join between sales and customers Cardinality estimates 8X off Cardinality Feedback pre 12c Initial Cardinality estimates are more than 8X off

  26. Adaptive Statistics Execution Plan after the second execution Execution statistics used to reparse the statement on the second execution New plan shows correct cardinality estimates and a new join order Information learnt is stored in the cursor only and is lost if cursor ages out Cardinality Feedback pre 12c Estimates are now correct

  27. Adaptive Statistics Re-optimization in 12c • Join statistics are also monitored • Works with adaptive cursor sharing for statement with binds • New Column in V$SQL IS_REOPTIMIZABLE • Information found at execution time is persisted as SQL Plan Directives

  28. Adaptive Statistics New column in V$SQL IS_REOPTIMIZABLE Indicates that the statement will be re-parsed on the next execution Re-optimization – indicator in V$SQL

  29. Adaptive Statistics Adapt join method example showed the join change from nested loops to hash join But cursor is also marked IS_REOPTIMIZABLE Why? Re-optimization will occur on second execution because cardinality estimates off Adaptive Plans & Re-optimization working together Initial Cardinality estimates are off

  30. Adaptive Statistics On second execution statement is re-parsed using execution statistics New child cursor created with new plan New plan changes the join order Adaptive Plans & Re-optimization working together

  31. Adaptive Statistics SQL Plan Directives • Directives are additional information used during optimization to generate a better plan • For example, when table T1 is joined to T2 use dynamic statistics to get accurate cardinality estimate • Directives are collected on query expressions not at a statement level • Allows for directives to be used for multiple statements • Persisted on disk in the SYSAUX tablespace • Directives will be automatically maintained • Managed using the new package DBMS_SPD

  32. SQL Plan Directives SGA DIRECTIVE CACHE Directive: Use DS for customers table when column city, country are used Select * FROM customers where state =‘CA’ AND country=‘USA’; 2 SHARED_POOL IS_REOPTIMIZABLE = Y SQL statement is submitted 1 During execution cardinality estimate discovered to be wrong and directive created 3 Optimizer determines plan Cursor 0: Select * from customers …….. SysauxTablespace

  33. SQL Plan Directives SGA DIRECTIVE CACHE Directive: Use DS for customer table when column city, country are used Select * FROM customers where state=‘CA’ AND country=‘USA’; 5 Optimizer determines new plan using execution statistics SHARED_POOL Cursor 0: Select * from customers ….. IS_REOPTIMIZABLE = Y Cursor 1: Select * from customers ….. 4 Same SQL statement is submitted again 6 This execution and subsequent execution use child cursor 1 IS_REOPTIMIZABLE = N Directives periodically persisted to disk in the sysauxtablespace SQL Directives SQL Directives SysauxTablespace Sqlid Binds Stats Extension DIRECTIVE CACHE

  34. SQL Plan Directives SGA DIRECTIVE CACHE Select email FROM customers where state=‘MA’ AND country=‘USA’; 9 Optimizer determines plan with help of directive SHARED_POOL Cursor 0: Select email from customers 7 New SQL statement is submitted on customers table 10 Optimizer adds column group creation for state & country columns to next statistics gather on customer table 8 SQL Directives Optimizer checks for directives on customers table and finds one on the columns state and country SysauxTablespace Sqlid Binds Stats Extension DIRECTIVE CACHE

  35. Summary • Optimizer begins to learn from its experiences • Adaptive Plans • Adapt join methods • Adapt parallel distribution methods • Adaptive Statistics • Dynamic statistics • Re-optimization • SQL Plan Directives

  36. “With an ever-expanding portfolio of tools and applications, Oracle University is strongly oriented toward the individual learner.” 2013 leader inIT EducationAccording toIndustry Analyst Firm IDC “Of the vendors evaluated, Oracle is most adept at describing the value of training at multiple parts of the technology deployment life cycle.” “This decentralized control [of the curricula development] helps maximize the relevance of the training offering to the elements that are most useful to end users.”

  37. One of the World’s Largest Training Organizations Worldwide classrooms: 200+ in 89 countries Courses: 2000+ Languages supported: 28 Classes per week: 600+, 90+ Virtual Classes On Demand Courses: 100 and growing Students trained per year:400,000+ Global education partnerships:500+ Oracle certified professionals: 1.6+ million

  38. Training & Certification for the Entire Stack ROLE-BASED TRAINING Administrators Developers Business Users Architects Implementers WHEN AND HOW YOU NEED IT In-Class Training On Demand Private Live Virtual Class VALIDATED BY INDUSTRY-RECOGNIZED CERTIFICATIONS Oracle Certified Professional Oracle Certified Expert Oracle Certified Associate Oracle Certified Master

  39. More than 2000 Courses Across 50+ Products

  40. 160+ Certification Credentials Available

  41. Contact Oracle University at: Telephone: 800 810 9931 orEmail: OUChina_CN@oracle.com Website: www.oracle.com/cn/education 2013 leader in IT Education education.oracle.com

More Related