1 / 23

ORACLE COST-BASED OPTIMIZER ADVANCED

ORACLE COST-BASED OPTIMIZER ADVANCED. Randolf Geist http://oracle-randolf.blogspot.com info@sqltools-plusplus.org. ABOUT ME. Independent consultant Available for consulting In-house workshops Cost-Based Optimizer Performance By Design Performance Troubleshooting Oracle ACE Director

nerina
Download Presentation

ORACLE COST-BASED OPTIMIZER ADVANCED

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. ORACLE COST-BASED OPTIMIZER ADVANCED Randolf Geist http://oracle-randolf.blogspot.com info@sqltools-plusplus.org

  2. ABOUT ME • Independent consultant • Available for consulting • In-house workshops • Cost-Based Optimizer • Performance By Design • Performance Troubleshooting • Oracle ACE Director • Member of OakTable Network

  3. OPTIMIZER BASICS • Three main questions you should ask when looking for an efficient execution plan: • How much data? How many rows / volume? • How scattered / clustered is the data? • Caching? => Know your data!

  4. OPTIMIZER BASICS • Why are these questions so important? • Two main strategies: • One “Big Job” => How much data, volume? • Few/many “Small Jobs” => How many times / rows?=> Effort per iteration? Clustering / Caching

  5. OPTIMIZER BASICS • Optimizer’s cost estimate is based on: • How much data? How many rows / volume? • How scattered / clustered? (partially) • (Caching?) Not at all

  6. BASICS’ SUMMARY • Cardinality and Clustering determine whether the “Big Job” or “Small Job” strategy should be preferred • If the optimizer gets these estimates right, the resulting execution plan will be efficient within the boundaries of the given access paths • Know your data and business questions

  7. AGENDA • Clustering Factor • Statistics / Histograms • Datatype issues

  8. HOW SCATTERED / CLUSTERED? 1,000 rows => visit 1,000 table blocks: 1,000 * 5ms = 5 s

  9. HOW SCATTERED / CLUSTERED? 1,000 rows => visit 10 table blocks: 10 * 5ms = 50 ms

  10. HOW SCATTERED / CLUSTERED? • There is only a single measure of clustering in Oracle: The index clustering factor • The index clustering factor is represented by a single value • The logic measuring the clustering factor by default does not cater for data clustered across few blocks (ASSM!)

  11. HOW SCATTERED / CLUSTERED? • Challenges • Getting the index clustering factor right • There are various reasons why the index clustering factor measured by Oracle might not be representative • Multiple freelists / freelist groups (MSSM) • ASSM • Partitioning • SHRINK SPACE effects

  12. HOW SCATTERED / CLUSTERED? Re-visiting the same recent table blocks

  13. STATISTICS • Don’t use ANALYZE … COMPUTE / ESTIMATE STATISTICS anymore • Basic Statistics:- Table statistics: Blocks, Rows, Avg Row LenNothing to configure there, always generated- Basic Column Statistics: Low / High Value, Num Distinct, Num Nulls=> Controlled via METHOD_OPT option ofDBMS_STATS.GATHER_TABLE_STATS

  14. STATISTICS Controlling column statistics via METHOD_OPT • If you see FOR ALL INDEXED COLUMNS [SIZE > 1]:Question it! Only applicable if the author really knows what he/she is doing! => Without basic column statistics Optimizer is resorting to hard coded defaults! • Default in previous releases:FOR ALL COLUMNS SIZE 1: Basic column statistics for all columns, no histograms • Default from 10g on:FOR ALL COLUMNS SIZE AUTO: Basic column statistics for all columns, histograms if Oracle determines so

  15. HISTOGRAMS • Basic column statistics get generated along with table statistics in a single pass(almost) • Each histogram requires a separate pass • Therefore Oracle resorts to aggressive sampling if allowed => AUTO_SAMPLE_SIZE • This limits the quality of histograms and their significance

  16. HISTOGRAMS • Limited resolution of 255 value pairs maximum • Less than 255 distinct column values => Frequency Histogram • More than 255 distinct column values =>Height Balanced Histogram • Height Balanced is always a sampling of data, even when computing statistics!

  17. FREQUENCY HISTOGRAMS • SIZE AUTO generates Frequency Histograms if a column gets used as a predicate and it has less than 255 distinct values • Major change in behaviour of histograms introduced in 10.2.0.4 / 11g • Be aware of new “value not found in Frequency Histogram” behaviour • Be aware of edge case of very popular / unpopular values

  18. HEIGHT BALANCED HISTOGRAMS SELECT SKEWED_NUMBER FROM T ORDER BY SKEWED_NUMBER Endpoint Rows Rows 1 1,000 1,000 80,000 120,000 200,000 280,000 240,000 360,000 400,000 160,000 320,000 40,000 10,000,000 rows 100 popular valueswith 70,000 occurrences 250 buckets each covering 40,000 rows (compute)250 buckets each covering approx. 22/23 rows (estimate) 70,000 2,000 2,000 2,000 140,000 3,000 3,000 3,000 210,000 4,000 4,000 280,000 4,000 5,000 5,000 350,000 6,000 … 6,000

  19. HEIGHT BALANCED HISTOGRAMS 7,000,000 1,000 10,000,000 100,000

  20. HEIGHT BALANCED HISTOGRAMS 7,000,000 1,000 10,000,000 100,000

  21. SUMMARY • Check the correctness of the clustering factor for your critical indexes • Oracle does not know the questions you ask about the data • You may want to use FOR ALL COLUMNS SIZE 1 as default and only generate histograms where really necessary • You may get better results with the old histogram behaviour, but not always

  22. SUMMARY • There are data patterns that don’t work well with histograms when generated via Oracle • => You may need to manually generate histograms using DBMS_STATS.SET_COLUMN_STATS for critical columns • Don’t forget about Dynamic Sampling / Function Based Indexes / Virtual Columns / Extended Statistics • Know your data and business questions!

  23. QUESTIONS & ANSWERS Q & A

More Related