Cost model and estimating result sizes
This presentation is the property of its rightful owner.
Sponsored Links
1 / 52

Cost Model and Estimating Result Sizes PowerPoint PPT Presentation


  • 67 Views
  • Uploaded on
  • Presentation posted in: General

Cost Model and Estimating Result Sizes. מודל המחיר Cost Model. בהרצאה הראנו איך לחשב את המחיר של כל שיטה (join) כדי לעשות זאת צריך לדעת את גודל היחסים, שחלקם מתקבלים כתוצאות ביניים לפיכך, יש צורך לחשב את הגודל של תוצאות ביניים עכשיו נסביר איך מעריכים את גודל התוצאה.

Download Presentation

Cost Model and Estimating Result Sizes

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Cost model and estimating result sizes

Cost Modeland Estimating Result Sizes


Cost model

מודל המחירCost Model

  • בהרצאה הראנו איך לחשב את המחיר של כל שיטה(join)

  • כדי לעשות זאת צריך לדעת את גודל היחסים, שחלקם מתקבלים כתוצאות ביניים

  • לפיכך, יש צורך לחשב את הגודל של תוצאות ביניים

  • עכשיו נסביר איך מעריכים את גודל התוצאה


Cost model and estimating result sizes

בחירת תוכנית לחישוב צירוף של שלושה יחסים

  • רוצים לחשב צירוף של שלושת היחסים: Reserves , Sailors ו- Boats

  • שתי האפשרויות (תוך התעלמות מסדר היחסים בפעולת הצירוף הראשונה) הנן:

    (Sailors  Reserves)  Boats

    Sailors  (Reserves  Boats)

  • ההחלטה מהי התוכנית הזולה יותר תלויה בין היתר בשאלה איזה תוצאת ביניים הנה קטנה יותר


Cost model and estimating result sizes

אנליזה של גודל התוצאות

  • צריך להעריך את גודל התוצאה של הצירוף (Sailors  Reserves) לעומת גודל התוצאה של הצירוף (Reserves  Boats)

  • ה- DBMS שומר סטטיסטיקות לגבי היחסים והאינדקסים


Statistics maintained by dbms

Statistics Maintained by DBMS

  • Cardinality: Number of tuples NTuples(R) in each relation R

  • Size: Number of pages NPages(R) in each relation R

  • Index Cardinality: Number of distinct key values NKeys(I) for each index I

  • Index Size: Number of pages INPages(I) in each index I

  • Index Height: Number of non-leaf levels IHeight(I) in each B+ Tree index I

  • Index Range: The minimum value ILow(I) and maximum value IHigh(I) for each index I


N ote

Note

  • The statistics are updated periodically (not every time the underlying relations are modified).

  • We cannot use the cardinality for computing

    select count(*)

    from R


Estimating result sizes

Estimating Result Sizes

SELECT attribute-list

FROM relation-list

WHERE term1 and ... and termn

  • Consider

  • The maximum number of tuples is the product of the cardinalities of the relations in the FROM clause

  • The WHERE clause is associating a reduction factor with each term. It reflects the impact of the term in reducing result size.


Result size

Result Size

  • Estimated result size:

    maximum size

    X

    the product of the reduction factors


Assumptions

Assumptions

  • There is an index I1 on R.Y and index I2 on S.Y

  • Containment of value sets: if NKeys(I1)<NKeys(I2) for attribute Y, then every Y-value of R will be a Y-value of S


Estimating reduction factors

Estimating Reduction Factors

  • column = value: 1/NKeys(I)

    • There is an index I on column.

    • This assumes a uniform distribution.

    • Otherwise, use 1/10.

  • column1 = column2: 1/Max(NKeys(I1),NKeys(I2))

    • There is an index I1 on column1and an index I2 on column2.

    • Containment of value sets assumption

    • If only one column has an index, we use it to estimate the value.

    • Otherwise, use 1/10.


Estimating reduction factors1

Estimating Reduction Factors

  • column > value: (High(I)-value)/(High(I)-Low(I)) if there is an index I on column.


Example

Example

Reserves (sid, agent), Sailors(sid, rating)

SELECT *

FROM Reserves R, Sailors S

WHERE R.sid = S.sid and S.rating > 3 and

R.agent = ‘Joe’

  • Cardinality(R) = 100,000

  • Cardinality(S) = 40,000

  • NKeys(Index on R.agent) = 100

  • High(Index on Rating) = 10, Low = 0


Example cont

Example (cont.)

  • Maximum cardinality: 100,000 * 40,000

  • Reduction factor of R.sid = S.sid: 1/40,000

    • sid is a primary key of S

  • Reduction factor of S.rating > 3: (10–3)/(10-0) = 7/10

  • Reduction factor of R.agent = ‘Joe’: 1/100

  • Total Estimated size: 700


Database tuning

Database Tuning


Database tuning1

Database Tuning

  • Problem: Make database run efficiently

  • 80/20 Rule: 80% of the time, the database is running 20% of the queries

    • find what is taking all the time, and tune these queries


Solutions

Solutions

  • Indexing

    • this can sometimes degrade performance. why?

  • Tuning queries

  • Reorganization of tables; perhaps "denormalization"

  • Changes in physical data storage


Denormalization

Denormalization

  • Suppose you have tables:

    • emp(eid, ename, salary, did)

    • dept(did, budget, address, manager)

  • Suppose you often ask queries which require finding the manager of an employee. You might consider changing the tables to:

    • emp(eid, ename, salary, did, manager)

    • dept(did, budget, address, manager)

      - in emp, there is an fd did -> manager. It is not 3NF!


Denormalization cont d

Denormalization (cont’d)

  • How will you ensure that the redundancy does not introduce errors into the database?


Creating indexes using oracle

Creating Indexes Using Oracle


Index

Index

  • Map between

    • a key of a row

    • the location of the data on the row

  • Oracle has two kinds of indexes

    • B+ tree

    • Bitmap

  • Sorted


B tree

Root

30

13

17

24

39*

3*

5*

19*

20*

22*

24*

27*

38*

2*

7*

14*

16*

29*

33*

34*

B+ tree


Creating an index

Creating an Index

  • Syntax:

    create [bitmap] [unique] index index on table(column [,column] . . .)


Unique indexes

Unique Indexes

create unique index rating_bit on Sailors(rating);

  • Create an index that will guarantee the uniqueness of the key. Fail if any duplicate already exists.

  • When you create a table with a

    • primary key constraint or

    • unique constraint

      a "unique" index is created automatically


Bitmap indexes

Bitmap Indexes

  • Appropriate for columns that may have very few possible values

  • For each value c that appears in the column, a vector v of bits is created, with a 1 in v[i] if the i-th row has the value c

    • Vector length = number of rows

  • Oracle can automatically convert bitmap entries to RowIDs during query processing


Bitmap indexes example

Bitmap Indexes: Example

create bitmap index rating_bit on Sailors(rating);

  • Corresponding bitmaps:

    • 3: <1 0 0 1>

    • 7: <0 1 0 0>

    • 10: <0 0 1 0>


When to create an index

When to Create an Index

  • Large tables, on columns that are likely to appear in where clauses as a simple equality

  • where s.sname = ‘John’ and s.age = 50

  • where s.age = r.age


Function based indexes

Function-Based Indexes

  • You can't use an index on sname for the following query:

    select *

    from Sailors

    where UPPER(sname) = 'SAM';

  • You can create a function-based index to speed up the query:

    create index upp_sname on Sailors(UPPER(sname));


Index organized tables

Index-Organized Tables

  • An index organized table keeps its data sorted by the primary key

  • Rows do not have RowIDs

  • They store their data as if they were an index

    create table Sailors(

    sid number primary key,

    sname varchar2(30),

    age number,

    rating number)

    organization index;


Index organized tables 2

Index-Organized Tables (2)

  • What advantages does this have?

    • Enforce uniqueness: primary key

    • Improve performance

  • What disadvantages?

    • expensive to add column, dynamic data

  • When to use?

    • where clause on the primary key

    • static data


Clustering tables together

Clustering Tables Together

  • You can ask Oracle to store several tables close together on the disk

  • This is useful if you usually join these tables together

  • Cluster: area in the disk where the rows of the tables are stored

  • Cluster key: the columns by which the tables are usually joined in a query


Clustering tables together syntax

Clustering Tables Together: Syntax

  • create cluster sailor_reserves (X number);

    • Create a cluster with nothing in it

  • create table Sailors(

    sid number primary key,

    sname varchar2(30),

    age number,

    rating number)

    cluster sailor_reserves(sid);

    • create the table in the cluster


Clustering tables together syntax cont

Clustering Tables Together: Syntax (cont.)

  • create index sailor_reserves_index on cluster sailor_reserves

    • Create an index on the cluster

  • create table Reserves(

    sid number,

    bid number,

    day date,

    primary key(sid, bid, day) )

    cluster sailor_reserves(sid);

    • A second table is added to the cluster


The oracle optimizer

The Oracle Optimizer


Types of optimizers

Types of Optimizers

  • There are different modes for the optimizer

  • RULE: Rule-based optimizer (RBO)

    • deprecated

  • CHOOSE: Cost-based optimizer (CBO); picks a plan based on statistics (e.g. number of rows in a table, number of distinct keys in an index)

    • Need to analyze the data in the database using analyze command

ALTER SESSION SET optimizer_mode = {choose|rule|first_rows(_n)|all_rows}


Types of optimizers1

Types of Optimizers

  • ALL_ROWS: execute the query so that all of the rows are returned as quickly as possible

    • Merge join

  • FIRST_ROWS(n): execute the query so that all of the first n rows are returned as quickly as possible

    • Block nested loop join


Analyzing the data

Analyzing the Data

analyze table | index

<table_name> | <index_name>

compute statistics |

estimate statistics [sample <integer>

rows | percent] |

delete statistics;

analyze table Sailors estimate statistics sample 25 percent;


Viewing the execution plan option 1

Viewing the Execution Plan(Option 1)

  • You need a PLAN_TABLE table. So, the first time that you want to see execution plans, run the command:

  • Set autotrace on to see all plans

    • Display the execution path for each query, after being executed

@$ORACLE_HOME/rdbms/admin/utlxplan.sql


Viewing the execution plan option 2

Viewing the Execution Plan (Option 2)

explain plan set statement_id=‘<name>’

for <statement>

  • Another option:

explain plan

set statement_id='test'

for

SELECT *

FROM Sailors S

WHERE sname='Joe';

Select Plan_Table


Operations that access tables

Operations that Access Tables

  • TABLE ACCESS FULL: sequential table scan

    • Oracle optimizes by reading multiple blocks

    • Used whenever there is no where clause on a query

      select * from Sailors

  • TABLE ACCESS BY ROWID: access rows by their RowID values.

    • How do you get the rowid? From an index!

      select * from Sailors where sid > 10


Types of indexes

Types of Indexes

  • Unique: each row of the indexed table contains a unique value for the indexed column

  • Nonunique: the row’s indexed values can repeat


Operations that use indexes

Operations that Use Indexes

  • INDEX UNIQUE SCAN: Access of an index that is defined to be unique

  • INDEX RANGE SCAN: Access of an index that is not unique or access of a unique index for a range of values


When are indexes used not used

When are Indexes Used/Not Used?

  • If you set an indexed column equal to a value, e.g., sname = 'Jim'

  • If you specify a range of values for an indexed column, e.g., sname like 'J%'

    • sname like '%m': will not use an index

    • UPPER(sname) like 'J%' : will not use an index

    • sname is null: will not use an index, since null values are not stored in the index

    • sname is not null: will not use an index, since every value in the index would have to be accessed


When are indexes used cont

When are Indexes Used? (cont)

  • 2*age = 20: Index on age will not be used. Index on 2*age will be used.

  • sname != 'Jim': Index will not be used.

  • MIN and MAX functions: Index will be used

  • Equality of a column in a leading column of a multicolumn index. For example, suppose we have a multicolumn index on (sid, bid, day)

    • sid = 12: Can use the index

    • bid = 101: Cannot use the index


When are indexes used cont1

When are Indexes Used? (cont)

  • If the index is selective

    • A small number of records are associated with each distinct column value


Hints

Hints


Hints1

Hints

  • You can give the optimizer hints about how to perform query evaluation

  • Hints are written in /*+ */ right after the select

  • Note: These are only hints. The oracle optimizer can choose to ignore your hints


Hints2

Hints

  • FULL hint: tell the optimizer to perform a TABLE ACCESS FULL operation on the specified table

  • ROWID hint: tell the optimizer to perform a TABLE ACCESS BY ROWID operation on the specified table

  • INDEX hint: tells the optimizer to use an index-based scan on the specified table


Examples

Examples

Select /*+ FULL (sailors) */ sid

From sailors

Where sname=‘Joe’;

Select /*+ INDEX (sailors) */ sid

From sailors

Where sname=‘Joe’;

Select /*INDEX (sailors s_ind) */ sid

From sailors S, reserves R

Where S.sid=R.sid AND sname=‘Joe’;


  • Login