1 / 21

Examples of Physical Query Plan Alternatives

This article explores various physical query plan alternatives selected from chapters 12, 14, and 15, focusing on query optimization and evaluating costs.

Download Presentation

Examples of Physical Query Plan Alternatives

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Examples of Physical Query Plan Alternatives Selected Material from Chapters 12, 14 and 15

  2. Query Optimization • NOTE: SQL provides many ways to express a query. • HENCE: System has many options for evaluating a query. • Optimizer is important for query performance: • Generates alternative plans • Chooses plan with least estimated cost. • Ideally, find best plan. • Realistically, consistently find a quite good one.

  3. A Query Evaluation Plan • An extended relational algebra tree • Annotations at each node indicate: • access methods to use for each table. • implementation methods used for each relational operator.

  4. sname rating > 5 bid=100 sid=sid Sailors Reserves (On-the-fly) sname (On-the-fly) rating > 5 bid=100 (Simple Nested Loops) sid=sid Sailors Reserves A Query Evaluation Plan

  5. Query Optimization • Multi-operator Queries: Pipelined Evaluation • On-the-fly: The result of one operator is pipelined to another operator without creating a temporary table to hold intermediate result, called on-the-fly. • Materialized : Otherwise, intermediate results must be materialized before the next operator can access it. C B A

  6. Alternative Plans: Schema Examples Reserves (sid: integer, bid: integer, day: dates, rname: string) Sailors (sid: integer, sname: string, rating: integer, age: real) • Reserves: • Each tuple is 40 bytes long, • 100 tuples per page, • 1000 pages. • Sailors: • Each tuple is 50 bytes long, • 80 tuples per page, • 500 pages.

  7. sname rating > 5 bid=100 sid=sid Sailors Reserves Alternative Plans: Motivating Example SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 RA Tree:

  8. sname rating > 5 bid=100 sid=sid Sailors Reserves (On-the-fly) sname (On-the-fly) rating > 5 bid=100 (Simple Nested Loops) sid=sid Sailors Reserves RA Tree: SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 Costs : 1. Scan Sailors : • For each page of Sailors, scan Reserves • 500+500*1000 I/Os • Or, 2. Scan Reserves • For each page of Reserves, scan Sailors • 1000+1000 * 500 I/Os Plan:

  9. (On-the-fly) sname (On-the-fly) rating > 5 bid=100 (Simple Nested Loops) sid=sid Sailors Reserves Alternative Plans: Motivating Example SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 • Cost: 500+500*1000 I/Os • Typically, bad plan ! • Reasons : • selections could be `pushed’ earlier, • no use made of indexes • Goal of optimization: Find more efficient plan Plan:

  10. sname bid=100 rating > 5 sid=sid Sailors Reserves Alternative Plans - 2 -- No Indexes • Main idea : push selects down.

  11. (On-the-fly) sname (Sort-Merge Join) sid=sid (Scan; (Scan; write to write to rating > 5 bid=100 temp T2) temp T1) Reserves Sailors Alternative Plans - 2 -- No Indexes • Main idea : push selects down.

  12. (On-the-fly) sname (Sort-Merge Join) sid=sid (Scan; (Scan; write to write to rating > 5 bid=100 temp T2) temp T1) Reserves Sailors Alternative Plan - 2 • With 5 buffer pages, cost of plan: • Scan Reserves (1000) + write temp T1 (if we have 100 boats, uniform distribution then it is : 10 pages,). • Scan Sailors (500) + write temp T2 ( if we have 10 ratings then it is : 250 pages). • Sort T1 (2*2*10), sort T2 (2*4*250), merge (10+250) • Total: 4060 page I/Os.

  13. (On-the-fly) sname (Sort-Merge Join) sid=sid (Scan; (Scan; write to rating > 5 write to bid=100 temp T2) temp T1) Reserves Sailors Alternative Plans - 2 temp T1 temp T2 • With 5 buffer pages, • Scanning and filtering: • 1010 + 750 IOs • Optimization1: block nested loops join: • join cost = 10+4*250, total cost = 2770. • Optimization2: `push’ projections: • T1 only sid, 10/4=[2.5]=3;T1 fits in 3 pages, • T2 only sid and sname, 250/2=125 pages • cost of BNL drops to 125 IOs, • Total cost < 2000 IOs

  14. sname sid=sid rating > 5 bid=100 Reserves Sailors Alternative Plan : Using Indices? Push Selections Down ? What Indices help here? • Index on Reserves.bid? • Index on Sailors.rating? • Index on Sailors.sid? • Index on Reserves.sid?

  15. (On-the-fly) sname (On-the-fly) rating > 5 sid=sid (Use hash Sailors bid=100 index; do not write result to temp) Reserves Example Plan : With Index • With index on Reserves.bid : Assume 100 bid values. Assume 100,000 tuples. Assume 100 tuples/disk • We get 100,000/100 = 1000 tuples • On 1000/100 = 10 disk pages. • If index clustered, Cost = 10 I/Os.

  16. (On-the-fly) sname (On-the-fly) rating > 5 sid=sid (Use hash Sailors bid=100 index; do not write result to temp) Reserves Example Plan : With Index • With index on Reserves.bid : Assume 100 bid values. Assume 100,000 tuples. Assume 100 tuples/disk • We get 100,000/100 = 1000 tuples • On 1000/100 = 10 disk pages. • If index clustered, Cost = 10 I/Os.

  17. (On-the-fly) sname (On-the-fly) rating > 5 (Index Nested Loop Join, with pipelining ) sid=sid (Use hash Sailors bid=100 index; do not write result to temp) Reserves Example Plan : Use Another Index • Index on Sailors? Which? • Selection on Sailors may reduce number of tuples considered in join. • But then requires us to materialize the Sailor tuples again ?

  18. (On-the-fly) sname (On-the-fly) rating > 5 (Index Nested Loops, with pipelining ) sid=sid (Use hash Sailors bid=100 index; do not write result to temp) Reserves Index Nested Loop with Pipelining: • Outer is not materialized • Projecting out unnecessary fields from outer doesn’t help

  19. (On-the-fly) sname (On-the-fly) rating > 5 (Index Nested Loops, with pipelining ) sid=sid Sailors bid=100 Reserves Example Plan Continued • Index on Sailors.sid : • sid is key for Sailors. • At most one matching tuple, • unclustered on sid is OK. • Cost? • For each Reserves tuples (1000): • get matching Sailors tuple (1.2 I/O). • So total 1200 + 10 IOs.

  20. (On-the-fly) sname (On-the-fly) rating > 5 (Index Nested Loops, with pipelining ) sid=sid (Use hash Sailors bid=100 index; do not write result to temp) Reserves Alternative Plan : With Second Index • Selection Push down? • Push (rating>5) before join ? • Answer: • No, because of availability of sid index on Sailors. • Reason : • No index on selection result. • Then lookup requires scan Sailors.

  21. Summary • A query is evaluated by converting it to a tree of operators and evaluating the operators in the tree. • There are alternative evaluation algorithms for each relational operator. • Query evaluation must compare alternative plans based on their estimated costs • Must understand query optimization to understand performance impact of a given database design on a query workload

More Related