1 / 13

Query Optimization and Perspectives

Query Optimization and Perspectives. March 12 th , 2003. Administration. Exam next Wednesday, 2:30pm. Special office hours next week will be announced. Problem. Given: a query R1 R2 … Rn Assume we have a function cost() that gives us the cost of every join tree

heller
Download Presentation

Query Optimization and Perspectives

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Query Optimization and Perspectives March 12th, 2003

  2. Administration • Exam next Wednesday, 2:30pm. • Special office hours next week will be announced.

  3. Problem • Given: a query R1 R2 … Rn • Assume we have a function cost() that gives us the cost of every join tree • Find the best join tree for the query

  4. Dynamic Programming • For each subquery Q ⊆ {R1, …, Rn} compute the following: • Size(Q) • A best plan for Q: Plan(Q) • The cost of that plan: Cost(Q)

  5. Dynamic Programming • Step 1: For each {Ri} do: • Size({Ri}) = B(Ri) • Plan({Ri}) = Ri • Cost({Ri}) = (cost of scanning Ri)

  6. Dynamic Programming • Step i: For each Q ⊆ {R1, …, Rn} of cardinality i do: • Compute Size(Q) • For every pair of subqueries Q’, Q’’ s.t. Q = Q’ U Q’’compute cost(Plan(Q’) Plan(Q’’)) • Cost(Q) = the smallest such cost • Plan(Q) = the corresponding plan

  7. Dynamic Programming • Return Plan({R1, …, Rn})

  8. Dynamic Programming • Summary: computes optimal plans for subqueries: • Step 1: {R1}, {R2}, …, {Rn} • Step 2: {R1, R2}, {R1, R3}, …, {Rn-1, Rn} • … • Step n: {R1, …, Rn} • We used naïve size/cost estimations • In practice: • more realistic size/cost estimations • heuristics for Reducing the Search Space • Restrict to left linear trees • Restrict to trees “without cartesian product” • need more than just one plan for each subquery: • “interesting orders”

  9. Plan with Materialization HashTable  Srepeat read(R, x) y  join(HashTable, x) write(V1, y) HashTable  T repeat read(V1, y) z  join(HashTable, y) write(V2, z) HashTable  Urepeat read(V2, z) u  join(HashTable, z) write(Answer, u) ⋈ V2 ⋈ U V1 ⋈ T R S

  10. pipeline Plan with Pipelining ⋈ HashTable1  SHashTable2  T HashTable3  Urepeat read(R, x) y  join(HashTable1, x) z  join(HashTable2, y) u  join(HashTable3, z) write(Answer, u) ⋈ U ⋈ T R S

  11. Key Lessons in Optimization • There are many approaches and many details to consider in query optimization • Classic search/optimization problem! • Not completely solved yet! • Main points to take away are: • Algebraic rules and their use in transformations of queries. • Deciding on join ordering: System-R style (Selinger style) optimization. • Estimating cost of plans and sizes of intermediate results.

  12. The Course in Perspective • The relational data model, SQL • Views, updates, transactions • Conceptual design • Expressing the constraints on the domain • Using them to get good schema designs • XML • A format for data exchange. Has its own query language. • Basis for web services.

  13. Perspective (continued) • Building a database system: • Storage and indexing • Query execution (join algorithms) • Query optimization • Data integration and sharing: • Databases were created independently • Schemas need to be matched • Need to access multiple databases in one query. • Interoperability through web services.

More Related