
Self-Tuning Database Systems


Presentation Transcript


  1. Self-Tuning Database Systems Wang Haocong Jan 8, 2009

  2. Tuning databases • Logical database design • Physical database design (indexes) • Memory, disks • Materialized Views • Partitioning

  3. Self-Tuning Systems • Databases are complicated! • Schema design is hard • Lots of “knobs” to tweak • Need appropriate information • Does the DB approach give us more ability to “self-tune” than some other approach (e.g., Java)?

  4. What Would We Like to Auto-Tune? • Query optimization – statistics, bad decisions, … • The schema itself? • Indices • Auxiliary materialized views • Data partitioning • Perhaps logging?

  5. What Are The Challenges in Building Adaptive Systems? • Really, a generalization of those in adaptive query processing • Information gathering – how do we get it? • Extrapolating – how do we do this accurately and efficiently? • Sampling or piloting • Minimizing the impact of mistakes if they happen • Using app-specific knowledge

  6. Who’s Interested in these Problems? • Oracle: • Materialized view “wizard” • Microsoft “AutoAdmin”: • Index selection, materialized view selection • Stats on materialized views • Database layout • IBM SMART (Self-Managing And Resource Tuning): • Histogram tuning (“LEO” learning optimizer) • Partitioning in clusters • Index selection • Adaptive query processing

  7. A Particular Instance: Microsoft’s Index Tuning Wizard • Why not let the system choose the best index combination(s) for a workload? • The basic idea: • Log a whole bunch of queries that are frequently run • See what set of indices is best • Why is this hard? Why not index everything? • Create these indices with little or no human input

  8. Possible Approaches • Obviously: only consider indices that would be useful • The optimizer can “tell” which indices it might use in executing a query • But that continues to be a lot of indices! • Can exhaustively compare all possible indices • Note that indices can interact (esp. for updates) • How do we compare costs and benefits of indices? • Execute for real • Use optimizer cost model with whatever stats we have • Gather some stats (e.g., build histograms, sample) and use cost model
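
To make the cost/benefit comparison above concrete, here is a minimal sketch of optimizer-based evaluation, assuming a hypothetical hook estimate_cost(query, config) that asks the optimizer's cost model to price a query as if the indices in config were materialized (nothing is actually built):

```python
# Hedged sketch: rank candidate indices by estimated benefit over a workload,
# using a hypothetical optimizer hook estimate_cost(query, config) that
# prices a query plan as if the indices in `config` existed.

def rank_candidates(workload, candidates, estimate_cost):
    baseline = sum(estimate_cost(q, frozenset()) for q in workload)
    benefits = {}
    for idx in candidates:
        with_idx = sum(estimate_cost(q, frozenset({idx})) for q in workload)
        benefits[idx] = baseline - with_idx   # estimated saving of idx alone
    # Caveat: this ignores interactions between indices (e.g., shared update
    # overhead); the greedy search on the later slides addresses that.
    return sorted(benefits, key=benefits.get, reverse=True)
```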

  9. Physical design

  10. SQL Server Architecture

  11. Their Approach in More Detail • For a workload of n queries: • Generate a separate workload for each query • Evaluate the candidate indices for that query to find its best “configuration” – limited to 2 indices, 2 tables, single joins • The candidate index set for the whole workload is the union of these configurations • Too expensive to enumerate all subsets; use a greedy algorithm (sketched below): • Exhaustively enumerate (using the optimizer) the best m-index configuration • Pick a new index I to add: the one that saves the most cost relative to any alternative I′ and to the current configuration • Repeat until we’ve added “enough” (k) indices • “Despite interaction among indices, the largest cost reductions often result from indices that are good candidates by themselves” • They iteratively expand to 2-column indices – a 2-column index is considered only if the index on its leading column was itself desirable
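
A minimal sketch of the greedy search described on this slide, again assuming the hypothetical estimate_cost(workload, config) optimizer hook; the seed size m, budget k, and all names are illustrative, not Microsoft's actual implementation:

```python
# Minimal sketch of the greedy candidate-index selection described above.
from itertools import combinations

def greedy_index_selection(workload, candidates, estimate_cost, m=2, k=8):
    # Seed: exhaustively find the best m-index configuration via the optimizer.
    seed = min(combinations(candidates, m),
               key=lambda cfg: estimate_cost(workload, set(cfg)))
    config = set(seed)
    # Grow greedily: add the index with the largest estimated cost saving.
    while len(config) < k:
        current = estimate_cost(workload, config)
        gains = {idx: current - estimate_cost(workload, config | {idx})
                 for idx in candidates if idx not in config}
        if not gains:
            break
        best = max(gains, key=gains.get)
        if gains[best] <= 0:   # no remaining index reduces estimated cost
            break
        config.add(best)
    return config
```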

  12. Further Enhancements • Use the tool for “what-if” analysis • What if a table grows by a substantial amount? • Supplement with extra info gathered from real query execution • Maybe we can “tweak” estimates for certain selectivities • An attempt to compensate for the “exponential error” problem
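
A hedged sketch of the "what if a table grows" analysis: scale the row count in a copy of the statistics and re-run the same cost estimate. The stats layout and the estimate_cost(workload, config, stats) hook are illustrative assumptions:

```python
# Hedged "what-if" sketch: grow a table's statistics and compare estimated
# workload cost before and after. All helper names are assumptions.
import copy

def what_if_table_grows(workload, config, stats, table, factor, estimate_cost):
    before = estimate_cost(workload, config, stats)
    grown = copy.deepcopy(stats)
    grown[table]["rows"] *= factor        # hypothetical stats representation
    after = estimate_cost(workload, config, grown)
    return after / before                 # > 1 means the design would degrade
```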

  13. Physical tuning tool • Decide when to tune • Decide on a “representative” workload • Run the tool and examine the recommended physical design changes • Implement them if appropriate

  14. Alternative Tuning Models • Alerter – a lightweight tool that decides when to tune (sketch below) • Workload as a Sequence – treat the workload as an ordered sequence of read/update queries interleaved with create/drop operations on physical structures
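
A sketch of what an alerter's core test might look like, assuming the server can cheaply expose the current estimated workload cost and an upper bound on the cost under a better design; the names and the 20% threshold are illustrative:

```python
# Sketch of an alerter's core test. It only signals *when* tuning looks
# worthwhile; the heavyweight tuning tool is run separately. Both cost
# inputs are assumed to be cheaply available estimates.

def should_alert(current_cost, best_known_cost, threshold=0.20):
    # Alert when at least `threshold` of the current cost could be saved.
    if current_cost <= 0:
        return False
    improvement = (current_cost - best_known_cost) / current_cost
    return improvement >= threshold
```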

  15. Dynamic tuning

  16. Dynamic tuning • Low overhead; must not interfere with the normal functioning of the DBMS • Balance the cost of transitioning between physical designs against the potential benefits • Avoid unwanted oscillations (see the sketch below)
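
One plausible form of such a transition rule, with a safety margin to damp oscillation; all inputs are hypothetical estimates produced by the tuner:

```python
# Switch designs only when projected benefit over a horizon clearly
# outweighs the one-time transition cost. All inputs are tuner estimates.

def worth_transitioning(benefit_per_query, queries_expected,
                        transition_cost, margin=2.0):
    projected = benefit_per_query * queries_expected
    # Require the benefit to exceed the cost by `margin`x, so small, noisy
    # estimates don't flip the physical design back and forth.
    return projected >= margin * transition_cost
```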

  17. Impact • Tuning Large Workloads – partition or sample the workload to keep tuning tractable • Tuning Production Servers – run the tuner against a test server instead of the production machine

  18. Future Directions • Ability to compare the quality of automated physical design solutions • Lightweight approaches • Machine learning techniques, control theory, and online algorithms

  19. Memory tuning in DB2 • Innovative cost-benefit analysis • Simulation technique vs. modeling • Tunes both the memory distribution and the total memory usage • Simple greedy memory tuner (sketch below) • Control algorithms to avoid oscillations • Performs very well in experiments • For both OLTP and DSS workloads
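
A sketch of a greedy memory tuner in the spirit of this slide: repeatedly shift memory from the consumer with the lowest estimated marginal benefit per page to the one with the highest. The benefit function would come from simulation (e.g., SBPX-style timing); everything here is an illustrative assumption, not DB2's implementation:

```python
# Greedy memory tuner sketch: move memory, one increment at a time, from
# the lowest-marginal-benefit consumer to the highest. benefit(name, pages)
# is a hypothetical per-page benefit estimate at the current pool size.

def rebalance(pools, benefit, step=1000, min_gain=1e-6):
    """pools: dict name -> pages; benefit: fn(name, pages) -> seconds/page."""
    while True:
        gains = {name: benefit(name, pages) for name, pages in pools.items()}
        donor = min(gains, key=gains.get)
        taker = max(gains, key=gains.get)
        # Stop when the spread is negligible or the donor can't give more.
        if gains[taker] - gains[donor] <= min_gain or pools[donor] <= step:
            return pools
        pools[donor] -= step
        pools[taker] += step
```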

  20. SBPX Operation (flow diagram; components: Buffer Pool, SBPX, Disk) • 1. Victimize page (move to SBPX) • 2. Load new page from disk • 3. Page request • 4. Check buffer pool • 5. Check SBPX • 6. Start timer • 7. Victimize BP page (send to SBPX) • 8. Load page from disk • 9. Stop timer
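
A sketch of the SBPX idea: the extension stores only the identifiers of pages recently evicted from the real buffer pool, and on a buffer-pool miss that hits the SBPX it times the disk read, accumulating an estimate of what a larger buffer pool would have saved. The LRU structure and the read_from_disk callback are illustrative assumptions:

```python
# SBPX sketch: track evicted page ids; time disk reads for pages that a
# larger buffer pool would have retained (steps 6-9 in the diagram above).
import time
from collections import OrderedDict

class SimulatedBufferPoolExtension:
    def __init__(self, capacity):
        self.capacity = capacity
        self.page_ids = OrderedDict()   # LRU order of evicted page ids
        self.saved_seconds = 0.0        # estimated benefit of more memory

    def record_eviction(self, page_id):        # steps 1 and 7: victimize
        self.page_ids[page_id] = None
        self.page_ids.move_to_end(page_id)
        if len(self.page_ids) > self.capacity:
            self.page_ids.popitem(last=False)  # drop the oldest identifier

    def read_on_miss(self, page_id, read_from_disk):
        hit_in_sbpx = page_id in self.page_ids  # step 5: check SBPX
        if hit_in_sbpx:
            del self.page_ids[page_id]
            start = time.monotonic()            # step 6: start timer
        page = read_from_disk(page_id)          # step 8: load from disk
        if hit_in_sbpx:
            self.saved_seconds += time.monotonic() - start  # step 9: stop
        return page
```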

  21. Experimental results – tuning a static workload (chart; axis: BP Size)

  22. Experimental results – workload shift (chart; y-axis: Time in seconds, x-axis: Order of execution, 1–34; per-phase averages of 6206 s, 2285 s, and 959 s; annotations: “Reduce 63%”, “Some Indexes Dropped”)

  23. Thank you!
