tuning database configuration parameters with ituned n.
Skip this Video
Loading SlideShow in 5 Seconds..
Tuning Database Configuration Parameters with iTuned PowerPoint Presentation
Download Presentation
Tuning Database Configuration Parameters with iTuned

Loading in 2 Seconds...

play fullscreen
1 / 30

Tuning Database Configuration Parameters with iTuned - PowerPoint PPT Presentation

  • Uploaded on

Tuning Database Configuration Parameters with iTuned. Vamsidhar Thummala Collaborators: Songyun Duan, Shivnath Babu Duke University. Performance Tuning of Database Systems. Physical design tuning Indexes [SIGMOD’98, VLDB’04] Materialized views [SIGMOD’00, VLDB’04]

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Tuning Database Configuration Parameters with iTuned' - delilah

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
tuning database configuration parameters with ituned

Tuning Database Configuration Parameters with iTuned

Vamsidhar Thummala

Collaborators: Songyun Duan, Shivnath Babu

Duke University

performance tuning of database systems
Performance Tuning of Database Systems
  • Physical design tuning
    • Indexes [SIGMOD’98, VLDB’04]
    • Materialized views [SIGMOD’00, VLDB’04]
    • Partitioning [SIGMOD’82, SIGMOD’88, SIGMOD’89]
  • Statistics tuning [ICDE’00, ICDE’07]
  • SQL Query tuning [VLDB’04 ]
  • Configuration parameter or Server parameter tuning

[This talk]

database configuration parameters
Database Configuration Parameters
  • Parameters that control
    • Memory distribution
      • shared_buffers, work_mem
    • I/O optimization
      • wal_buffers, checkpoint_segments, checkpoint_timeout, fsync
    • Parallelism
      • max_connections
    • Optimizer’s cost model
      • effective_cache_size, random_page_costdefault_statistics_target, enable_indexscan
need for automated configuration parameter tuning 2 2
Need for Automated Configuration Parameter Tuning (2/2)
  • Number of threads related to configuration parameter tuning vs. other under “PostgreSQL performance” mailing list

Recently, there has been some effort from community to summarize the important parameters [PgCon’08]

typical approach trial and error
Typical Approach: Trial and Error



All the values seem quite reasonable to me. What about the _costvariables? I guess one or more queries are evaluated using a different execution plan, probably sequential scan instead of index scan, hash join instead of merge join, or something like that. Try to log the "slow" statements - see "log_min_statement_duration". That might give you slow queries (although not necessarily the ones causing problems), and you can analyze them. What is the general I/O activity? Is there a lot of data read/written to the disks, is there a lot of I/O wait?

PS: Was the database analyzedrecently?



Hi, list. I've just upgraded pgsql from 8.3 to 8.4. I've used pgtune before and everything worked fine for me. And now i have ~93% cpuload. Here's changed values of config:

default_statistics_target= 50

maintenance_work_mem = 1GB

constraint_exclusion = on

checkpoint_completion_target= 0.9

effective_cache_size = 22GB

work_mem = 192MB

wal_buffers= 8MB

checkpoint_segments= 16

shared_buffers = 7680MB

max_connections = 80

My box is Nehalem 2xQuad 2.8 with RAM32Gb, and there's only postgresql working on it.

What parameters I should give more attention on?

*http://archives.postgresql.org/pgsql-performance/2009-07/msg00323.php, 30th Jul 2009

doing experiments to understand the underlying response surface
Doing Experiments to Understand the Underlying Response Surface
  • TPC-H 4 GB database, 1 GB memory, Query 18
  • Large number of configuration parameters
    • Total ~ 100
    • 10-15 are important depending on OLTP vs. OLAP
    • Brute-Force will not work
      • Results in exponential number of experiments
  • Parameters can have complex interactions
    • Sometimes non-monotonic and counterintuitive
    • Limits the one-parameter-at-a-time approach
  • No holistic configuration tuning tools
    • Existing techniques focus on specific memory related parameters or recommend default settings
our solution ituned
Our Solution: iTuned
  • Practical tool that uses planned experiments to tune configuration parameters
    • An adaptive sampling algorithmto plan the sequence of experiments (Planner)
    • A novel workbench for conducting experiments in enterprise systems (Executor)
    • Features for scalability like sensitivity analysis and use of parallel experiments to reduce
      • Total number of experiments
      • Per experiment running time/cost
outline of the talk
Outline of the talk
  • iTuned Planner
  • iTuned Executor
  • Evaluation
  • Conclusion
problem abstraction
Problem Abstraction
  • Given
    • A database D and workload W
    • Configuration Parameter Vector X = <x1, x2, …, xm >
    • Cost Budget R
    • Goal: Find high performance setting X* subject to the budget constraint
    • Problem: Response surface y = (X) is unknown
  • Solution Approach:
    • Conduct experiments to learn about the response surface
    • Each experiment has some cost and gives sample <Xi, yi >
ituned planner
iTuned Planner
  • Uses an adaptive sampling algorithm

Stopping Criteria:

Based on cost budget, R



  • Sequential Sampling:
  • Select NEXT experiment, XNEXT based on previous samples
    • Calculate the improvement, IP(X) of each candidate sample and select the sample with highest improvement as XNEXT

Boot Strapping:

Conduct initial set of experiments

Latin Hypercube Sampling

k-Furthest First

improvement of an experiment
Improvement of an Experiment
  • Improvement IP(X) is defined as:
    • y(X*) – y(X) if y(X) < y(X*)
    • 0 otherwise
  • Issue: IP(X) is known only after y(X) is known, i.e., an experiment has to be conducted at X to measure y(X)
  • We estimate IP(X) by calculating the Expected Improvement, EIP(X)
  • To calculate EIP(X), we need to approximate

Improvement at each configuration setting

Probability density function of

(Uncertainty estimate)

conducting experiment at x next using expected improvement


Conducting Experiment at XNEXT using Expected Improvement

Projection on 1D

Conduct NEXT experiment here

generating pdf through gaussian process
Generating pdf through Gaussian Process
  • We estimate the performance as
    • Where is a regression model, is the residual of the model, captured through Gaussian Process
  • Gaussian Process, captures the uncertainty of the surface
    • is specified by mean and covariance functions
    • We use zero-mean Gaussian process
    • Covariance is a kernel function that inversely depends on the distance between two samples Xi and Xj
      • Residuals at nearby points exhibit higher correlation
calculating expected improvement using gaussian process
Calculating Expected Improvement using Gaussian Process
  • Lemma: Gaussian Process models as a uni-variate Gaussian with mean and variance as
  • Theorem: There exists a closed form for EIP(X)
  • [See paper for proof and details]
tradeoff between exploration vs exploitation
Tradeoff between Exploration vs. Exploitation
  • Settings X with high EIP are either
    • Close to known good settings
      • Assists in exploitation
    • In highly uncertain regions
      • Assists in exploration


Gaussian Process tries to achieve the balance between exploration vs. exploitation

outline of the talk1
Outline of the talk
  • iTuned Planner
  • iTuned Executor
  • Evaluation
  • Conclusion
goal of the executor
Goal of the Executor
  • To conduct experiments
    • Without impacting production system
    • As close to real production runs as possible
  • Traditional choices
    • Production system itself
      • May impact running applications
    • Test system
      • Hard to replicate exact production settings
      • Manual set-up
ituned executor
iTuned Executor
  • Exploits the underutilized resources to conduct experiments
    • Production systems, Stand-by systems, Test systems, On the cloud
  • Design:
    • Mechanisms: Home & garage containers, efficient snapshots of data
    • Policies: Specified by admins
      • If CPU, memory, disk utilization is below 20% for the past 10 minutes, then 70% resources can be taken for experiments
example mechanism set up on stand by system using zfs solaris and pitr





Example Mechanism set-up on Stand-by System using ZFS, Solaris, and PITR




Standby Environment


Standby Machine



Middle Tier

Apply WAL continuously

Workbench for conducting experiments

Write Ahead Log shipping





Copy on Write


Policy Manager


Experiment Planner & Scheduler


outline of the talk2
Outline of the talk
  • iTuned Planner
  • iTuned Executor
  • Evaluation
  • Conclusion
empirical evaluation 1
Empirical Evaluation (1)
  • Two database systems, PostgreSQL v8.2 and MySQL v5.0
  • Cluster of machines with 2GHz processor and 3GB RAM
  • Mixture of workloads
    • OLAP: Mixes of TPC-H queries
      • Varying #queries, #query_types, and MPL
      • Varying scale factors (SF = 1 to SF = 10)
    • OLTP: TPC-W and RuBIS
  • Number of parameters varied: up to 30
empirical evaluation 2
Empirical Evaluation (2)
  • Techniques compared
    • Default parameter settings shipped (D)
    • Manual rule-based tuning (M)
    • Smart Hill Climbing (S)
      • State-of-the-art technique
    • Brute-Force search (B)
      • Run many experiments to find approximation to optimal setting
    • iTuned (I)
  • Evaluation metrics
    • Quality: workload running time after tuning
    • Efficiency: time needed for tuning
comparison of tuning quality
Comparison of Tuning Quality

Simple Workload with one TPC-H Query (Q1)

Complex Workload with mix of TPC-H Queries (Q1+Q18)

ituned s efficiency and scalability
iTuned’s Efficiency and Scalability
  • Run experiments in parallel
  • Abort low-utility experiments early
ituned s sensitivity analysis
iTuned’s Sensitivity Analysis
  • Identify important parameters quickly
    • Use Sensitivity Analysis to reduce experiments
related work
Related work
  • Parameter tuning
    • Focus on specific classes of parameters (mainly memory related buffer pools) [ACM TOS’08, VLDB’06]
    • Statistical Approach for Ranking Parameters [SMDB’08]
    • Brute force approach to experiment design
  • Tools like DB2 Configuration advisor and pg_tune recommend default settings
  • Adaptive approaches to sampling [SIGMETRICS’06]
  • Work related to iTuned’s executor
    • Oracle SQL Performance Analyzer [SIGMOD’09, ICDE’09]
    • Virtualization, snapshots, suspend-resume
  • iTuned automates the tuning process by adaptively conducting experiments
    • Our initial results are promising
  • Future work
    • Apply database-specific knowledge to keep optimizer in loop for end-to-end tuning
      • Query plan information
    • Workload compression
    • Experiments in cloud
  • Thank You