Presentation Transcript

### Automated Parameter Setting Based on Runtime Prediction:

Towards an Instance-Aware Problem Solver

Frank Hutter, Univ. of British Columbia, Vancouver, Canada

Youssef Hamadi, Microsoft Research, Cambridge, UK

Motivation (1): Why automated parameter setting?

- We want to use the best available heuristic for a problem
  - Strong domain-specific heuristics in tree search
  - Domain knowledge helps to pick good heuristics
  - But maybe you don't know the domain ahead of time ...
- Local search parameters must be tuned
  - Performance depends crucially on the parameter setting
- New application/algorithm:
  - Restart parameter tuning from scratch
  - Waste of time both for researchers and practitioners
- Comparability
  - Is algorithm A faster than algorithm B only because more time was spent tuning it?


Motivation (2): Operational scenario

- A CP solver has to solve instances from a variety of domains
- The domains are not known a priori
- The solver should automatically use the best strategy for each instance
- We want to learn from the instances we solve


Overview

- Previous work on runtime prediction we build on [Leyton-Brown, Nudelman et al. ’02 & ’04]
- Part I: Automated parameter setting based on runtime prediction
- Part II: Incremental learning for runtime prediction in a priori unknown domains
- Experiments
- Conclusions


Previous work on runtime prediction for algorithm selection

- General approach
  - Portfolio of algorithms
  - For each instance, choose the algorithm that promises to be fastest
- Examples
  - [Lobjois and Lemaître, AAAI’98] CSP: mostly propagations of different complexity
  - [Leyton-Brown et al., CP’02] Combinatorial auctions: CPLEX + 2 other algorithms (which were thought to be uncompetitive)
  - [Nudelman et al., CP’04] SAT: many tree-search algorithms from the last SAT competition
- On average considerably faster than each single algorithm


Runtime prediction: Basics (1 algorithm) [Leyton-Brown, Nudelman et al. ’02 & ’04]

- Training (expensive): Given a set of t instances z1,...,zt
  - For each instance zi:
    - Compute features xi = (xi1,...,xim)
    - Run the algorithm to get its runtime yi
  - Collect the (xi, yi) pairs
  - Learn a function f: X → R (features → runtime), yi ≈ f(xi)
- Test (cheap): Given a new instance zt+1
  - Compute features xt+1
  - Predict the runtime yt+1 = f(xt+1)


Runtime prediction: Linear regression [Leyton-Brown, Nudelman et al. ’02 & ’04]

- The learned function f has to be linear in the features xi = (xi1,...,xim)
  - yi ≈ f(xi) = Σj=1..m xij * wj = xi · w
  - The learning problem thus reduces to fitting the weights w = w1,...,wm
- To better capture the vast differences in runtime, estimate the logarithm of the runtime: e.g., yi = 5 means the runtime is 10^5 sec
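The fitting step above can be sketched as a plain least-squares fit of weights on log runtimes; the feature values and runtimes below are made up for illustration, not data from the paper:

```python
import numpy as np

# Made-up training data: 5 instances, 3 features each
X = np.array([[10.,  42., 0.2],
              [20.,  80., 0.4],
              [30., 120., 0.5],
              [40., 170., 0.7],
              [50., 210., 0.9]])
runtimes = np.array([0.5, 2.0, 9.0, 40.0, 160.0])   # seconds
y = np.log10(runtimes)                               # predict log runtime

# Fit weights w such that y ≈ X w (least squares)
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict the log runtime of a new instance, then undo the log
x_new = np.array([25., 100., 0.45])
log_pred = x_new @ w
runtime_pred = 10 ** log_pred    # back to seconds
```

Fitting on log10 of the runtime keeps instances that differ by orders of magnitude from dominating the squared error.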


Runtime prediction: Feature engineering [Leyton-Brown, Nudelman et al. ’02 & ’04]

- Features can be computed quickly (in seconds)
  - Basic properties like #vars, #clauses, and their ratio
  - Estimates of search space size
  - Linear programming bounds
  - Local search probes
- Linear functions are not very powerful
  - But you can use the same methodology to learn more complex functions
  - Let φ = (φ1,...,φq) be arbitrary combinations of the features x1,...,xm (so-called basis functions)
  - Learn a linear function of the basis functions: f(φ) = φ · w
- Basis functions used in [Nudelman et al. ’04]
  - Original features xi
  - Pairwise products of features xi * xj
  - Only a subset of these (useless basis functions are dropped)
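The pairwise-product expansion can be sketched as follows (the helper name is my own; the subset selection that drops useless basis functions is omitted here):

```python
import numpy as np
from itertools import combinations_with_replacement

def quadratic_basis(x):
    """Original features plus all pairwise products x_i * x_j (i <= j)."""
    x = np.asarray(x, dtype=float)
    products = [x[i] * x[j] for i, j in
                combinations_with_replacement(range(len(x)), 2)]
    return np.concatenate([x, products])

phi = quadratic_basis([2.0, 3.0, 5.0])
# 3 original features + 6 pairwise products = 9 basis functions
```

A linear model over `phi` can then represent quadratic interactions between the original features while the fitting machinery stays exactly the same.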


Algorithm selection based on runtime prediction [Leyton-Brown, Nudelman et al. ’02 & ’04]

- Given n different algorithms A1,...,An
- Training (really expensive: every algorithm must be run on every instance):
  - Learn n separate functions fj: Φ → R, j = 1...n
- Test (cheap):
  - Predict the runtime yj,t+1 = fj(φt+1) for each of the algorithms
  - Choose the algorithm Aj with minimal yj,t+1
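The test step reduces to an argmin over per-algorithm predictors. In this sketch the two predictors are toy stand-ins for learned functions fj, not real models:

```python
# Toy stand-ins for learned per-algorithm predictors f_j
# (basis-function vector -> predicted log runtime)
predictors = {
    "solver_A": lambda phi: 0.5 * phi[0] + 0.1 * phi[1],
    "solver_B": lambda phi: 0.2 * phi[0] + 0.4 * phi[1],
}

def select_algorithm(phi):
    """Choose the algorithm with minimal predicted runtime."""
    return min(predictors, key=lambda name: predictors[name](phi))

best = select_algorithm([1.0, 10.0])  # A: 1.5 < B: 4.2 -> solver_A
```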


Parameter setting based on runtime prediction

Finding the best default parameter setting for a problem class:

- Generate special-purpose code [Minton ’93]
- Minimize estimated error [Kohavi & John ’95]
- Racing algorithms [Birattari et al. ’02]
- Local search [Hutter ’04]
- Experimental design [Adenso-Díaz & Laguna ’05]
- Decision trees [Srivastava & Mediratta ’05]

Runtime prediction for algorithm selection on a per-instance basis:

- Predict the runtime for each algorithm and pick the best [Leyton-Brown, Nudelman et al. ’02 & ’04]

Runtime prediction for setting parameters on a per-instance basis


Naive application of runtime prediction for parameter setting

- Given one algorithm with n different parameter settings P1,...,Pn
- Training (too expensive):
  - Learn n separate functions fj: Φ → R, j = 1...n
- Test (fairly cheap):
  - Predict the runtime yj,t+1 = fj(φt+1) for each of the parameter settings
  - Run the algorithm with the setting Pj with minimal yj,t+1
- If there are too many parameter configurations:
  - We cannot run each parameter setting on each instance
  - We need to generalize (cf. human parameter tuning)
  - With separate functions there is no way to generalize


Generalization by parameter sharing

- Naive approach: n separate functions, with one weight vector per setting (w1,...,wn), each fit to its own runtimes y1,1:t,...,yn,1:t. Information on the runtime of setting i cannot inform predictions for a setting j ≠ i.
- Our approach: 1 single function with one shared weight vector w, fit to all runtimes jointly. Information on the runtime of setting i can inform predictions for a setting j ≠ i.


Application of runtime prediction for parameter setting

- View the parameters as additional features and learn a single function
- Training (moderately expensive): Given a set of instances z1,...,zt
  - For each instance zi:
    - Compute features xi
    - Pick some parameter settings p1,...,pn
    - Run the algorithm with settings p1,...,pn to get runtimes y1i,...,yni
    - The basis functions φ1i,...,φni include the parameter settings
  - Collect the pairs (φji, yji) (n data points per instance)
  - Learn only a single function g: Φ → R
- Test (cheap): Given a new instance zt+1
  - Compute features xt+1
  - Search over parameter settings pj; to evaluate pj, compute φj,t+1 and check g(φj,t+1)
  - Run with the best predicted parameter setting p*
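The single-function approach can be sketched as below. The weight vector, feature values, and parameter grids are illustrative placeholders (in the real method w is learned from (φ, runtime) pairs, and the basis functions may also include products of features and parameters):

```python
import numpy as np
from itertools import product

# One shared weight vector over [instance features, parameter values]
# (toy weights standing in for learned ones; 2 features, 2 parameters)
w = np.array([0.3, 0.1, -0.5, 0.8])

def g(features, params):
    """Predicted log runtime from instance features plus parameter setting."""
    phi = np.concatenate([features, params])
    return phi @ w

def best_setting(features, grid_a, grid_b):
    """Grid-search the parameter space, minimizing predicted runtime."""
    return min(product(grid_a, grid_b),
               key=lambda p: g(features, np.array(p)))

features = np.array([5.0, 2.0])
p_star = best_setting(features, [1.1, 1.2, 1.3], [0.5, 0.6, 0.7, 0.8])
```

Because the features are fixed for a given instance, only the parameter part of φ changes during the search, which is what makes the test-time search cheap.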


Summary of automated parameter setting based on runtime prediction

- Learn a single function that maps features and parameter settings to runtime
- Given a new instance:
  - Compute the features (they are fixed)
  - Search for the parameter setting that minimizes the predicted runtime for these features


Problem setting: Incremental learning for multiple domains


Solution: Sequential Bayesian Linear Regression

Update “knowledge“ as new data arrives: a probability distribution over the weights w

- Incremental (one (xi, yi) pair at a time)
  - Seamlessly integrates the new data
  - “Optimal“: yields the same result as a batch approach
- Efficient
  - Computation: 1 matrix inversion per update
  - Memory: data can be dropped once it is integrated
- Robust
  - Simple to implement (3 lines of Matlab)
  - Provides estimates of the uncertainty in the prediction


What are uncertainty estimates?


Sequential Bayesian linear regression – intuition

- Instead of predicting a single runtime y, predict a probability distribution P(Y)
- The mean of P(Y) is exactly the prediction of the non-Bayesian approach, but we additionally get uncertainty estimates

(Figure: the distribution P(Y) over log runtime Y, centered at the mean predicted runtime; its spread is the uncertainty of the prediction.)


Sequential Bayesian linear regression – technical

- Standard linear regression:
  - Training: given training data φ1:n, y1:n, fit the weights w such that y1:n ≈ Φ1:n * w
  - Prediction: yn+1 = φn+1 * w
- Bayesian linear regression:
  - Training: given training data φ1:n, y1:n, infer the probability distribution P(w|φ1:n, y1:n) ∝ P(w) * Πi P(yi|φi, w)
  - Prediction: P(yn+1|φn+1, φ1:n, y1:n) = ∫ P(yn+1|w, φn+1) * P(w|φ1:n, y1:n) dw
- “Knowledge“ about the weights: a Gaussian N(μw, Σw); with the likelihood assumed Gaussian, the posterior and the predictions are Gaussian as well
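The sequential update can be sketched as follows, assuming a zero-mean Gaussian prior and a known noise variance (choices the slides do not pin down); it performs the “1 matrix inversion per update“ mentioned above and returns Gaussian predictions:

```python
import numpy as np

class SeqBayesLinReg:
    """Sequential Bayesian linear regression with prior N(0, tau^2 I)
    and Gaussian observation noise sigma^2 (assumed known here)."""

    def __init__(self, dim, tau=10.0, sigma=0.5):
        self.mu = np.zeros(dim)            # posterior mean of the weights
        self.cov = tau ** 2 * np.eye(dim)  # posterior covariance
        self.sigma2 = sigma ** 2

    def update(self, phi, y):
        """Integrate one (basis functions, log runtime) pair."""
        old_prec = np.linalg.inv(self.cov)
        new_cov = np.linalg.inv(old_prec + np.outer(phi, phi) / self.sigma2)
        self.mu = new_cov @ (old_prec @ self.mu + phi * y / self.sigma2)
        self.cov = new_cov

    def predict(self, phi):
        """Predictive Gaussian: mean and variance of the log runtime."""
        return phi @ self.mu, phi @ self.cov @ phi + self.sigma2

model = SeqBayesLinReg(dim=2)
m0, v0 = model.predict(np.array([1.0, 1.0]))   # before any data: huge variance
model.update(np.array([1.0, 0.0]), 2.0)
model.update(np.array([0.0, 1.0]), 3.0)
m1, v1 = model.predict(np.array([1.0, 1.0]))   # variance shrinks with data
```

Since each update only needs the current (μ, Σ), the data point can be dropped after integration, matching the memory property claimed above.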


Sequential Bayesian linear regression – visualized

- Start with a prior P(w) with very high uncertainty
- First data point (φ1, y1): update to the posterior P(w|φ1, y1) ∝ P(w) * P(y1|φ1, w)
- Prediction with the prior over w is much more uncertain than prediction with the posterior over w|φ1, y1

(Figure: the weight distributions P(wi) and P(wi|φ1, y1) over a weight wi, and the corresponding predictive distributions P(y2|φ, w) over the log runtime y2.)

Summary of incremental learning for runtime prediction

- We have a probability distribution over the weights:
  - Start with a Gaussian prior and incrementally update it with more data
- Given the Gaussian weight distribution, the predictions are also Gaussians
  - We know how uncertain our predictions are
  - For new domains, we will be very uncertain and only grow more confident after having seen a couple of data points


Domain for our experiments

- SAT
  - Best-studied NP-hard problem
  - Good features already exist [Nudelman et al. ’04]
  - Lots of benchmarks
- Stochastic Local Search (SLS)
  - Runtime prediction has never been done for SLS before
  - Parameter tuning is very important for SLS
  - Parameters are often continuous
- SAPS algorithm [Hutter, Tompkins, Hoos ’02]
  - Still amongst the state of the art
  - Default setting not always best
  - Well, I also know it well ;-)
- But the approach is applicable to almost anything for which we can compute features!


Stochastic Local Search for SAT: Scaling and Probabilistic Smoothing (SAPS) [Hutter, Tompkins, Hoos ’02]

- Clause-weighting algorithm for SAT, was state of the art in 2002
  - Start with all clause weights set to 1
  - Hill-climb until you hit a local minimum
  - In local minima:
    - Scaling: scale the weights of unsatisfied clauses: wc ← α * wc
    - Probabilistic smoothing: with probability Psmooth, smooth all clause weights: wc ← ρ * wc + (1-ρ) * average wc
- Default parameter setting: (α, ρ, Psmooth) = (1.3, 0.8, 0.05)
- Psmooth and ρ are very closely related
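The two update rules can be sketched as below (weight update only, not the full SAPS solver; the function name and the explicit `rand` argument are mine, the latter just to make the smoothing branch testable):

```python
import random

def saps_local_minimum_update(weights, unsat, alpha=1.3, rho=0.8,
                              p_smooth=0.05, rand=None):
    """One SAPS clause-weight update in a local minimum.

    Scaling: multiply the weight of every unsatisfied clause by alpha.
    Smoothing: with probability p_smooth, pull all weights toward their mean.
    """
    if rand is None:
        rand = random.random()
    weights = [w * alpha if c in unsat else w
               for c, w in enumerate(weights)]
    if rand < p_smooth:
        mean = sum(weights) / len(weights)
        weights = [rho * w + (1 - rho) * mean for w in weights]
    return weights

# Clause 0 unsatisfied; rand=1.0 forces the no-smoothing branch
w_new = saps_local_minimum_update([1.0, 1.0, 1.0, 1.0], unsat={0}, rand=1.0)
# -> [1.3, 1.0, 1.0, 1.0]
```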


Benchmark instances

- Only satisfiable instances!
- SAT04rand: SAT ’04 competition instances
- mix: a mix of many different domains from SATLIB: random, graph colouring, blocksworld, inductive inference, logistics, ...


Adaptive parameter setting vs. SAPS default on SAT04rand

- Trained on mix and used to choose parameters for SAT04rand
- α ∈ {1.1, 1.2, 1.3}, ρ ∈ {0.5, 0.6, 0.7, 0.8}
- For SAPS, #steps ∝ time
- The adaptive variant is on average 2.5 times faster than the default
- But the default is not strong here


Where uncertainty helps in practice: qualitative differences in training & test set

- Trained on mix, tested on SAT04rand

(Plot: predicted vs. actual runtime with uncertainty estimates of the predictions; the diagonal marks optimal prediction.)

Where uncertainty helps in practice (2): zoomed to predictions with low uncertainty

(Plot: the same predictions, restricted to those with low uncertainty, against the optimal-prediction diagonal.)


Conclusions

- Automated parameter tuning is needed and feasible
  - Algorithm experts currently waste their time on it
  - A solver can automatically choose appropriate heuristics based on instance characteristics
- Such a solver could be used in practice
  - It learns incrementally from the instances it solves
  - Uncertainty estimates prevent catastrophic errors in estimates for new domains


Future work along these lines

- Increase predictive performance
  - Better features
  - More powerful ML algorithms
- Active learning
  - Run the most informative probes for new domains (needs the uncertainty estimates)
- Use the uncertainty
  - Pick the algorithm with the maximal probability of success (not the one with minimal expected runtime!)
- More domains
  - Tree search algorithms
  - CP


Future work along related lines

- If there are no features:
  - Local search in parameter space to find the best default parameter setting [Hutter ’04]
- If we can change strategies while running the algorithm:
  - Reinforcement learning for algorithm selection [Lagoudakis & Littman ’00]
  - Low-knowledge algorithm control [Carchrae and Beck ’05]


The End

- Thanks to
  - Youssef Hamadi
  - Kevin Leyton-Brown
  - Eugene Nudelman
  - You, for your attention
