
Automated Parameter Setting Based on Runtime Prediction:

Towards an Instance-Aware Problem Solver

Frank Hutter, Univ. of British Columbia, Vancouver, Canada

Youssef Hamadi, Microsoft Research, Cambridge, UK


Motivation (1): Why automated parameter setting?

  • We want to use the best available heuristic for a problem

    • Strong domain-specific heuristics in tree search

      • Domain knowledge helps to pick good heuristics

      • But maybe you don't know the domain ahead of time ...

    • Local search parameters must be tuned

      • Performance depends crucially on parameter setting

  • New application/algorithm:

    • Restart parameter tuning from scratch

    • A waste of time for both researchers and practitioners

  • Comparability

    • Is algorithm A faster than algorithm B, or did its authors just spend more time tuning it?

Motivation (2): Operational scenario

  • CP solver has to solve instances from a variety of domains

  • Domains not known a priori

  • Solver should automatically use best strategy for each instance

  • Want to learn from instances we solve

Overview

  • Previous work on runtime prediction that we build on [Leyton-Brown, Nudelman et al. ’02 & ’04]

  • Part I: Automated parameter setting based on runtime prediction

  • Part II: Incremental learning for runtime prediction in a priori unknown domains

  • Experiments

  • Conclusions

Previous work on runtime prediction for algorithm selection

  • General approach

    • Portfolio of algorithms

    • For each instance, choose the algorithm that promises to be fastest

  • Examples

    • [Lobjois and Lemaître, AAAI’98] CSP

      • Portfolio members differ mostly in the complexity of their propagation

    • [Leyton-Brown et al., CP’02] Combinatorial auctions

      • CPLEX + 2 other algorithms (which were thought uncompetitive)

    • [Nudelman et al., CP’04] SAT

      • Many tree-search algorithms from last SAT competition

  • On average considerably faster than any single algorithm

Runtime prediction: Basics (1 algorithm) [Leyton-Brown, Nudelman et al. ’02 & ’04]

  • Training: Given a set of t instances z1,...,zt

    • For each instance zi

      • Compute features xi = (xi1,...,xim)

      • Run algorithm to get its runtime yi

    • Collect (xi ,yi) pairs

    • Learn a function f: X → R (features → runtime) with yi ≈ f(xi)

  • Test: Given a new instance zt+1

    • Compute features xt+1

    • Predict runtime yt+1 = f(xt+1)

Training is expensive (it requires real algorithm runs); testing is cheap (feature computation plus one evaluation of f). A minimal sketch follows.
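To make this training/test protocol concrete, here is a minimal Python sketch; `compute_features` and `run_algorithm` are hypothetical stand-ins for the feature computation and the actual solver run.

```python
import numpy as np

def collect_training_data(instances, compute_features, run_algorithm):
    """Training (the expensive part): run the algorithm on every instance z1,...,zt."""
    X, y = [], []
    for z in instances:
        X.append(compute_features(z))         # cheap: takes seconds per instance
        y.append(np.log10(run_algorithm(z)))  # expensive: one full run; store log runtime
    return np.array(X), np.array(y)

def predict_runtime(f, z, compute_features):
    """Test (the cheap part): features plus one evaluation of the learned function f."""
    return f(compute_features(z))
```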

Runtime prediction: Linear regression [Leyton-Brown, Nudelman et al. ’02 & ’04]

  • The learned function f has to be linear in the features xi = (xi1,...,xim)

    • yi ≈ f(xi) = Σj=1..m xij · wj = xi · w

    • The learning problem thus reduces to fitting the weights w = (w1,...,wm)

  • To better capture the vast differences in runtime, estimate the logarithm of runtime: e.g., yi = 5 means a runtime of 10^5 sec
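A minimal sketch of the weight fitting, assuming an ordinary least-squares objective on the log runtimes; the small `ridge` term is my addition for numerical stability, not part of the original formulation.

```python
import numpy as np

def fit_weights(X, y, ridge=1e-6):
    """Fit w so that y ≈ X @ w by solving the (lightly regularized) normal equations."""
    m = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(m), X.T @ y)

# y holds log10 runtimes, so a prediction x @ w == 5
# corresponds to a predicted runtime of 10**5 seconds.
```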

Runtime prediction: Feature engineering [Leyton-Brown, Nudelman et al. ’02 & ’04]

  • Features can be computed quickly (in seconds)

    • Basic properties like #vars, #clauses, ratio

    • Estimates of search space size

    • Linear programming bounds

    • Local search probes

  • Linear functions are not very powerful

  • But you can use the same methodology to learn more complex functions

    • Let φ = (φ1,...,φq) be arbitrary combinations of the features x1,...,xm (so-called basis functions)

    • Learn a linear function of the basis functions: f(φ) = φ · w

  • Basis functions used in [Nudelman et al. ’04]

    • Original features: xi

    • Pairwise products of features: xi * xj

    • Only a subset of these (useless basis functions are dropped); a sketch of the expansion follows
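A sketch of that basis-function expansion (raw features plus pairwise products); the subset-selection step that drops useless basis functions is omitted here.

```python
import numpy as np

def quadratic_basis(x):
    """Basis functions in the style of [Nudelman et al. '04]:
    the raw features x1,...,xm plus all pairwise products xi * xj (i <= j)."""
    x = np.asarray(x, dtype=float)
    pairs = np.outer(x, x)[np.triu_indices(len(x))]  # xi * xj for i <= j
    return np.concatenate([x, pairs])
```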

Algorithm selection based on runtime prediction [Leyton-Brown, Nudelman et al. ’02 & ’04]

  • Given n different algorithms A1,...,An

  • Training:

    • Learn n separate functions fj: Φ → R, j = 1,...,n

  • Test:

    • Predict runtime yjt+1 = fj(φt+1) for each of the algorithms

    • Choose the algorithm Aj with minimal yjt+1

Training is really expensive (all n algorithms must be run on every training instance); testing is cheap. A selection sketch follows.
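Selection is then just an argmin over the n learned models; a sketch, assuming each model is a callable from basis functions to predicted log runtime.

```python
import numpy as np

def select_algorithm(models, phi):
    """Return the index j of the algorithm Aj whose learned function fj
    predicts the smallest log runtime on basis functions phi."""
    return int(np.argmin([f(phi) for f in models]))
```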

Overview

  • Previous work on runtime prediction that we build on [Leyton-Brown, Nudelman et al. ’02 & ’04]

  • Part I: Automated parameter setting based on runtime prediction

  • Part II: Incremental learning for runtime prediction in a priori unknown domains

  • Experiments

  • Conclusions

Parameter setting based on runtime prediction

  • Finding the best default parameter setting for a problem class

    • Generate special-purpose code [Minton ’93]

    • Minimize estimated error [Kohavi & John ’95]

    • Racing algorithm [Birattari et al. ’02]

    • Local search [Hutter ’04]

    • Experimental design [Adenso-Díaz & Laguna ’05]

    • Decision trees [Srivastava & Mediratta ’05]

  • Runtime prediction for algorithm selection on a per-instance basis

    • Predict runtime for each algorithm and pick the best [Leyton-Brown, Nudelman et al. ’02 & ’04]

  • Runtime prediction for setting parameters on a per-instance basis (this work)

Naive application of runtime prediction for parameter setting

  • Given one algorithm with n different parameter settings P1,...,Pn

  • Training:

    • Learn n separate functions fj: Φ → R, j = 1,...,n

  • Test:

    • Predict runtime yjt+1 = fj(φt+1) for each of the parameter settings

    • Run algorithm with setting Pj with minimal yjt+1

Training is too expensive (every setting would have to be run on every training instance); testing is fairly cheap.

  • If there are too many parameter configurations:

    • Cannot run each parameter setting on each instance

    • Need to generalize (cf. human parameter tuning)

    • With separate functions there is no way to generalize

Generalization by parameter sharing

[Figure: the naive approach learns a separate weight vector w1,...,wn for each parameter setting, each fit only on that setting’s runtimes; our approach learns a single shared weight vector w from the runtimes of all settings]

  • Naive approach: n separate functions.

  • Information on the runtime of setting i cannot inform predictions for another setting j ≠ i

  • Our approach: 1 single function.

  • Information on the runtime of setting i can inform predictions for another setting j ≠ i

Application of runtime prediction for parameter setting

  • View the parameters as additional features, learn a single function

  • Training: Given a set of instances z1,...,zt

    • For each instance zi

      • Compute features xi

      • Pick some parameter settings p1,...,pn

      • Run the algorithm with settings p1,...,pn to get runtimes y1i, ..., yni

      • Basis functions φ1i, ..., φni include the parameter settings

    • Collect pairs (φji, yji) (n data points per instance)

    • Learn only a single function g: Φ → R

  • Test: Given a new instance zt+1

    • Compute features xt+1

    • Search over parameter settings pj; to evaluate pj, compute φjt+1 and check g(φjt+1)

    • Run with best predicted parameter setting p*

Training is moderately expensive (only some settings are run per instance); testing is cheap. A sketch of the test-time search follows.
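A sketch of the test-time search, assuming a learned model `g`, a hypothetical helper `make_basis` that combines instance features with a candidate setting, and a simple grid over the parameters (any search strategy would do).

```python
import itertools
import numpy as np

def choose_setting(g, x, make_basis, grids):
    """Return the parameter setting with the smallest predicted log runtime.
    grids is one list of candidate values per parameter."""
    best_setting, best_pred = None, np.inf
    for setting in itertools.product(*grids):
        pred = g(make_basis(x, setting))  # evaluating g is cheap; no solver run needed
        if pred < best_pred:
            best_setting, best_pred = setting, pred
    return best_setting
```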

Summary of automated parameter setting based on runtime prediction

  • Learn a single function that maps features and parameter settings to runtime

  • Given a new instance

    • Compute the features (they are fixed for the instance)

    • Search for the parameter setting that minimizes predicted runtime for these features

Overview

  • Previous work on runtime prediction that we build on [Leyton-Brown, Nudelman et al. ’02 & ’04]

  • Part I: Automated parameter setting based on runtime prediction

  • Part II: Incremental learning for runtime prediction in a priori unknown domains

  • Experiments

  • Conclusions

Problem setting: Incremental learning for multiple domains



Solution: Sequential Bayesian Linear Regression

Update “knowledge” as new data arrives: a probability distribution over the weights w

  • Incremental (one (xi, yi) pair at a time)

    • Seamlessly integrates new data

    • “Optimal”: yields the same result as a batch approach

  • Efficient

    • Computation: 1 matrix inversion per update

    • Memory: data can be dropped once integrated

  • Robust

    • Simple to implement (3 lines of Matlab)

    • Provides estimates of uncertainty in prediction

What are uncertainty estimates?


Sequential Bayesian linear regression – intuition

  • Instead of predicting a single runtime y, predict a probability distribution P(Y)

  • The mean of P(Y) is exactly the prediction of the non-Bayesian approach, but we additionally get uncertainty estimates

[Figure: Gaussian P(Y) over log runtime Y, centered at the mean predicted runtime; its width shows the uncertainty of the prediction]


Sequential Bayesian linear regression – technical

  • Standard linear regression:

    • Training: given training data φ1:n, y1:n, fit the weights w such that y1:n ≈ Φ1:n · w

    • Prediction: yn+1 = φn+1 · w

  • Bayesian linear regression:

    • Training: given training data φ1:n, y1:n, infer a probability distribution P(w | φ1:n, y1:n) ∝ P(w) · Πi P(yi | φi, w) (Gaussian prior, assumed Gaussian likelihood)

    • Prediction: P(yn+1 | φn+1, φ1:n, y1:n) = ∫ P(yn+1 | w, φn+1) · P(w | φ1:n, y1:n) dw (Gaussian)

  • “Knowledge” about the weights: Gaussian N(μw, Σw)
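A numpy sketch of the sequential update (the “3 lines of Matlab”); it assumes a fixed observation-noise variance `noise_var`, and because each update involves a single data point, the rank-1 (Kalman-style) form below even avoids an explicit matrix inversion.

```python
import numpy as np

def bayes_update(mu, Sigma, phi, y, noise_var=1.0):
    """Fold one (phi, y) observation into the Gaussian N(mu, Sigma) over the weights w.
    Processing points one at a time yields the same posterior as a batch fit."""
    s = noise_var + phi @ Sigma @ phi              # predictive variance at phi
    K = Sigma @ phi / s                            # gain vector
    mu_new = mu + K * (y - phi @ mu)               # shift mean toward the observation
    Sigma_new = Sigma - np.outer(K, phi @ Sigma)   # shrink the uncertainty
    return mu_new, Sigma_new

def predictive(mu, Sigma, phi, noise_var=1.0):
    """Gaussian predictive distribution over log runtime: (mean, variance)."""
    return phi @ mu, phi @ Sigma @ phi + noise_var
```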


Sequential Bayesian linear regression – visualized

  • Start with a prior P(w) with very high uncertainty

  • After the first data point (φ1, y1), update to the posterior: P(w | φ1, y1) ∝ P(w) · P(y1 | φ1, w)

  • Prediction with the prior over w is very spread out; prediction with the posterior w | φ1, y1 is already much more peaked

[Figure: left, the distribution over a weight wi before (P(wi)) and after (P(wi | φ1, y1)) the update; right, the corresponding predictive distributions P(y2 | φ, w) over log runtime y2]

Summary of incremental learning for runtime prediction

  • Have a probability distribution over the weights:

    • Start with a Gaussian prior, incrementally update it with more data

  • Given the Gaussian weight distribution, the predictions are also Gaussians

    • We know how uncertain our predictions are

    • For new domains, we will be very uncertain and only grow more confident after having seen a couple of data points
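A usage sketch of the `bayes_update` and `predictive` helpers from the previous slide, with made-up dimensions and data: start from a broad prior, fold in one point at a time, and watch the predictive variance shrink.

```python
import numpy as np

dim = 3                       # number of basis functions (illustrative)
mu = np.zeros(dim)            # prior mean over the weights
Sigma = 100.0 * np.eye(dim)   # broad prior = very high initial uncertainty

data = [(np.array([1.0, 0.2, 0.5]), 2.3),   # made-up (phi, log-runtime) pairs
        (np.array([1.0, 0.8, 0.1]), 4.1)]
for phi, y in data:
    mu, Sigma = bayes_update(mu, Sigma, phi, y)

mean, var = predictive(mu, Sigma, np.array([1.0, 0.5, 0.3]))
print(f"predicted log10 runtime: {mean:.2f} +/- {var ** 0.5:.2f}")
```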



Overview

  • Previous work on runtime prediction that we build on [Leyton-Brown, Nudelman et al. ’02 & ’04]

  • Part I: Automated parameter setting based on runtime prediction

  • Part II: Incremental learning for runtime prediction in a priori unknown domains

  • Experiments

  • Conclusions

Domain for our experiments

  • SAT

    • Best studied NP-hard problem

    • Good features already exist [Nudelman et al.’04]

    • Lots of benchmarks

  • Stochastic Local Search (SLS)

    • Runtime prediction has never been done for SLS before

    • Parameter tuning is very important for SLS

    • Parameters are often continuous

  • SAPS algorithm [Hutter, Tompkins, Hoos ‘02]

    • Still amongst the state-of-the-art

    • Default setting not always best

    • Well, I also know it well ;-)

  • But the approach is applicable to almost anything for which we can compute features!

Stochastic Local Search for SAT: Scaling and Probabilistic Smoothing (SAPS) [Hutter, Tompkins, Hoos ’02]

  • Clause weighting algorithm for SAT, was state-of-the-art in 2002

    • Start with all clause weights set to 1

    • Hillclimbing until you hit a local minimum

    • In local minima:

      • Scaling: scale the weights of unsatisfied clauses: wc ← α · wc

      • Probabilistic smoothing: with probability Psmooth, smooth all clause weights: wc ← ρ · wc + (1−ρ) · average(wc)

  • Default parameter setting: (α, ρ, Psmooth) = (1.3, 0.8, 0.05)

  • Psmooth and ρ are very closely related
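A sketch of just these two update rules at a local minimum (not a full SAPS implementation; `weights` is the clause-weight list and `unsat` holds the indices of currently unsatisfied clauses).

```python
import random

def saps_local_min_update(weights, unsat, alpha=1.3, rho=0.8, p_smooth=0.05):
    """React to a local minimum: scale the weights of unsatisfied clauses,
    then smooth all clause weights with probability p_smooth."""
    for c in unsat:
        weights[c] *= alpha                                     # scaling
    if random.random() < p_smooth:
        avg = sum(weights) / len(weights)
        weights = [rho * w + (1 - rho) * avg for w in weights]  # smoothing
    return weights
```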

Benchmark instances

  • Only satisfiable instances!

  • SAT04rand: SAT ‘04 competition instances

  • mix: mix of lots of different domains from SATLIB: random, graph colouring, blocksworld, inductive inference, logistics, ...

Adaptive parameter setting vs. SAPS default on SAT04rand

  • Trained on mix and used to choose parameters for SAT04rand

  • 2 {0.5,0.6,0.7,0.8}

  • 2 {1.1,1.2,1.3}

  • For SAPS: #steps  time

  • Adaptive variant on average 2.5 times faster than default

    • But default is not strong here

Where uncertainty helps in practice: qualitative differences in training & test set

  • Trained on mix, tested on SAT04rand

[Figure: predicted vs. actual runtime, with estimates of the uncertainty of each prediction; the diagonal marks optimal prediction]

Where uncertainty helps in practice (2): Zoomed to predictions with low uncertainty

[Figure: the same plot, zoomed to predictions with low uncertainty; the diagonal marks optimal prediction]

Overview

  • Previous work on runtime prediction that we build on [Leyton-Brown, Nudelman et al. ’02 & ’04]

  • Part I: Automated parameter setting based on runtime prediction

  • Part II: Incremental learning for runtime prediction in a priori unknown domains

  • Experiments

  • Conclusions

Conclusions

  • Automated parameter tuning is needed and feasible

    • Algorithm experts waste their time on it

    • Solver can automatically choose appropriate heuristics based on instance characteristics

  • Such a solver could be used in practice

    • Learns incrementally from the instances it solves

    • Uncertainty estimates prevent catastrophic errors in estimates for new domains

Future work along these lines

  • Increase predictive performance

    • Better features

    • More powerful ML algorithms

  • Active learning

    • Run most informative probes for new domains (need the uncertainty estimates)

  • Use uncertainty

    • Pick algorithm with maximal probability of success (not the one with minimal expected runtime!)

  • More domains

    • Tree search algorithms

    • CP

Future work along related lines

  • If there are no features:

    • Local search in parameter space to find the best default parameter setting [Hutter ‘04]

  • If we can change strategies while running the algorithm:

    • Reinforcement learning for algorithm selection [Lagoudakis & Littman ’00]

    • Low-knowledge algorithm control [Carchrae & Beck ’05]

The End

  • Thanks to

    • Youssef Hamadi

    • Kevin Leyton-Brown

    • Eugene Nudelman

    • You, for your attention :-)
