
CS 4700: Foundations of Artificial Intelligence

This module discusses the use of randomization in search: randomized strategies in local search and their limitations, introducing randomness into variable and value selection in complete tree search, heavy-tailed runtime distributions and their use in modeling real-world phenomena, and restart strategies.





Presentation Transcript


  1. CS 4700: Foundations of Artificial Intelligence. Carla P. Gomes (gomes@cs.cornell.edu). Module: Randomization in Complete Tree Search Algorithms. Wrap-up of Search!

  2. Randomization in Local Search • Randomized strategies are very successful in the area of local search: • Randomized hill climbing • Simulated annealing • Genetic algorithms • Tabu search • GSAT and variants. • Key limitation? The inherently incomplete nature of local search methods.

  3. Randomization in Tree Search • Introduce randomness in a tree search method, e.g., by randomly breaking ties in variable and/or value selection. • Why would we do that? Can we add a stochastic element to a systematic (tree search) procedure without losing completeness?

  4. Backtrack Search (a OR NOT b OR NOT c) AND (b OR NOT c) AND (a OR c)

  5. Backtrack Search: Two Different Executions (a OR NOT b OR NOT c) AND (b OR NOT c) AND (a OR c)

  6. The fringe of the search space

  7. Latin Square Completion: Randomized Backtrack Search. Easy instance (15% pre-assigned cells). Different runs on the same instance take times of 7, 11, and 30; runs marked (*) found no solution before reaching the cutoff of 2000. (Gomes et al. 97)

  8. Erratic Mean Behavior: the sample mean over repeated runs (on the same instance) fluctuates erratically (e.g., 500, 2000, 3500!), while the median is 1.

  9. (figure)

  10. Runtime distribution: F(x), the proportion of cases solved, plotted against the number of backtracks. 75% of the runs take at most 30 backtracks, yet 5% take more than 100,000.

  11. Run Time Distributions • The runtime distributions of some of the instances reveal interesting properties: • (I) Erratic behavior of the mean. • (II) Distributions have "heavy tails".

  12. Heavy-Tailed Distributions • ... infinite variance ... infinite mean • Introduced by Pareto in the 1920's as a "probabilistic curiosity." • Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. • Examples: the stock market, earthquakes, weather, ...

  13. Decay of Distributions • Standard --- exponential decay, e.g. Normal: Pr[X > x] ~ C e^(-x^2/2) • Heavy-tailed --- power-law decay, e.g. Pareto-Levy: Pr[X > x] ~ C x^(-alpha), 0 < alpha < 2

  14. Normal, Cauchy, and Levy: the Normal has exponential decay; the Cauchy and Levy have power-law decay.

  15. Tail Probabilities (Standard Normal, Cauchy, Levy)

  16. Fat-tailed distributions • Kurtosis = fourth central moment / squared second central moment (i.e., variance squared). • The Normal distribution has kurtosis 3; a distribution is called fat-tailed when its kurtosis exceeds 3 (e.g., exponential, lognormal).
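The kurtosis comparison above can be checked empirically. A minimal Python sketch (the `kurtosis` helper and the sample sizes are mine, not from the slides):

```python
import random
import statistics

def kurtosis(xs):
    """Fourth central moment divided by the squared variance
    (squared second central moment), as defined on the slide."""
    m = statistics.fmean(xs)
    var = sum((x - m) ** 2 for x in xs) / len(xs)
    fourth = sum((x - m) ** 4 for x in xs) / len(xs)
    return fourth / var ** 2

random.seed(0)
normal = [random.gauss(0, 1) for _ in range(100_000)]
expo = [random.expovariate(1.0) for _ in range(100_000)]

print(kurtosis(normal))  # close to 3 (Normal)
print(kurtosis(expo))    # close to 9 > 3: the exponential is fat-tailed
```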

  17. Fat and Heavy-Tailed Distributions • Standard distributions, e.g. Normal, Lognormal, Exponential, have tails that decay faster than any power law (exponential decay in the Normal case). • Heavy-tailed distributions have power-law decay, e.g. Pareto-Levy: Pr[X > x] ~ C x^(-alpha), 0 < alpha < 2

  18. Pareto Distribution, with shape parameter alpha > 0: • Density function: f(x) = alpha / x^(alpha + 1), for x >= 1 • Distribution function: F(x) = P[X <= x] = 1 - 1/x^alpha, for x >= 1 • Survival function (tail probability): S(x) = 1 - F(x) = P[X > x] = 1/x^alpha, for x >= 1
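The three Pareto formulas translate directly into code; a small sketch (the function names are mine):

```python
def pareto_pdf(x, alpha=1.0):
    """Density f(x) = alpha / x**(alpha + 1) for x >= 1."""
    return alpha / x ** (alpha + 1) if x >= 1 else 0.0

def pareto_cdf(x, alpha=1.0):
    """Distribution function F(x) = P[X <= x] = 1 - 1/x**alpha for x >= 1."""
    return 1.0 - x ** -alpha if x >= 1 else 0.0

def pareto_survival(x, alpha=1.0):
    """Survival function (tail probability) S(x) = P[X > x] = 1/x**alpha."""
    return x ** -alpha if x >= 1 else 1.0

# Sanity checks against the formulas: F + S = 1, and for alpha = 1
# half of the probability mass lies beyond x = 2.
print(pareto_cdf(2.0))        # 0.5
print(pareto_survival(10.0))  # 0.1
```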

  19. Pareto Distribution • Moments: E[X^n] = alpha / (alpha - n) if n < alpha; E[X^n] = infinity if n >= alpha. • Mean: E[X] = alpha / (alpha - 1) if alpha > 1; E[X] = infinity if alpha <= 1. • Variance: var(X) = alpha / [(alpha - 1)^2 (alpha - 2)] if alpha > 2; var(X) = infinity if alpha <= 2.

  20. How to Check for "Heavy Tails"? • Power-law decay of the tail: the log-log plot of the tail of the distribution (survival function S(x) = 1 - F(x); e.g., for the Pareto, S(x) = 1/x^alpha for x >= 1) should be approximately linear. • The slope gives the value of alpha: alpha <= 1 implies infinite mean and infinite variance; 1 < alpha <= 2 implies infinite variance.
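The check above can be sketched numerically: draw Pareto samples by inverse-CDF sampling, take the empirical survival function on a log-log scale, and read off alpha from the slope (here a least-squares fit stands in for the visual plot; the sampling scheme and grid are my own illustration):

```python
import math
import random

random.seed(1)
alpha = 1.5
# Inverse-CDF sampling: if U ~ Uniform(0,1), then U**(-1/alpha) is Pareto(alpha).
samples = sorted(random.random() ** (-1.0 / alpha) for _ in range(200_000))
n = len(samples)

# Points (log x, log S(x)) of the empirical survival function.
pts = [(math.log(samples[i]), math.log((n - i) / n))
       for i in range(0, n - 1, n // 200)]

# Least-squares slope; for a power-law tail it is approximately -alpha.
mx = sum(x for x, _ in pts) / len(pts)
my = sum(y for _, y in pts) / len(pts)
slope = (sum((x - mx) * (y - my) for x, y in pts)
         / sum((x - mx) ** 2 for x, _ in pts))
print(slope)  # close to -1.5
```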

  21. Pareto =1Lognormal 1,1 Lognormal(1,1) Pareto(1) f(x) X Infinite mean and infinite variance.

  22. How to Visually Check for Heavy-Tailed Behavior: the log-log plot of the tail of the distribution exhibits linear behavior.

  23. Survival Function: Pareto and Lognormal

  24. Example of a Heavy-Tailed Model • Random walk: start at position 0 and toss a fair coin: with each head take a step up (+1), with each tail take a step down (-1). • X --- the number of steps the random walk takes to return to position 0.
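The return time X of this walk is a classic heavy-tailed quantity: P[X = 2] = 1/2, so the median is 2, yet P[X > x] decays like x^(-1/2) and the mean is infinite. A small simulation sketch (the cap and sample count are my choices):

```python
import random

def return_time(rng, cap=1_000_000):
    """Number of +/-1 steps a fair walk takes to first return to 0
    (capped, since the tail is so heavy that single runs can be enormous)."""
    pos, steps = 0, 0
    while True:
        pos += 1 if rng.random() < 0.5 else -1
        steps += 1
        if pos == 0 or steps >= cap:
            return steps

rng = random.Random(42)
times = [return_time(rng) for _ in range(2_000)]

print(times.count(2) / len(times))  # about 0.5: half the runs return at once
print(max(times))                   # yet some runs are extremely long
```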

  25. The record of 10,000 tosses of an ideal coin (Feller): note the long periods without a zero crossing.

  26. Heavy-tails vs. Non-Heavy-Tails: unsolved fraction 1 - F(x) vs. X, the number of steps the walk takes to return to zero (log scale), for the random walk (50% return at the median of 2; 0.1% take > 200,000 steps), compared with Normal(2,1) and Normal(2,1000000).

  27. Heavy-Tailed Behavior in the Latin Square Completion Problem: unsolved fraction 1 - F(x) (log) vs. number of backtracks (log). The linear tail, running from 18% unsolved down to 0.002% unsolved, implies an infinite mean.

  28. How Toby Walsh Fried his PC (Graph Coloring). (Walsh 99)

  29. To Be or Not To Be Heavy-Tailed

  30. Random Binary CSP Models: Model E <N, D, p> • N – number of variables; D – size of the domains; p – proportion of forbidden pairs (out of D^2 N(N-1)/2) • N from 15 to 50. (Achlioptas et al. 2000)
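A sketch of a Model E generator, simplified from the description above (the helper name is mine, and the forbidden tuples are drawn here without repetition, whereas Model E proper samples with repetition):

```python
import itertools
import random

def model_e_instance(n, d, p, rng):
    """Draw a proportion p of the d*d*n*(n-1)/2 possible forbidden
    (variable pair, value pair) tuples uniformly at random."""
    nogoods = [((i, j), (a, b))
               for i, j in itertools.combinations(range(n), 2)
               for a in range(d)
               for b in range(d)]
    k = round(p * len(nogoods))
    return rng.sample(nogoods, k)

rng = random.Random(3)
instance = model_e_instance(15, 3, 0.05, rng)
print(len(instance))  # 47 forbidden pairs, out of 945 possible
```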

  31. Typical Case Analysis: Model E • Phase transition phenomenon, discriminating "easy" vs. "hard" instances: the % of solvable instances and the mean computational cost, plotted against constrainedness. (Hogg et al. 96)

  32. Runtime distributions

  33. Towards phase transition

  34. Explaining and Exploiting Fat and Heavy-Tailed Behavior

  35. Formal Models of Heavy and Fat Tails in Combinatorial Search • Heavy/fat tails reflect a wide range of solution times: very short and very long runtimes. • How to explain the short runs? Backdoors: hidden tractable substructure in real-world problems, a subset of the "critical" variables such that, once they are assigned values, the instance simplifies to a tractable class. • Practical consequences.

  36. Logistics Planning – instances with O(log(n)) backdoors. Initial constraint graph of a logistics planning problem formula (843 vars, 7,301 constraints; 16 backdoor variables), after setting 5 backdoor vars, and after setting 12 backdoor vars. (Visualization by Anand Kapur, 4701 project.)

  37. Exploiting Backdoors

  38. Algorithms • Three kinds of strategies for dealing with backdoors: • A complete deterministic backtrack-search algorithm. • A complete randomized backtrack-search algorithm, with provably better performance than the deterministic one. • A heuristically guided complete randomized backtrack-search algorithm, which assumes the existence of a good heuristic for choosing variables to branch on. We believe this is close to what happens in practice. (Williams, Gomes, Selman 03/04)

  39. Deterministic Generalized Iterative Deepening

  40. Generalized Iterative Deepening: all possible trees of depth 1 (branch on x1 = 0/1, or x2 = 0/1, ..., or xn = 0/1).

  41. Generalized Iterative Deepening, level 2: all possible trees of depth 2 (branch on x1 = 0/1, then on x2 = 0/1, and so on for every pair of variables).

  42. Generalized Iterative Deepening, level 2 (continued): trees branching on xn-1 = 0/1 and then xn = 0/1; then level 3, level 4, and so on ...

  43. Randomized Generalized Iterative Deepening • Assumption: there exists a backdoor whose size is bounded by a function of n (call it B(n)). • Idea: repeatedly choose random subsets of variables that are slightly larger than B(n), searching these subsets for the backdoor.
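The idea can be sketched as follows; the sub-solver interface and the toy instance are hypothetical stand-ins for illustration, not the actual Williams-Gomes-Selman algorithm:

```python
import itertools
import random

def randomized_backdoor_search(n_vars, subset_size, subsolver, rng, max_tries=1_000):
    """Repeatedly pick a random variable subset slightly larger than the
    assumed backdoor bound B(n); try every assignment to the subset and
    hand the simplified instance to the polytime sub-solver."""
    for _ in range(max_tries):
        subset = rng.sample(range(n_vars), subset_size)
        for values in itertools.product((0, 1), repeat=subset_size):
            solution = subsolver(dict(zip(subset, values)))
            if solution is not None:
                return solution
    return None

def toy_subsolver(assignment, n_vars=8):
    """Toy stand-in: the instance becomes tractable exactly when the (hidden)
    backdoor variables 0 and 1 are set to 1 and 0; the rest then follows."""
    if assignment.get(0) == 1 and assignment.get(1) == 0:
        full = {v: 0 for v in range(n_vars)}
        full.update(assignment)
        return full
    return None

rng = random.Random(7)
solution = randomized_backdoor_search(8, 3, toy_subsolver, rng)
print(solution is not None)  # True: the 2-variable backdoor is found quickly
```

Each random 3-subset of the 8 variables contains the 2-variable backdoor with probability 3/28, so only a few dozen tries are expected before the sub-solver succeeds.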

  44. Deterministic Versus Randomized. Suppose variables have 2 possible values (e.g. SAT). For B(n) = n/k, the algorithm runtime is c^n; the plot shows the base c as a function of k for the deterministic and randomized strategies. The deterministic algorithm outperforms brute-force search for k > 4.2.

  45. Complete Randomized Depth-First Search with Heuristic • Assume we have the following: DFS, a generic randomized depth-first backtrack-search solver with: • a (polytime) sub-solver A; • a heuristic H that (randomly) chooses variables to branch on, in polynomial time; • H has probability 1/h of choosing a backdoor variable (h is a fixed constant). • Call this ensemble (DFS, H, A).

  46. Polytime Restart Strategy for (DFS, H, A) • Essentially: if there is a small backdoor, then (DFS, H, A) has a restart strategy that runs in polytime.

  47. Runtime Table for Algorithms (DFS, H, A). B(n) = upper bound on the size of a backdoor, given n variables. When the backdoor is a constant fraction of n, there is an exponential improvement of the randomized algorithm over the deterministic one. (Williams, Gomes, Selman 03/04)

  48. How to avoid the long runs in practice? • Use restarts or parallel / interleaved runs to exploit the extreme variance in performance. • Restarts provably eliminate heavy-tailed behavior.
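A restart strategy can be sketched abstractly. Here `solve_once` is a hypothetical solver interface of my own, and the Pareto runtime model is a toy stand-in for a heavy-tailed randomized solver:

```python
import random

def run_with_restarts(solve_once, cutoff, rng, max_restarts=10_000):
    """Run the randomized solver with a fixed backtrack cutoff; on failure,
    restart it with fresh random choices."""
    for restarts in range(max_restarts):
        result = solve_once(rng, cutoff)
        if result is not None:
            return result, restarts
    return None, max_restarts

def toy_solve_once(rng, cutoff):
    """Toy model: the backtracks needed by one run follow a Pareto(alpha=0.5)
    law (infinite mean); the run succeeds only if that stays within the cutoff."""
    backtracks = rng.random() ** -2.0  # Pareto(0.5) sample, always >= 1
    return "solution" if backtracks <= cutoff else None

rng = random.Random(0)
result, restarts = run_with_restarts(toy_solve_once, cutoff=4, rng=rng)
# Each run succeeds with probability 1 - 4**-0.5 = 0.5, so a handful of
# restarts suffice even though the mean runtime without restarts is infinite.
print(result, restarts)
```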

  49. Restarts: unsolved fraction 1 - F(x) vs. number of backtracks (log). With no restarts, 70% of the runs remain unsolved; restarting every 4 backtracks leaves only 0.001% unsolved, solving the instance within about 250 backtracks (62 restarts).

  50. Example of Rapid Restart Speedup (planning): number of backtracks (log) vs. cutoff (log). A small cutoff of about 20 (~100 restarts) solves the instance in roughly 2,000 backtracks, while larger cutoffs (~10 restarts) require around 100,000.
