
Why Minimax Works: An Alternative Explanation

Why Minimax Works: An Alternative Explanation. Mitja Luštrek 1 , Ivan Bratko 2 and Matjaž Gams 1 1 Jožef Stefan Institute, Department of Intelligent Systems 2 University of Ljubljana, Faculty of Computer and Information Science. Plan of talk. Game tree search and minimax pathology





Presentation Transcript


  1. Why Minimax Works: An Alternative Explanation Mitja Luštrek 1, Ivan Bratko 2 and Matjaž Gams 1 1 Jožef Stefan Institute, Department of Intelligent Systems 2 University of Ljubljana, Faculty of Computer and Information Science

  2. Plan of talk • Game tree search and minimax pathology • A Real-Valued Minimax Model • Explanation

  3. Search in game trees Us to move Them to move

  4. Search in game trees v = max( v1, v2) = true value + error Back up v1 v2 Evaluate heuristically with some error
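
The backup step on this slide can be sketched in a few lines. This is only an illustration, not the authors' code: the function name `backed_up_value`, the Gaussian noise, and the example values 0.32 and 0.41 (borrowed from the later two-value/real-value slide) are assumptions made here.

```python
import random

def backed_up_value(true_children, sigma, rng):
    """One-ply backup for the side to move (a max node).

    Each child is evaluated heuristically as true value + Gaussian noise
    (an assumption of this sketch), and the maximum is backed up.
    """
    noisy = [v + rng.gauss(0.0, sigma) for v in true_children]
    return max(noisy)

rng = random.Random(0)
v1, v2 = 0.32, 0.41                      # illustrative true values of the children
true_root = max(v1, v2)                  # true value of the parent
heuristic_root = backed_up_value([v1, v2], sigma=0.1, rng=rng)
error = heuristic_root - true_root       # so that v = true value + error
print(true_root, heuristic_root, error)
```

With `sigma=0.0` the backed-up value equals the true value exactly, which is a quick sanity check of the backup rule.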

  5. Searching deeper Root value now more trustworthy? Back up Back up Back up ........ Back up ..... Evaluate heuristically here

  6. The minimax pathology • Conventional wisdom in the practice of game playing: the deeper a program searches, the better it plays. • Early mathematical analyses of minimax [Nau 1979; Beal 1980] produced some surprising results • According to the theoretical model, minimax SHOULD NOT work! • Minimaxing amplifies the error of the heuristic evaluation function • The deeper a game-playing program searches, the worse it plays! • Nau: “Performing worse by working harder!” • This is called the minimax pathology

  7. Problems with these theoretical analyses • General impression: Something must have been wrong with these analyses! • Were the assumptions made by these mathematical models realistic?

  8. Beal’s early assumptions • Uniform branching factor • Position values are binary: loss or win • Proportion of wins for side to move is constant throughout game tree • Position values within a level are independent of each other • Static heuristic evaluation error is independent of the depth of node Note: None of these looks very unrealistic

  9. Modifying Beal’s assumptions • Subsequent analyses by various authors: modify the assumptions so that the pathology disappears

  10. Successful attempts to explain the pathology • Pathology disappeared when assuming: • Positions close to each other have similar values [Bratko and Gams 1982; Beal 1982; Nau 1982; Scheucher and Kaindl 1998; Luštrek 2004] • Early terminations [Pearl 1983] • Geometrically distributed branching factor [Michon 1983] • Which explanation is most natural? • Which conditions for the absence of the pathology are really necessary?

  11. This paper: an alternative explanation • Is there a more fundamental explanation, one that makes “the least assumption”? Is there something fundamental about the minimax relation that makes minimaxing successful in practice? • This paper: Yes, there is! • It has to do with Beal’s assumption 5 • This paper: estimate positions by real values (as game playing programs do). Surprisingly, then Beal’s assumption 5 is not tenable!

  12. Two-value and real-value errors • Two-value error loss win • Real-value error 0.32 0.41

  13. Beal’s assumption 5: P(two-value heuristic error) constant with level • Our assumption: real-value heuristic error distribution constant with level [Figure labels: P(binary error), real-value error noise, depth]

  14. Summary of this paper’s findings • Our assumption: real-value error distribution at bottom level of search is constant throughout game tree • Beal’s assumption 5: two-value error distribution at bottom level of search is constant throughout game tree • These assumptions look equivalent, BUT surprisingly they are NOT! • These assumptions are in fact INCOMPATIBLE: minimax relation between true values of positions in game tree does not permit both two-value error and real-value error to be constant

  15. Summary of this paper’s findings • When real-value heuristic error is constant, backed-up heuristic values become more reliable with increased depth of search - i.e. no pathology! • Corresponding backed-up binary values also become more reliable with depth (binary values obtained from real values through thresholding)

  16. MITJA TO CONTINUE

  17. Game tree search and minimax pathology • A Real-Valued Minimax Model • Explanation

  18. Why multiple/real values? • Necessary in games where the final outcome is multivalued (Othello, tarok). • Used by humans and game-playing programs. • Seem unnecessary in games where the outcome is a loss, a win or perhaps a draw (chess, checkers). • But: • in a losing position against a fallible and unknown opponent, the outcome is uncertain; • in a winning position, a perfect two-valued evaluation function will not lose, but it may never win, either. • Multiple values are required to model uncertainty and to maintain a direction of play towards an eventual win.

  19. A real-valued minimax model Aims to be a real-valued version of Beal’s model. • Uniform branching factor; • position values are real numbers; • if the real values are converted to losses and wins, the proportion of losses for the side to move is constant throughout game tree; • position values within a level are independent of each other; • static heuristic evaluation error is independent of the depth of node (the error is normally distributed noise).

  20. Building of a game tree

  21. Building of a game tree True values distributed uniformly in [0, 1]

  22. Building of a game tree True values backed up

  23. Building of a game tree True values backed up

  24. Building of a game tree True values backed up

  25. Building of a game tree True values backed up

  26. Building of a game tree Search to this depth

  27. Building of a game tree Heuristic values = true values + normally distributed noise

  28. Building of a game tree Heuristic values backed up

  29. Building of a game tree Heuristic values backed up

  30. Building of a game tree Heuristic values backed up
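
The tree-building sequence above can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the level-by-level list representation, the function names (`back_up`, `build_true_tree`, `search`), and the choice of a max node at the root are all assumptions made here.

```python
import random

def back_up(values, level, b):
    """Back up the values of `level` to `level - 1` with branching factor b.

    Assumption of this sketch: the root (level 0) is a max node and
    max/min levels alternate below it.
    """
    choose = max if (level - 1) % 2 == 0 else min
    return [choose(values[i:i + b]) for i in range(0, len(values), b)]

def build_true_tree(depth, b, rng):
    """True values: uniform in [0, 1] at the leaves, backed up to the root."""
    levels = [None] * (depth + 1)
    levels[depth] = [rng.random() for _ in range(b ** depth)]
    for k in range(depth, 0, -1):
        levels[k - 1] = back_up(levels[k], k, b)
    return levels

def search(levels, d, b, sigma, rng):
    """Search to depth d: heuristic values = true values + normal noise,
    then back the heuristic values up to the root."""
    values = [v + rng.gauss(0.0, sigma) for v in levels[d]]
    for k in range(d, 0, -1):
        values = back_up(values, k, b)
    return values[0]

rng = random.Random(1)
levels = build_true_tree(depth=4, b=2, rng=rng)
print("true root value:     ", levels[0][0])
print("heuristic, depth 3:  ", search(levels, 3, 2, sigma=0.1, rng=rng))
```

Storing every level of true values makes it possible to search the same tree to several depths and compare the resulting root values.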

  31. What we do with our model • Monte Carlo experiments: • generate 10,000 sets of true values; • generate 10 sets of heuristic values per set of true values per depth of search. • Measure the error at the root: • real-value error = the average difference between the true value and the heuristic value; • two-value error = the frequency of mistaking a loss for a win or vice versa. • Compare the error at the root when searching to different depths.
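
A scaled-down version of such an experiment might look like the sketch below. Several things are assumptions of this sketch rather than details from the talk: the real-value error is taken as the mean absolute difference, the threshold 0.5 is a placeholder for the model's actual threshold, the root is a max node, and the number of trees is far smaller than the 10,000 used by the authors.

```python
import random

def back_up(values, level, b):
    """Back up values from `level` to `level - 1`; root (level 0) is a max node."""
    choose = max if (level - 1) % 2 == 0 else min
    return [choose(values[i:i + b]) for i in range(0, len(values), b)]

def root_value(values, level, b):
    """Back values at `level` all the way up to the root."""
    while level > 0:
        values = back_up(values, level, b)
        level -= 1
    return values[0]

def true_levels(depth, b, rng):
    """True values per level: uniform leaves, backed up to the root."""
    levels = {depth: [rng.random() for _ in range(b ** depth)]}
    for k in range(depth, 0, -1):
        levels[k - 1] = back_up(levels[k], k, b)
    return levels

def root_errors(depth, b, sigma, threshold, trees, repeats, seed=0):
    """Return {search depth: (real-value error, two-value error)} at the root."""
    rng = random.Random(seed)
    abs_err = {d: 0.0 for d in range(depth + 1)}
    flips = {d: 0 for d in range(depth + 1)}
    for _ in range(trees):
        levels = true_levels(depth, b, rng)
        true_root = levels[0][0]
        for d in range(depth + 1):
            for _ in range(repeats):
                noisy = [v + rng.gauss(0.0, sigma) for v in levels[d]]
                h = root_value(noisy, d, b)
                abs_err[d] += abs(h - true_root)
                # two-value error: loss mistaken for win or vice versa
                flips[d] += (h > threshold) != (true_root > threshold)
    n = trees * repeats
    return {d: (abs_err[d] / n, flips[d] / n) for d in range(depth + 1)}

errors = root_errors(depth=5, b=2, sigma=0.1, threshold=0.5, trees=200, repeats=10)
for d, (real_err, two_err) in errors.items():
    print(d, round(real_err, 4), round(two_err, 4))
```

Comparing the printed rows across search depths corresponds to the last step on the slide; the numbers from this toy setup should not be read as a reproduction of the authors' graphs.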

  32. Conversion of real values to losses and wins (1) • To measure two-value error, real values must be converted to losses and wins. • Value above a threshold means win, below the threshold loss. • At the leaves: • the proportion of losses for the side to move = c_b (because it must be the same at all levels); • real values distributed uniformly in [0, 1]; • therefore threshold = c_b. • At higher levels: • minimaxing on real values is equivalent to minimaxing on two values; • therefore also threshold = c_b.
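
The claim that minimaxing on real values is equivalent to minimaxing on two values rests on the fact that thresholding commutes with max and min: a maximum exceeds the threshold exactly when some child does, and a minimum exceeds it exactly when every child does. The check below is a small sketch of that fact; the helper names and the threshold value 0.4 are placeholders chosen here, and values are read from one fixed player's point of view.

```python
import random

def is_win(value, threshold):
    """Two-value conversion: above the threshold is a win, otherwise a loss."""
    return value > threshold

def thresholding_commutes(children, threshold):
    """True if thresholding then backing up equals backing up then thresholding."""
    wins = [is_win(v, threshold) for v in children]
    max_ok = is_win(max(children), threshold) == any(wins)   # max node: OR
    min_ok = is_win(min(children), threshold) == all(wins)   # min node: AND
    return max_ok and min_ok

rng = random.Random(0)
threshold = 0.4   # placeholder value, not the model's actual threshold
checks = [thresholding_commutes([rng.random() for _ in range(3)], threshold)
          for _ in range(1000)]
print(all(checks))
```

Because the equivalence holds at every node, the same threshold can be applied at any level of the tree.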

  33. Conversion of real values to losses and wins (2) Real values Two values

  34. Conversion of real values to losses and wins (2) Real values Two values Minimaxing

  35. Conversion of real values to losses and wins (2) Real values Two values Minimaxing

  36. Conversion of real values to losses and wins (2) Real values Two values Apply threshold

  37. Conversion of real values to losses and wins (2) Real values Two values

  38. Conversion of real values to losses and wins (2) Real values Two values

  39. Conversion of real values to losses and wins (2) Real values Two values Apply threshold

  40. Conversion of real values to losses and wins (2) Real values Two values Minimaxing

  41. Conversion of real values to losses and wins (2) Real values Two values Minimaxing

  42. Conversion of real values to losses and wins (2) Real values Two values

  43. Error at the root / constant real-value error • Plotted: real-value and two-value error at the root. • Real-value error at the lowest level of search: normally distributed noise with standard deviation 0.1.

  44. Error at the bottom / constant real-value error • Plotted: two-value error at the lowest level of search. • Real-value error at the lowest level of search: normally distributed noise with standard deviation 0.1.

  45. Error at the bottom / constant two-value error • Plotted: real-value error at the lowest level of search. • Two-value error at the lowest level of search: 0.1.

  46. Error at the root / constant two-value error • Plotted: two-value error at the root in our real-value model and in Beal’s model. • Two-value error at the lowest level of search: 0.1. • After a small tweak of Beal’s model, we get a perfect match.

  47. Conclusions from the graphs • Real-value error at the lowest level of search is constant: • two-value error at the lowest level of search decreases with the depth of search; • no pathology. • Two-value error at the lowest level of search is constant: • real-value error at the lowest level of search increases with the depth of search; • pathology. • What is right?

  48. Game tree search and minimax pathology • A Real-Valued Minimax Model • Explanation

  49. Should real or two values be constant? (1) • Already explained why real values are necessary. • Real-value error most naturally represents the fallibility of the heuristic evaluation function. • Game playing programs do not use two-valued evaluation functions, but if they did: • they would more often make a mistake in uncertain positions close to the threshold; • they would rarely make a mistake in certain positions far from the threshold.

  50. Should real or two values be constant? (2)
