1 / 78

Game-Playing & Adversarial Search Alpha-Beta Pruning, etc.

This lecture topic: Game-Playing & Adversarial Search (Alpha-Beta Pruning, etc.) Read Chapter 5.3-5.5 Next lecture topic: Constraint Satisfaction Problems (two lectures) Read Chapter 6.1-6.4, except 6.3.3. Game-Playing & Adversarial Search Alpha-Beta Pruning, etc.

Download Presentation

Game-Playing & Adversarial Search Alpha-Beta Pruning, etc.

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. This lecture topic: Game-Playing & Adversarial Search (Alpha-Beta Pruning, etc.) Read Chapter 5.3-5.5 Next lecture topic: Constraint Satisfaction Problems (two lectures) Read Chapter 6.1-6.4, except 6.3.3 Game-Playing & Adversarial SearchAlpha-Beta Pruning, etc. • (Please read lecture topic material before and after each lecture on that topic)

  2. You Will Be Expected to Know • Alpha-beta pruning (5.3) • Expectiminimax (5.5)

  3. Review of Previous Lecture • Basic definitions (section 5.1) • Minimax optimal game search (5.2) • Evaluation functions (5.4.1) • Cutting off search (5.4.2)

  4. Alpha-Beta PruningExploiting the Fact of an Adversary • If a position is provably bad: • It is NO USE searching to find out exactly how bad • If the adversary can force a bad position: • It is NO USE searching to find the good positions the adversary won’t let you achieve anyway • Bad = not better than we can get elsewhere.

  5. Tic-Tac-Toe Example with Alpha-Beta Pruning Do these nodes matter? If they = +1 million? If they = −1 million?

  6. Extended Alpha-Beta Example“Possible Interval” View: (min, max) Initially, interval is unknown (−,+): Range of possible values (−∞,+∞)

  7. Extended Alpha-Beta Example“Possible Interval” View Do DF-search until first leaf: Range of possible values (−∞,+∞) Child inherits current interval (−∞, +∞)

  8. Extended Alpha-Beta Example“Possible Interval” View See first leaf, update possible interval: Range of possible values (−∞,+∞) (−∞,3]

  9. Extended Alpha-Beta Example“Possible Interval” View See remaining leaves, and outcome is known: (−∞,+∞) [3,3]

  10. Extended Alpha-Beta Example“Possible Interval” View Pass outcome to caller, and update caller: Range of possible values [3,+∞) 3 [3,3]

  11. Extended Alpha-Beta Example“Possible Interval” View Do DF-search until first leaf: [3,+∞) Child inherits current interval [3,+∞) [3,3]

  12. Extended Alpha-Beta Example“Possible Interval” View See first leaf, update possible interval: [3,+∞) No remaining possibilities [3,2] [3,3] This node is worse for MAX

  13. Extended Alpha-Beta Example“Possible Interval” View Prune!! Play will never reach the other nodes: [3,+∞) Do these nodes matter? If they = +1 million? If they = −1 million? [3,2] [3,3] We would prune if we had seen any value  3 Prune!!

  14. Extended Alpha-Beta Example“Possible Interval” View Pass outcome to caller, and update caller: [3,+∞) MAX level, 3 ≥ 2, no change 2 [3,2] [3,3]

  15. General alpha-beta pruning • Consider node n in the tree --- • If player has a better choice at: • Parent node of n • Or any choice point further up • Then n is never reached in play. • So: • When that much is known about n, it can be pruned.

  16. Alpha-beta Algorithm • Depth first search • only considers nodes along a single path from root at any time a = highest-value choice found at any choice point of path for MAX (initially, a = −infinity) b = lowest-value choice found at any choice point of path for MIN (initially,  = +infinity) • Pass current values of a and b down to child nodes during search • Update values of a and b during search: • MAX updates  at MAX nodes • MIN updates  at MIN nodes • Prune remaining branches at a node when a ≥ b

  17. When to Prune? • Prune whenever  ≥ . • Prune below a Max node whose alpha value becomes ≥ to the beta value of its ancestors. • Max nodes update alpha based on children’s returned values. • Prune below a Min node whose beta value becomes  the alpha value of its ancestors. • Min nodes update beta based on children’s returned values.

  18. Pseudocode for Alpha-Beta Algorithm function ALPHA-BETA-SEARCH(state) returns an action inputs: state, current state in game vMAX-VALUE(state, - ∞ , +∞) return the action in SUCCESSORS(state) with value v

  19. Pseudocode for Alpha-Beta Algorithm function ALPHA-BETA-SEARCH(state) returns an action inputs: state, current state in game vMAX-VALUE(state, - ∞ , +∞) return the action in ACTIONS(state) with value v function MAX-VALUE(state, , ) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v  - ∞ for a in ACTIONS(state) do v MAX(v, MIN-VALUE(Result(s,a),  , )) ifv ≥ then returnv  MAX( ,v) return v (MIN-VALUE is defined analogously)

  20. Extended Alpha-Beta Example“Alpha-Beta” View: (α,β) Initially, possibilities are unknown (α=−,β=+): =−  =+ , , initial values

  21. Extended Alpha-Beta Example“Alpha-Beta” View Do DF-search until first leaf: =−  =+ =−  =+ Child inherits current  and

  22. Extended Alpha-Beta Example“Alpha-Beta” View See first leaf, MIN level, MIN updates β: =−  =+ =−  =3 MIN updates  at MIN nodes • <  so no pruning

  23. Extended Alpha-Beta Example“Alpha-Beta” View See remaining leaves, and outcome is known: =−  =+ =−  =3

  24. Extended Alpha-Beta Example“Alpha-Beta” View Pass outcome to caller, MAX updates α: =3  =+ MAX updates α at MAX nodes 3

  25. Extended Alpha-Beta Example“Alpha-Beta” View Do DF-search until first leaf: =3  =+ Child inherits current (α,β) =3  =+

  26. Extended Alpha-Beta Example“Alpha-Beta” View See first leaf, MIN level, MIN updates β : αβ !! What does this mean? =3  =+  =3  =2 This node is worse for MAX

  27. Extended Alpha-Beta Example“Alpha-Beta” View Prune!! Play will never reach the other nodes: αβ !! Prune!! =3  =+ Do these nodes matter? If they = +1 million? If they = −1 million?  =3  =2 Prune!!

  28. Extended Alpha-Beta Example“Alpha-Beta” View Pass outcome to caller, and update caller: MAX level, 3 ≥ 2, no change =3  =+ 2

  29. Effectiveness of Alpha-Beta Search • Worst-Case • branches are ordered so that no pruning takes place. In this case alpha-beta gives no improvement over exhaustive search • Best-Case • each player’s best move is the left-most child (i.e., evaluated first) • in practice, performance is closer to best rather than worst-case • E.g., sort moves by the remembered move values found last time. • E.g., expand captures first, then threats, then forward moves, etc. • E.g., run Iterative Deepening search, sort by value last iteration. • In practice often get O(b(d/2)) rather than O(bd) • this is the same as having a branching factor of sqrt(b), • (sqrt(b))d = b(d/2),i.e., we effectively go from b to square root of b • e.g., in chess go from b ~ 35 to b ~ 6 • this permits much deeper search in the same amount of time

  30. Iterative (Progressive) Deepening • In real games, there is usually a time limit T on making a move • How do we take this into account? • In practice, Iterative Deepening Search (IDS) is used • IDS runs depth-first search with an increasing depth-limit • when the clock runs out we use the solution found at the previous depth limit • Added benefit with Alpha-Beta Pruning: • Remember node values found at the previous depth limit • Sort current nodes so each player’s best move is lleft-most child • Likely to yield good Alpha-Beta Pruning => better, faster search • Only a heuristic: node values will change with the deeper search • Usually works well in practice

  31. Final Comments about Alpha-Beta Pruning • Pruning does not affect final results • Entire subtrees can be pruned. • Good move ordering improves pruning • Order nodes so player’s best moves are checked first • Repeated states are still possible. • Store them in memory = transposition table

  32. Example Max -which nodes can be pruned? Min Max 6 5 3 4 1 2 7 8

  33. Example Max -which nodes can be pruned? Min Max 6 5 3 4 1 2 7 8 Answer: NONE! Because the most favorable nodes for both are explored last (i.e., in the diagram, are on the right-hand side).

  34. Second Example(the exact mirror image of the first example) Max -which nodes can be pruned? Min Max 4 3 6 5 8 7 2 1

  35. Second Example(the exact mirror image of the first example) Max -which nodes can be pruned? Min Max 4 3 6 5 8 7 2 1 Answer: LOTS! Because the most favorable nodes for both are explored first (i.e., in the diagram, are on the left-hand side).

  36. MAX MIN A B C D 1 6 3 6 6 8 6 6 3 1 2 3 1 9 4 5 9 9 4 6 E F G H I J K MAX 5 4 5 Long Detailed Alpha-Beta ExampleBranch nodes are labeled A-K for easy discussion =− =+ , , initial values

  37. A B C D 6 8 6 3 1 6 2 3 5 6 6 1 1 9 3 4 9 9 4 6 E F G H I J K 5 4 5 Long Detailed Alpha-Beta ExampleNote that search cut-off occurs at different depths =− =+ current , , passed to kids MAX =− =+ kid=A MIN =− =+ kid=E MAX

  38. A B C D 6 6 1 1 3 6 8 9 6 6 2 3 1 5 4 3 9 9 4 6 E F G H I J K 4 5 4 5 Long Detailed Alpha-Beta Example =− =+ see first leaf, MAXupdates  MAX =− =+ kid=A MIN =4 =+ kid=E MAX 4 We also are running MiniMax search and recording node values within the triangles, without explicit comment.

  39. A B C D 6 1 1 3 6 8 9 6 6 6 2 3 1 4 5 3 9 9 4 6 E F G H I J K 5 5 4 5 Long Detailed Alpha-Beta Example =−  =+ see next leaf, MAXupdates  MAX =− =+ kid=A MIN =5 =+ kid=E MAX 5

  40. A B C D 6 1 1 3 6 8 9 6 6 6 2 3 1 4 5 3 9 9 4 6 E F G H I J K 6 5 4 5 Long Detailed Alpha-Beta Example =−  =+ see next leaf, MAXupdates  MAX =− =+ kid=A MIN =6 =+ kid=E MAX 6

  41. A B C D 6 1 1 3 6 8 9 6 6 6 2 3 1 4 5 3 9 9 4 6 6 E F G H I J K 6 5 4 5 Long Detailed Alpha-Beta Example =−  =+ return node value, MIN updates  MAX =− =6 kid=A MIN 6 MAX

  42. A B C D 6 1 1 3 6 8 9 6 6 6 2 3 1 4 5 3 9 9 4 6 6 E F G H I J K 6 5 4 5 Long Detailed Alpha-Beta Example =−  =+ current , , passed to kid F MAX =− =6 kid=A MIN =− =6 kid=F MAX

  43. A B C D 6 6 1 1 3 6 8 9 6 6 2 3 1 5 4 3 9 9 4 6 6 E F G H I J K 6 5 4 5 Long Detailed Alpha-Beta Example =−  =+ see first leaf, MAXupdates  MAX =− =6 kid=A MIN =6 =6 kid=F 6 MAX 6

  44. A B C D 6 6 1 1 3 6 8 9 3 6 6 2 3 4 5 1 9 9 4 6 6 E F G H I J K 6 5 4 5 Long Detailed Alpha-Beta Example αβ !! Prune!! =−  =+ MAX =− =6 kid=A MIN =6 =6 kid=F 6 MAX X X

  45. A B C D 6 6 1 1 3 6 8 9 3 6 6 2 3 4 5 1 9 9 4 6 6 E F G H I J K 6 5 4 5 Long Detailed Alpha-Beta Example =−  =+ return node value, MIN updates , no change to  MAX =− =6 kid=A MIN 6 6 MAX X X If we had continued searching at node F, we would see the 9 from its third leaf. Our returned value would be 9 instead of 6. But at A, MIN would choose E(=6) instead of F(=9). Internal values may change; root values do not.

  46. A B C D 6 6 1 1 3 6 8 9 3 6 6 2 3 4 5 1 9 9 4 6 6 E F G H I J K 6 5 4 5 Long Detailed Alpha-Beta Example =−  =+ see next leaf, MIN updates , no change to  MAX =− =6 kid=A MIN 9 6 MAX X X

  47. 6 A B C D 6 6 1 1 3 6 8 6 3 6 2 3 4 5 9 1 9 9 4 6 6 E F G H I J K 6 5 4 5 Long Detailed Alpha-Beta Example =6  =+ return node value, MAX updates α MAX 6 MIN 6 MAX X X

  48. 6 A B C D 6 6 1 1 3 6 8 6 3 6 2 3 4 5 9 1 9 9 4 6 6 E F G H I J K 6 5 4 5 Long Detailed Alpha-Beta Example =6  =+ current , , passed to kids MAX =6 =+ kid=B MIN =6 =+ kid=G 6 MAX X X

  49. 6 A B C D 6 6 1 1 3 6 8 9 6 6 2 1 4 5 3 3 9 9 4 6 6 E F G H I J K 6 5 5 4 5 Long Detailed Alpha-Beta Example =6  =+ see first leaf, MAXupdates , no change to  MAX =6 =+ kid=B MIN =6 =+ kid=G 6 MAX 5 X X

  50. 6 A B C D 6 6 1 1 3 6 8 9 6 6 2 1 4 5 3 3 9 9 4 6 6 E F G H I J K 6 5 5 4 5 Long Detailed Alpha-Beta Example =6  =+ see next leaf, MAXupdates , no change to  MAX =6 =+ kid=B MIN =6 =+ kid=G 6 MAX 4 X X

More Related