
Alpha-Beta Example


Presentation Transcript


  1. Alpha-Beta Example • Do depth-first search until the first leaf • Range of possible values: root [-∞,+∞]; first MIN node [-∞,+∞]

  2. Alpha-Beta Example (continued) • Root [-∞,+∞]; first MIN node [-∞,3]

  3. Alpha-Beta Example (continued) • Root [-∞,+∞]; first MIN node [-∞,3]

  4. Alpha-Beta Example (continued) • Root [3,+∞]; first MIN node [3,3]

  5. Alpha-Beta Example (continued) • Root [3,+∞]; first MIN node [3,3]; second MIN node [-∞,2] (this node is worse for MAX)

  6. Alpha-Beta Example (continued) • Root [3,14]; MIN nodes [3,3], [-∞,2], [-∞,14]

  7. Alpha-Beta Example (continued) • Root [3,5]; MIN nodes [3,3], [-∞,2], [-∞,5]

  8. Alpha-Beta Example (continued) • Root [3,3]; MIN nodes [3,3], [-∞,2], [2,2]

  9. Alpha-Beta Example (continued) • Root [3,3]; MIN nodes [3,3], [-∞,2], [2,2]

  10. Comments about Alpha-Beta Pruning • Pruning does not affect the final result • Entire subtrees can be pruned • With good move ordering, alpha-beta pruning can look about twice as far ahead as minimax in the same amount of time
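The interval updates in the example slides can be reproduced with a short sketch. The tree below is an assumption: a MAX root over three MIN nodes, with leaves chosen to be consistent with the intervals shown (the leaves 12, 8, 4 and 6 do not appear in the transcript). The function name `alphabeta` and the `visited` list are illustrative only.

```python
import math

def alphabeta(node, alpha, beta, maximizing, visited):
    """Return the minimax value of `node`, pruning with the (alpha, beta) window."""
    if not isinstance(node, list):        # a leaf holds its utility value
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            if value >= beta:             # MIN above would never allow this line
                break
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        if value <= alpha:                # MAX above already has a better option
            break
        beta = min(beta, value)
    return value

# Assumed tree: MAX root, three MIN children, three leaves each.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
root_value = alphabeta(tree, -math.inf, math.inf, True, visited)
print(root_value)  # 3
print(visited)     # [3, 12, 8, 2, 14, 5, 2]
```

The leaves 4 and 6 are never evaluated: once the second MIN node is known to be at most 2, it cannot beat the 3 that MAX already has.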

  11. Heuristic Evaluation Function (EVAL) • Idea: produce an estimate of the expected utility of the game from a given position. • Performance depends on quality of EVAL. • Must be able to differentiate between good and bad board states • Exact values not important

  12. Heuristic Evaluation Function (EVAL) • Must be consistent with the utility function • values for terminal nodes (or at least their order) must be the same • should reflect the actual chances of winning • Frequently weighted linear functions are used • E = w1 f1 + w2 f2 + … + wn fn • combination of features, weighted by their relevance • Example in chess • Weights: pawn = 1, knight = bishop = 3, rook = 5, queen = 9

  13. Example Chess Score • Black has 5 pawns, 1 bishop, 2 rooks: score = 1*(5) + 3*(1) + 5*(2) = 5 + 3 + 10 = 18 • White has 5 pawns, 1 rook: score = 1*(5) + 5*(1) = 5 + 5 = 10 • Overall scores for this board state: black = 18 - 10 = 8, white = 10 - 18 = -8
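The weighted linear form E = w1 f1 + … + wn fn can be sketched for this material count, with each feature being the number of pieces of one type. The names `WEIGHTS` and `material` are illustrative, not part of the slides.

```python
# Piece weights from the slides: pawn=1, knight=bishop=3, rook=5, queen=9.
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material(pieces):
    """Weighted linear sum: one feature (piece count) per piece type."""
    return sum(WEIGHTS[name] * count for name, count in pieces.items())

black = {"pawn": 5, "bishop": 1, "rook": 2}
white = {"pawn": 5, "rook": 1}

black_score = material(black)
white_score = material(white)
print(black_score, white_score)                              # 18 10
print(black_score - white_score, white_score - black_score)  # 8 -8
```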

  14. Example: Tic-Tac-Toe • simple evaluation function E(s) = (rx + cx + dx) - (ro + co + do), where r, c, d are the numbers of row, column and diagonal lines still available; x and o are the pieces of the two players • 1-ply lookahead • start at the top of the tree • evaluate all 9 choices for player 1 • pick the maximum E-value • 2-ply lookahead • also looks at the opponent's possible moves • assuming that the opponent picks the minimum E-value

  15. Tic-Tac-Toe 1-Ply • E(s0) = Max{E(s11), …, E(s1n)} = Max{2,3,4} = 4 • E(s11) = 8 - 5 = 3 • E(s12) = 8 - 6 = 2 • E(s13) = 8 - 5 = 3 • E(s14) = 8 - 6 = 2 • E(s15) = 8 - 4 = 4 • E(s16) = 8 - 6 = 2 • E(s17) = 8 - 5 = 3 • E(s18) = 8 - 6 = 2 • E(s19) = 8 - 5 = 3
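A minimal sketch of the 1-ply lookahead, assuming that a line counts as "still available" to a player when it contains none of the opponent's pieces. The board is a 9-character string indexed 0 to 8 row by row; that encoding and the function names are assumptions made for illustration.

```python
# The 8 winning lines: 3 rows, 3 columns, 2 diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def open_lines(board, player):
    """Count lines that contain no piece of the opponent."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))

def evaluate(board):
    """E(s) = lines open for X minus lines open for O."""
    return open_lines(board, "X") - open_lines(board, "O")

def one_ply(board):
    """Evaluate every X move and return the best square with all scores."""
    scores = {}
    for i, cell in enumerate(board):
        if cell == " ":
            scores[i] = evaluate(board[:i] + "X" + board[i + 1:])
    return max(scores, key=scores.get), scores

best_move, scores = one_ply(" " * 9)
print(best_move)  # 4 (the centre square)
print(scores)     # {0: 3, 1: 2, 2: 3, 3: 2, 4: 4, 5: 2, 6: 3, 7: 2, 8: 3}
```

The scores match the slide: corners give 8 - 5 = 3, edges give 8 - 6 = 2, and the centre gives 8 - 4 = 4.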

  16. Tic-Tac-Toe 2-Ply • E(s0) = Max{E(s11), …, E(s1n)} = Max{2,3,4} = 4 • First ply: E(s1:1) = 8 - 5 = 3, E(s1:2) = 8 - 6 = 2, E(s1:3) = 8 - 5 = 3, E(s1:4) = 8 - 6 = 2, E(s1:5) = 8 - 4 = 4, E(s1:6) = 8 - 6 = 2, E(s1:7) = 8 - 5 = 3, E(s1:8) = 8 - 6 = 2, E(s1:9) = 8 - 5 = 3 • Second ply: E(s2:41) = 5 - 4 = 1, E(s2:42) = 6 - 4 = 2, E(s2:43) = 5 - 4 = 1, E(s2:44) = 6 - 4 = 2, E(s2:45) = 6 - 4 = 2, E(s2:46) = 5 - 4 = 1, E(s2:47) = 6 - 4 = 2, E(s2:48) = 5 - 4 = 1 • Second ply: E(s2:9) = 5 - 6 = -1, E(s2:10) = 5 - 6 = -1, E(s2:11) = 5 - 6 = -1, E(s2:12) = 4 - 6 = -2, E(s2:13) = 6 - 6 = 0, E(s2:14) = 5 - 6 = -1, E(s2:15) = 6 - 6 = 0, E(s2:16) = 5 - 6 = -1 • Second ply: E(s21) = 6 - 5 = 1, E(s22) = 5 - 5 = 0, E(s23) = 6 - 5 = 1, E(s24) = 4 - 5 = -1, E(s25) = 6 - 5 = 1, E(s26) = 5 - 5 = 0, E(s27) = 6 - 5 = 1, E(s28) = 5 - 5 = 0
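The 2-ply lookahead can be sketched the same way: for each X move, assume O replies with the move that minimises E, then pick the X move whose backed-up value is largest. As before, the string board encoding, the helper names, and the reading of "available" as "contains no opposing piece" are assumptions.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def evaluate(board):
    """E(s) = lines with no O on them minus lines with no X on them."""
    x_open = sum(1 for line in LINES if all(board[i] != "O" for i in line))
    o_open = sum(1 for line in LINES if all(board[i] != "X" for i in line))
    return x_open - o_open

def place(board, index, piece):
    return board[:index] + piece + board[index + 1:]

def empty_squares(board):
    return [i for i, cell in enumerate(board) if cell == " "]

def two_ply(board):
    """Back up the minimum over O's replies, then maximise over X's moves."""
    backed_up = {}
    for i in empty_squares(board):
        after_x = place(board, i, "X")
        backed_up[i] = min(evaluate(place(after_x, j, "O"))
                           for j in empty_squares(after_x))
    return max(backed_up, key=backed_up.get), backed_up

best_move, backed_up = two_ply(" " * 9)
print(best_move)   # 4
print(backed_up)   # {0: -1, 1: -2, 2: -1, 3: -2, 4: 1, 5: -2, 6: -1, 7: -2, 8: -1}
```

Under these assumptions the centre is still the best opening move, but its backed-up value drops from 4 to 1 once O's best reply is taken into account.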

  17. Checkers Case Study • initial board configuration • Black: single on 20, single on 21, king on 31 • Red: single on 23, king on 22 • evaluation function E(s) = (5 x1 + x2) - (5 r1 + r2), where x1 = black king advantage, x2 = black single advantage, r1 = red king advantage, r2 = red single advantage [Figure: checkerboard with playable squares numbered 1 to 32]
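A small sketch of this evaluation function. It assumes that each "advantage" term is simply the number of pieces of that kind on the board; the slides do not define the terms further, so this reading is an assumption, as are the function and parameter names.

```python
def evaluate(black_kings, black_singles, red_kings, red_singles):
    """E(s) = (5*x1 + x2) - (5*r1 + r2), with each term read as a piece count."""
    return (5 * black_kings + black_singles) - (5 * red_kings + red_singles)

# Initial configuration: Black has 1 king and 2 singles, Red has 1 king and 1 single.
initial = evaluate(black_kings=1, black_singles=2, red_kings=1, red_singles=1)
print(initial)  # 1

# If Black captured the red king, the value would rise to 6.
after_capture = evaluate(black_kings=1, black_singles=2, red_kings=0, red_singles=1)
print(after_capture)  # 6
```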

  18. Checkers MiniMax Example [Figure: numbered checkerboard and a three-level game tree (MAX, MIN, MAX); moves shown at the top MAX level are 31 -> 27, 20 -> 16, 21 -> 17, 31 -> 26; node values range from -8 to 6]

  19. Checkers Alpha-Beta Example • α = 1 • β = 6 [Figure: same board and game tree]

  20. Checkers Alpha-Beta Example • α = 1 • β = 1 [Figure: same board and game tree]

  21. Checkers Alpha-Beta Example • α = 1 • β = 1 • β-cutoff: no need to examine further branches [Figure: same board and game tree]

  22. Checkers Alpha-Beta Example • α = 1 • β = 1 [Figure: same board and game tree]

  23. Checkers Alpha-Beta Example • α = 1 • β = 1 • β-cutoff: no need to examine further branches [Figure: same board and game tree]

  24. Checkers Alpha-Beta Example • α = 1 • β = 1 [Figure: same board and game tree]

  25. Checkers Alpha-Beta Example • α = 1 • β = 0 [Figure: same board and game tree]

  26. Checkers Alpha-Beta Example • α = 1 • β = -4 [Figure: same board and game tree]

  27. Checkers Alpha-Beta Example • α = 1 • β = -4 • α-cutoff: no need to examine further branches [Figure: same board and game tree]

  28. Checkers Alpha-Beta Example • α = 1 • β = -8 [Figure: same board and game tree]

  29. Horizon Problem • Moves may have disastrous consequences in the future, but the consequences are not visible • Agent cannot see far enough into search space

  30. Games with Chance • In many games, there is a degree of unpredictability through random elements • throwing dice, card distribution, roulette wheel, … • This requires chance nodes in addition to the Max and Min nodes • branches indicate possible variations • each branch indicates the outcome and its likelihood (probability)

  31. Games with Chance [Figure: game tree with chance nodes]

  32. Decisions with Chance • The utility value of a position depends on the random element • the definite minimax value must be replaced by an expected value • Calculation of expected values • utility function for terminal nodes • for all other nodes • calculate the utility for each chance event • weigh by the chance that the event occurs • add up the individual utilities
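The expected-value calculation can be sketched as a recursive function over MAX, MIN and chance nodes. The tuple-based tree encoding and the example tree are hypothetical and chosen only to keep the arithmetic easy to follow; they are not taken from the slides.

```python
def expectiminimax(node):
    """Value of a node: utility at leaves, max/min at player nodes,
    probability-weighted sum of child values at chance nodes."""
    kind, payload = node
    if kind == "leaf":
        return payload
    if kind == "max":
        return max(expectiminimax(child) for child in payload)
    if kind == "min":
        return min(expectiminimax(child) for child in payload)
    if kind == "chance":
        return sum(p * expectiminimax(child) for p, child in payload)
    raise ValueError(f"unknown node kind: {kind}")

# Hypothetical tree: MAX chooses between two chance nodes,
# each leading to MIN nodes over leaf utilities.
tree = ("max", [
    ("chance", [(0.5, ("min", [("leaf", 2), ("leaf", 4)])),
                (0.5, ("min", [("leaf", 6), ("leaf", 8)]))]),
    ("chance", [(0.25, ("min", [("leaf", 0), ("leaf", 12)])),
                (0.75, ("min", [("leaf", 4), ("leaf", 10)]))]),
])

value = expectiminimax(tree)
print(value)  # 4.0
```

The first chance node is worth 0.5 * 2 + 0.5 * 6 = 4 and the second 0.25 * 0 + 0.75 * 4 = 3, so MAX prefers the first.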

  33. More interesting (but still trivial) game • Deal four cards face up • Player 1 chooses a card • Player 2 throws a die • If it’s a six, player 2 chooses a card, swaps it with player 1’s and keeps player 1’s card • If it’s not a six, player 2 just chooses a card • Player 1 chooses next card • Player 2 takes the last card

  34. Expectiminimax Diagram

  35. Expectiminimax Calculations

  36. Games and Computers • State of the art for some game programs • Chess • Checkers • Othello • Backgammon • Go

  37. Chess • Deep Blue, a special-purpose parallel computer, defeated the world champion Garry Kasparov in 1997 • the human player didn’t show his best game • some claim that the circumstances were questionable • Deep Blue used a massive database of games from the literature • Fritz, a program running on an ordinary PC, drew an eight-game match against the world champion Vladimir Kramnik in 2002 • top programs and top human players are roughly equal

  38. Checkers • Arthur Samuel developed a checkers program in the 1950s that learned its own evaluation function • it reached expert level in the 1960s • Chinook became world champion in 1994 • its human opponent, Dr. Marion Tinsley, withdrew for health reasons • Tinsley had been the world champion for 40 years • Chinook uses off-the-shelf hardware, alpha-beta search, and an endgame database for six-piece positions

  39. Othello • Logistello defeated the human world champion in 1997 • Many programs play far better than humans • smaller search space than chess • little evaluation expertise available

  40. Backgammon • TD-Gammon, neural-network based program, ranks among the best players in the world • improves its own evaluation function through learning techniques • search-based methods are practically hopeless • chance elements, branching factor

  41. Go • Humans play far better • large branching factor (around 360) • search-based methods are hopeless • Rule-based systems play at amateur level • The use of pattern-matching techniques can improve the capabilities of programs • difficult to integrate • $2,000,000 prize for the first program to defeat a top-level player

  42. Chapter Summary • Many game techniques are derived from search methods • The minimax algorithm determines the best move for a player by calculating the complete game tree • Alpha-beta pruning dismisses parts of the search tree that are provably irrelevant • An evaluation function gives an estimate of the utility of a state when a complete search is impractical • Chance events can be incorporated into the minimax algorithm by considering the weighted probabilities of chance events
