Games, Theory and Application. Jaap van den Herik and Jeroen Donkers, Institute for Knowledge and Agent Technology (IKAT), Department of Computer Science, Universiteit Maastricht, The Netherlands. SOFSEM 2004, Prague, Hotel VZ Merin, Sunday, January 25, 8.30-10.00h.
Contents • Games in Artificial-Intelligence Research • Game-tree search principles • Using opponent models • Opponent-Model search • Gains and risks • Implementing OM search: complexities • Past and future of computer chess
Games such as Chess, Checkers, and Go are an attractive pastime and scientifically interesting.
Chess • Much research has been performed in Computer Chess • Deep Blue (IBM) defeated the world champion Kasparov in 1997 • A Micro Computer better than the world champion?
World Champion Programs • KAISSA 1974 Stockholm • CHESS 1977 Toronto • BELLE 1980 Linz • CRAY BLITZ 1983 New York • CRAY BLITZ 1986 Cologne • DEEP THOUGHT 1989 Edmonton • REBEL 1992 Madrid • FRITZ 1995 Hong Kong • SHREDDER 1999 Paderborn • JUNIOR 2002 Maastricht • SHREDDER 2003 Graz • ? 2004 Ramat-Gan
International Draughts • Buggy is the best draughts program • Humans are still better than computers, but the margin is small • Challenge: more knowledge in the program
Go • Computer Go programs are weak • Problem: recognition of patterns • Top Go programs: Go4++, Many Faces of Go, GnuGo, and Handtalk
Awari • A Mancala game (pebble-hole game) • The game is a draw • Solved by John Romein and Henri Bal (VU Amsterdam) 2002 • All positions (900 billion) computed in 51 hours on a cluster computer of 144 1-GHz processors with 72 GB RAM • Proven that even the best computer programs still make many errors in their games
Scrabble • Maven beats every human opponent • Author: Brian Sheppard • Ph.D. (UM): “Towards Perfect Play of Scrabble” (2002) • Maven can play Scrabble in any language
Computer Olympiad • Initiative of David Levy (1989) • Since 1989 there have been 8 olympiads: 4x London, 3x Maastricht, 1x Graz • Goal: • Finding the best computer program for each game • Connecting programmers / researchers of different games • Computers play each other in competition • Demonstrations: • Man versus Machine • Man + Machine versus Man + Machine
Computer Olympiad • Last time in Graz 2003 • Also World Championship Computer Chess and World Championship Backgammon • 80 participants in several categories • Competitions in the olympiad’s history: Abalone, Awari, Amazons, Backgammon, Bao, Bridge, Checkers, Chess, Chinese Chess, Dots and Boxes, Draughts, Gipf, Go-Moku, 19x19 Go, 9x9 Go, Hex, Lines of Action, Poker, Renju, Roshambo, Scrabble, and Shogi
UM Programs on the Computer Olympiads • Gipfted - Gipf • MIA - LOA • Anky - Amazons • Bao 1.0 • Magog - Go
Computer Game-playing • Can computers beat humans in board games like Chess, Checkers, Go, Bao? • This is one of the first tasks of Artificial Intelligence (Shannon 1950) • Successes obtained in Chess (Deep Blue), Checkers, Othello, Draughts, Backgammon, Scrabble...
Computer Game-Playing • Sizes of game trees: • Nim-5: 28 nodes • Tic-Tac-Toe: 10^5 nodes • Checkers: 10^31 nodes • Chess: 10^123 nodes • Go: 10^360 nodes • In practice it is intractable to find a solution with minimax, so use heuristic search
Three Techniques • Minimax Search • α-β Search • Opponent Modelling
Game-playing • Domain: two-player zero-sum game with perfect information (Zermelo, 1913) • Task: find best response to opponent’s moves, provided that the opponent does the same • Solution: Minimax procedure (von Neumann, 1928)
Minimax Search • Nim-5: the game starts with 5 matches • Players take turns removing 1, 2, or 3 matches • The player who takes the last match wins the game
Minimax Search [Figure: complete game tree of Nim-5; edges are labelled with the number of matches taken (1, 2, or 3), nodes with the number of matches remaining, and leaves with the pay-off (+1 or –1)]
Minimax Search [Figure: the same tree after MINIMAXING; the root value is +1, and the three root moves (take 1, 2, 3) have values +1, –1, –1]
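The minimax procedure on Nim-5 can be sketched in a few lines. This is a minimal illustration, not the authors' program; the function names `minimax`, `count_nodes`, and `best_move` are chosen here for the example. Pay-offs are given from the viewpoint of the player who moves at the root.

```python
TAKES = (1, 2, 3)  # a player removes 1, 2, or 3 matches per turn


def minimax(matches, max_to_move=True):
    """Minimax value of a Nim position, seen from the root (MAX) player."""
    if matches == 0:
        # The previous player took the last match and therefore won.
        return -1 if max_to_move else +1
    values = [minimax(matches - take, not max_to_move)
              for take in TAKES if take <= matches]
    return max(values) if max_to_move else min(values)


def count_nodes(matches):
    """Number of nodes in the complete game tree below (and including) this position."""
    return 1 + sum(count_nodes(matches - take)
                   for take in TAKES if take <= matches)


def best_move(matches):
    """Root move with the highest minimax value (first one on ties)."""
    legal = [take for take in TAKES if take <= matches]
    return max(legal, key=lambda take: minimax(matches - take, False))


print(minimax(5), count_nodes(5), best_move(5))
```

For Nim-5 this gives the root value +1, a tree of 28 nodes, and the winning first move "take 1".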
Pruning • You do not need the total solution to play (well): only the move at the root is needed • Methods exist to find this move without a need for a total solution: pruning • Alpha-beta search (Knuth & Moore ‘75) • Proof-number search (Allis et al. ‘94), etc. • Playing is solving a sequence of game trees
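Heuristic Search [Figure: the Nim-5 tree truncated after two moves; the leaves carry heuristic values (0.25, 0.33, 0.5) or the pay-off –1, the three root moves have values 0.25, –1, –1, and the root value is 0.25] • Truncate the game tree (depth + selection) • Use a (static heuristic) evaluation function at the leaves to replace pay-offs • Minimax on the reduced game tree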
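A depth-limited search of this kind can be sketched as follows. The heuristic 1 / (matches + 1) is an assumption made for this example; it was chosen only because it matches the leaf values 0.25, 0.33, and 0.5 that appear on the slide, and the slides do not state the evaluation function actually used.

```python
TAKES = (1, 2, 3)


def heuristic(matches):
    """Hypothetical static evaluation (assumption): 1 / (matches + 1)."""
    return 1.0 / (matches + 1)


def depth_limited_minimax(matches, depth, max_to_move=True):
    """Minimax on the game tree truncated at `depth` plies."""
    if matches == 0:
        # True pay-off: the previous player took the last match and won.
        return -1.0 if max_to_move else 1.0
    if depth == 0:
        # Truncated leaf: replace the unknown pay-off by the heuristic value.
        return heuristic(matches)
    values = [depth_limited_minimax(matches - take, depth - 1, not max_to_move)
              for take in TAKES if take <= matches]
    return max(values) if max_to_move else min(values)


print(depth_limited_minimax(5, depth=2))
```

With a search depth of 2 the root value is 0.25 and the preferred move is again "take 1".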
Heuristic Search • The approach works very well in Chess, but... • Is solving a sequence of reduced games the best way to win a game? • Heuristic values are used instead of pay-offs • Additional information in the tree is unused • The opponent is not taken into account • We aim at the last item: opponents
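Minimax [Figure: example game tree with numeric leaf values; the minimax value at the root is 3]
α-β algorithm [Figure: example tree searched with α-β; the root value is 3 and a β-pruning skips part of the tree]
The strength of α-β • More than a thousand prunings
The importance of the α-β algorithm [Figure: example tree with root value 3 and its β-pruning]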
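A minimal sketch of α-β search on a small tree. The tree below is hypothetical (it is not the tree from the slides); nested lists are internal nodes and numbers are leaf values. The `visited` list is an addition for illustration that records which leaves were actually evaluated.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Alpha-beta value of `node`; leaves are numbers, internal nodes are lists."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record every leaf that is evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:  # remaining children cannot change the result
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:  # MAX already has a better alternative elsewhere
            break
    return value


tree = [[3, 7], [2, 4], [3, 5]]  # hypothetical: MAX root, three MIN children
seen = []
print(alphabeta(tree, visited=seen), seen)
```

On this tree the value is 3, the same as plain minimax, while the leaves 4 and 5 are never evaluated.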
The Possibilities Of Chess THE NUMBER OF DIFFERENT, REACHABLE POSITIONS IN CHESS IS (CHINCHALKAR): 10^46
A Clever Algorithm (α-β) REDUCES THE NUMBER OF POSSIBILITIES, N, TO ABOUT ITS SQUARE ROOT (WITH IDEAL MOVE ORDERING). FOR N = 10^46 THIS LEAVES 10^23 POSSIBILITIES, A SAVING OF ABOUT 99.999999999999999999999%
A Calculation • NUMBER OF POSSIBILITIES: 10^46 • AFTER SAVINGS BY α-β ALGORITHM: 10^23 • 1000 PARALLEL PROCESSORS: 10^3 • POSITIONS PER SECOND: 10^9 • LEADS TO: 10^(23-12) = 10^11 SECONDS • A CENTURY IS ABOUT 10^9 SECONDS • SOLVING CHESS: 10^2 CENTURIES • SO 100 CENTURIES OR 10,000 YEARS
Using opponent’s strategy (NIM-5) [Figure: the Nim-5 tree under the knowledge “Player 1 never takes 3”; the moves in which Player 1 takes 3 matches disappear and the root value changes from +1 to –1, so Player 2 can now win]
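The effect of knowing the opponent's strategy in Nim-5 can be checked with a small variant of minimax in which each player has its own set of allowed moves. This is only a sketch; the parameter names `p1_moves` and `p2_moves` are hypothetical. Values are given from Player 1's viewpoint.

```python
def nim_value(matches, p1_to_move=True, p1_moves=(1, 2, 3), p2_moves=(1, 2, 3)):
    """Game value for Player 1 when each player only uses its listed moves."""
    if matches == 0:
        # The previous player took the last match and won.
        return -1 if p1_to_move else +1
    moves = p1_moves if p1_to_move else p2_moves
    values = [nim_value(matches - take, not p1_to_move, p1_moves, p2_moves)
              for take in moves if take <= matches]
    return max(values) if p1_to_move else min(values)


print(nim_value(5))                   # both players unrestricted
print(nim_value(5, p1_moves=(1, 2)))  # "Player 1 never takes 3"
```

Without the restriction Player 1 wins (+1); if Player 1 never takes 3, Player 2 can force a win (–1).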
Using opponent’s strategy • Well-known Tic-Tac-Toe strategy: R1: make 3-in-a-row if possible R2: prevent opponent’s 3-in-a-row if possible H1: occupy central square if possible H2: occupy corner square if possible • One player knows that the opponent uses this strategy
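The four rules can be written down as a simple move selector, which is also what a player who knows this strategy would use to predict the opponent. This is a sketch under assumptions: the board is a list of 9 squares (index 4 is the centre) holding 'X', 'O', or None, and the final fallback to any free square is an addition, since the slide does not say what happens when no rule applies.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def completing_square(board, player):
    """Free square that gives `player` three in a row, or None."""
    for line in LINES:
        marks = [board[i] for i in line]
        if marks.count(player) == 2 and marks.count(None) == 1:
            return line[marks.index(None)]
    return None


def rule_based_move(board, me, opponent):
    move = completing_square(board, me)              # R1: make 3-in-a-row
    if move is None:
        move = completing_square(board, opponent)    # R2: prevent opponent's 3-in-a-row
    if move is None and board[4] is None:
        move = 4                                     # H1: central square
    if move is None:                                 # H2: a corner square
        move = next((i for i in (0, 2, 6, 8) if board[i] is None), None)
    if move is None:                                 # fallback (assumption): any free square
        move = next((i for i, mark in enumerate(board) if mark is None), None)
    return move


print(rule_based_move([None] * 9, 'X', 'O'))  # empty board: H1 applies
```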
Opponent-Model search • Iida, van den Herik, Uiterwijk, Herschberg (1993); Carmel and Markovitch (1993) • Opponent’s evaluation function is known (unilateral: the opponent uses minimax) • This is the opponent model • It is used to predict opponent’s moves • Best response is determined, using the own evaluation function
OM Search • Procedure: • At opponent’s nodes: use minimax (alpha-beta) to predict the opponent’s move. Return the own value of the chosen move • At own nodes: Return (the move with) the highest value • At leaves: Return the own evaluation • Implementation: one-pass or probing
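OM Search Example [Figure: two-ply tree with leaf values (v0, vop). Left MIN node: leaves (8, 9) and (7, 6), giving v0 = 7, vop = 6, minimax value 7. Right MIN node: leaves (8, 6) and (6, 7), giving v0 = 8, vop = 6, minimax value 6. Root: v0 = 8, vop = 6, minimax value 7]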
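The OM-search procedure can be sketched as a one-pass recursion that backs up a pair (v0, vop) at every node. The tree encoding (lists for internal nodes, tuples for leaves) and the function names are choices made for this example; the leaf pairs are the values as they appear to be on the example slide. Ties at MIN nodes are broken in favour of the first child, which is an assumption.

```python
def om_search(node, own_move=True):
    """Return (v0, vop) for `node`; leaves are (v0, vop) tuples, internal nodes lists."""
    if isinstance(node, tuple):
        return node  # leaf: own evaluation and opponent's evaluation
    results = [om_search(child, not own_move) for child in node]
    if own_move:
        # Own (MAX) node: we take the highest v0; the opponent, modelled as a
        # minimax player on vop, expects the highest vop here.
        return (max(v0 for v0, _ in results), max(vop for _, vop in results))
    # Opponent (MIN) node: the opponent picks the child with the lowest vop,
    # and we receive our own value of that child.
    return min(results, key=lambda pair: pair[1])


def minimax_v0(node, own_move=True):
    """Plain minimax on our own evaluation, for comparison."""
    if isinstance(node, tuple):
        return node[0]
    values = [minimax_v0(child, not own_move) for child in node]
    return max(values) if own_move else min(values)


tree = [[(8, 9), (7, 6)], [(8, 6), (6, 7)]]
print(om_search(tree), minimax_v0(tree))
```

On this tree OM search obtains v0 = 8 by moving right, where the opponent is predicted to choose the leaf with vop = 6, whereas plain minimax settles for 7 on the left.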
Risks and Gains in OM search • Gain: difference between the predicted move and the minimax move • Risk: difference between the move we expect and the move the opponent really selects • Prediction of the opponent is important, and depends on the quality of the opponent model
Additional risk • Even if the prediction is correct, there are traps for OM search • Let P be the set of positions that the opponent selects, v0 be our evaluation function and vop the opponent’s function.
Four types of error • V0 overestimates a position in P (bad) • V0 underestimates a position in P • Vop underestimates a position that enters P (good for us) • Vop overestimates a position in P
Vo: our evaluation Vop: opponent’s evaluation Mmx: minimax value Obt: obtained value
Pruning in OM Search • Pruning at MAX nodes is not possible in OM search; only pruning at MIN nodes can take place (Iida et al., 1993) • Analogous to α-β search, the pruning version is called β-pruning OM search
Pruning Example [Figure: game tree with nodes labelled a to y, each carrying a pair of values, illustrating β-pruning OM search]
Two Implementations • One-pass: • visit every node at most once • back-up both your own and the opponent’s value • Probing: • At MIN nodes use α-β search to predict the opponent’s move • back-up only one value
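The probing variant can be sketched as follows: at each MIN node an α-β search on the opponent's evaluation predicts the opponent's move, and only our own value is backed up. This is a simplified illustration, not the authors' implementation; the tree encoding (lists for internal nodes, (v0, vop) tuples for leaves) and the names `alphabeta_vop` and `om_probe` are assumptions of this example.

```python
import math


def alphabeta_vop(node, alpha, beta, maximizing):
    """Alpha-beta search that uses only the opponent's evaluation vop."""
    if isinstance(node, tuple):
        return node[1]
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta_vop(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta_vop(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value


def om_probe(node, own_move=True):
    """OM search with probing; returns our own value v0 only."""
    if isinstance(node, tuple):
        return node[0]
    if own_move:
        return max(om_probe(child, False) for child in node)
    # MIN node: probe every child, using the best vop so far as the beta bound.
    best_child, best_vop = None, math.inf
    for child in node:
        vop = alphabeta_vop(child, -math.inf, best_vop, True)
        if vop < best_vop:
            best_child, best_vop = child, vop
    return om_probe(best_child, True)


print(om_probe([[(8, 9), (7, 6)], [(8, 6), (6, 7)]]))
```

Probing may visit a node more than once (once per probe and once in the main search) but backs up a single value, whereas the one-pass version visits each node at most once and carries both values.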