
Games, Theory and Application

Games, Theory and Application. Jaap van den Herik and Jeroen Donkers Institute for Knowledge and Agent Technology, IKAT Department of Computer Science Universiteit Maastricht The Netherlands SOFSEM 2004 Prague, Hotel VZ Merin Sunday, January 25 8.30-10.00h. Contents.


Presentation Transcript


  1. Games, Theory and Application Jaap van den Herik and Jeroen Donkers Institute for Knowledge and Agent Technology, IKAT Department of Computer Science Universiteit Maastricht The Netherlands SOFSEM 2004 Prague, Hotel VZ Merin Sunday, January 25 8.30-10.00h

  2. Contents • Games in Artificial-Intelligence Research • Game-tree search principles • Using opponent models • Opponent-Model search • Gains and risks • Implementing OM search: complexities • Past and future of computer chess

  3. Games such as Chess, Checkers, and Go are an attractive pastime and scientifically interesting

  4. Chess • Much research has been performed in Computer Chess • Deep Blue (IBM) defeated the world champion Kasparov in 1997 • A Micro Computer better than the world champion?

  5. World Champion Programs • KAISSA 1974 Stockholm • CHESS 1977 Toronto • BELLE 1980 Linz • CRAY BLITZ 1983 New York • CRAY BLITZ 1986 Cologne • DEEP THOUGHT 1989 Edmonton • REBEL 1992 Madrid • FRITZ 1995 Hong Kong • SHREDDER 1999 Paderborn • JUNIOR 2002 Maastricht • SHREDDER 2003 Graz • ? 2004 Ramat-Gan

  6. International Draughts • Buggy is the best draughts program • Humans are still better than computers, but the margin is small • Challenge: more knowledge in the program

  7. Go • Computer Go programs are weak • Problem: recognition of patterns • Top Go programs: Go4++, Many Faces of Go, GnuGo, and Handtalk

  8. Awari • A Mancala game (pebble-hole game) • The game is a draw • Solved by John Romein and Henri Bal (VU Amsterdam) in 2002 • All positions (900 billion) were computed in 51 hours on a cluster computer with 144 1-GHz processors and 72 GB RAM • Proven that even the best computer programs still make many errors in their games

  9. Scrabble • Maven beats every human opponent • Author: Brian Sheppard • Ph.D. (UM): “Towards Perfect Play of Scrabble” (2002) • Maven can play Scrabble in any language

  10. Overview

  11. Computer Olympiad • Initiative of David Levy (1989) • Since 1989 there have been 8 olympiads: 4x London, 3x Maastricht, 1x Graz • Goal: • Finding the best computer program for each game • Connecting programmers / researchers of different games • Computers play each other in competition • Demonstrations: • Man versus Machine • Man + Machine versus Man + Machine

  12. Computer versus Computer

  13. Computer Olympiad • Last time in Graz 2003 • Also World Championship Computer Chess and World Championship Backgammon • 80 participants in several categories • Competitions in olympiad’s history: Abalone, Awari, Amazons, Backgammon, Bao, Bridge, Checkers, Chess, Chinese Chess, Dots and Boxes, Draughts, Gipf, Go-Moku, 19x19 Go, 9x9 Go, Hex, Lines of Action, Poker, Renju, Roshambo, Scrabble, and Shogi

  14. UM Programs on the Computer Olympiads • Gipfted (Gipf) • MIA (LOA) • Anky (Amazons) • Bao 1.0 • Magog (Go)

  15. Computer Game-playing • Can computers beat humans in board games like Chess, Checkers, Go, Bao? • This is one of the first tasks of Artificial Intelligence (Shannon 1950) • Successes obtained in Chess (Deep Blue), Checkers, Othello, Draughts, Backgammon, Scrabble...

  16. Computer Game-Playing • Sizes of game trees: • Nim-5: 28 nodes • Tic-Tac-Toe: ≈ 10^5 nodes • Checkers: ≈ 10^31 nodes • Chess: ≈ 10^123 nodes • Go: ≈ 10^360 nodes • In practice it is intractable to find a solution with minimax, so use heuristic search
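The smallest entry in the list can be checked by hand or by a few lines of code. The sketch below assumes the Nim-5 rules given on slide 19 (5 matches, each move removes 1, 2, or 3) and counts every node of the full game tree, treating positions reached along different move sequences as separate nodes. The function name `tree_size` is made up for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def tree_size(matches: int) -> int:
    """Count the nodes of the full Nim game tree rooted at `matches` matches."""
    # One node for the position itself, plus the subtree below each legal move.
    legal_takes = [take for take in (1, 2, 3) if take <= matches]
    return 1 + sum(tree_size(matches - take) for take in legal_takes)


if __name__ == "__main__":
    print(tree_size(5))  # 28, the figure quoted for Nim-5
```

The same recursion grows quickly (8, 15, 28, 52, ...), which hints at why the larger games in the list cannot be enumerated.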

  17. Three Techniques • Minimax Search • α-β Search • Opponent Modelling

  18. Game-playing • Domain: two-player zero-sum game with perfect information (Zermelo, 1913) • Task: find best response to opponent’s moves, provided that the opponent does the same • Solution: Minimax procedure (von Neumann, 1928)

  19. Minimax Search • Nim-5: starting from 5 matches, players take turns removing 1, 2, or 3 matches. The player who takes the last match wins the game.
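A minimal sketch of the minimax procedure on this game, assuming pay-offs of +1 (the MAX player wins) and –1 (the MAX player loses). The names `minimax` and `best_move` are illustrative, not taken from the slides.

```python
def minimax(matches: int, max_to_move: bool) -> int:
    """Pay-off for the MAX player with `matches` left and the given side to move."""
    if matches == 0:
        # The side that just moved took the last match and won.
        return -1 if max_to_move else +1
    values = [
        minimax(matches - take, not max_to_move)
        for take in (1, 2, 3)
        if take <= matches
    ]
    return max(values) if max_to_move else min(values)


def best_move(matches: int) -> int:
    """Number of matches the MAX player should take at the root."""
    legal_takes = [take for take in (1, 2, 3) if take <= matches]
    return max(legal_takes, key=lambda take: minimax(matches - take, False))


if __name__ == "__main__":
    print(minimax(5, True))  # +1: the player to move wins Nim-5
    print(best_move(5))      # 1: leave four matches to the opponent
```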

  20. Minimax Search [Figure: the complete Nim-5 game tree. The root holds 5 matches, edges are labelled with the number of matches taken (1, 2, or 3), and the 13 leaves carry pay-offs of +1 or –1.]

  21. Minimax Search [Figure: the same Nim-5 tree after minimaxing. Pay-offs are backed up from the leaves; the root value is +1 and its three children have values +1, –1, and –1.]

  22. Pruning • You do not need the total solution to play (well): only the move at the root is needed • Methods exist to find this move without a need for a total solution: pruning • Alpha-beta search (Knuth & Moore ‘75) • Proof-number search (Allis et al. ‘94), etc. • Playing is solving a sequence of game trees
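As an illustration of pruning, here is a minimal alpha-beta sketch. It assumes a toy tree encoded as nested lists with numeric leaves (this encoding and the example values are invented for the illustration) and records which leaves are actually evaluated.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, max_to_move=True, visited=None):
    """Return the minimax value of `node`, skipping branches that cannot matter."""
    if visited is None:
        visited = []
    if not isinstance(node, list):
        visited.append(node)  # a leaf: record that it was evaluated
        return node
    if max_to_move:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the MIN player above will never allow this line
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the MAX player above already has something better
    return value


if __name__ == "__main__":
    tree = [[3, 7], [2, 4], [1, 8]]  # hypothetical example tree, MAX at the root
    seen = []
    print(alphabeta(tree, visited=seen))  # 3
    print(seen)                           # [3, 7, 2, 1]: leaves 4 and 8 are pruned
```

Only the value at the root is guaranteed to be exact, which matches the point that the move at the root is all that is needed.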

  23. Heuristic Search • Truncate the game tree (depth + selection) • Use a (static heuristic) evaluation function at the leaves to replace pay-offs • Minimax on the reduced game tree [Figure: the Nim-5 tree truncated after two plies, with heuristic leaf values such as 0.25, 0.33, and 0.5 in place of pay-offs; the root value is 0.25.]
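The three steps can be sketched as a depth-limited minimax on Nim. The evaluation function below is a hypothetical stand-in (it is not the one behind the values shown on the slide); it simply scores positions with a multiple of four matches as bad for the side to move.

```python
def evaluate(matches: int, max_to_move: bool) -> float:
    """Hypothetical static evaluation, from the MAX player's point of view."""
    mover_score = -0.5 if matches % 4 == 0 else 0.5
    return mover_score if max_to_move else -mover_score


def heuristic_minimax(matches: int, depth: int, max_to_move: bool) -> float:
    """Minimax on the tree truncated at `depth`, with heuristic leaf values."""
    if matches == 0:
        return -1.0 if max_to_move else 1.0  # true pay-off: game over
    if depth == 0:
        return evaluate(matches, max_to_move)  # truncated: heuristic value
    values = [
        heuristic_minimax(matches - take, depth - 1, not max_to_move)
        for take in (1, 2, 3)
        if take <= matches
    ]
    return max(values) if max_to_move else min(values)


if __name__ == "__main__":
    print(heuristic_minimax(5, 1, True))   # 0.5: a heuristic estimate after one ply
    print(heuristic_minimax(5, 10, True))  # 1.0: deep enough to reach real pay-offs
```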

  24. Heuristic Search • The approach works very well in Chess, but... • Is solving a sequence of reduced games the best way to win a game? • Heuristic values are used instead of pay-offs • Additional information in the tree is unused • The opponent is not taken into account • We aim at the last item: opponents

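  25. Minimax [Figure: minimax on a small example tree.]

  26. α-β algorithm [Figure: the example tree searched with α-β, showing a β-pruning.]

  27. The strength of α-β • More than a thousand prunings [Figure: example tree.]

  28. The importance of the α-β algorithm [Figure: example tree with a β-pruning.]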

  29. The Possibilities Of Chess THE NUMBER OF DIFFERENT, REACHABLE POSITIONS IN CHESS IS (CHINCHALKAR): 10^46

  30. A Clever Algorithm (α-β) REDUCES THE NUMBER OF POSSIBILITIES, N, TO ABOUT ITS SQUARE ROOT. THE SAVING IS MORE THAN 99.999...%

  31. A Calculation • NUMBER OF POSSIBILITIES: 10^46 • SAVINGS BY α-β ALGORITHM: 10^23 • 1000 PARALLEL PROCESSORS: 10^3 • POSITIONS PER SECOND: 10^9 • LEADS TO: 10^(23-12) = 10^11 SECONDS • A CENTURY IS ABOUT 10^9 SECONDS • SOLVING CHESS: 10^2 CENTURIES • SO 100 CENTURIES OR 10,000 YEARS
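The arithmetic on this slide can be replayed in a few lines. The sketch assumes, as the slide appears to, that each of the 1000 processors examines 10^9 positions per second and that α-β leaves the square root of the 10^46 positions to search. The variable names are made up. The last lines redo the final step with a calendar century (about 3.16 × 10^9 seconds) instead of the rounded 10^9.

```python
import math

positions = 10**46                         # reachable chess positions (slide 29)
after_alphabeta = math.isqrt(positions)    # 10^23 positions left to search
processors = 10**3
positions_per_second = 10**9               # per processor (assumption)

seconds = after_alphabeta // (processors * positions_per_second)   # 10^11
centuries_rounded = seconds // 10**9       # slide's rounding: 10^9 s per century

seconds_per_century = 100 * 365.25 * 24 * 3600   # 3,155,760,000
centuries_exact = seconds / seconds_per_century  # roughly 31.7

if __name__ == "__main__":
    print(after_alphabeta == 10**23)   # True
    print(seconds == 10**11)           # True
    print(centuries_rounded)           # 100
    print(round(centuries_exact, 1))   # 31.7
```

With the rounded century the slide arrives at 10,000 years; with the calendar value the same inputs give roughly 3,200 years, the same order of magnitude.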

  32. Using opponent’s strategy (NIM-5) • “Player 1 never takes 3” [Figure: the Nim-5 tree with Player 1’s take-3 moves removed; the value at the root drops from +1 to –1.]
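The effect of knowing that “Player 1 never takes 3” can be reproduced with a small variant of minimax. This is a sketch under assumptions: Player 1 moves first from 5 matches, values are from Player 1’s point of view, and the parameter `p1_takes` (a made-up name) lists the moves Player 1 is willing to play.

```python
def value(matches: int, p1_to_move: bool, p1_takes=(1, 2, 3)) -> int:
    """Pay-off for Player 1 when Player 1 only ever plays moves from `p1_takes`."""
    if matches == 0:
        # The side that just moved took the last match and won.
        return -1 if p1_to_move else +1
    takes = p1_takes if p1_to_move else (1, 2, 3)  # Player 2 is unrestricted
    values = [
        value(matches - take, not p1_to_move, p1_takes)
        for take in takes
        if take <= matches
    ]
    return max(values) if p1_to_move else min(values)


if __name__ == "__main__":
    print(value(5, True))                   # +1: unrestricted Player 1 wins
    print(value(5, True, p1_takes=(1, 2)))  # -1: Player 2 can exploit the habit
```

Player 2 wins by steering towards a position with three matches left, which Player 1 refuses to clear in one move.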

  33. Using opponent’s strategy • Well-known Tic-Tac-Toe strategy: • R1: make 3-in-a-row if possible • R2: prevent opponent’s 3-in-a-row if possible • H1: occupy central square if possible • H2: occupy corner square if possible • [Figure: one player knows that the other uses this strategy]
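The four rules are simple enough to write down directly. The sketch below assumes a board stored as a list of nine cells (`"X"`, `"O"`, or `" "`, indexed row by row) and applies the rules in the order listed. The final fallback to any free square is an addition for completeness, not part of the slide.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def completing_square(board, player):
    """Return a square that gives `player` three in a row, or None."""
    for line in LINES:
        cells = [board[i] for i in line]
        if cells.count(player) == 2 and cells.count(" ") == 1:
            return line[cells.index(" ")]
    return None


def rule_move(board, player):
    """Pick a move for `player` using rules R1, R2, H1, H2 in that order."""
    opponent = "O" if player == "X" else "X"
    move = completing_square(board, player)        # R1: make 3-in-a-row
    if move is None:
        move = completing_square(board, opponent)  # R2: block the opponent
    if move is None and board[4] == " ":
        move = 4                                   # H1: central square
    if move is None:                               # H2: a corner square
        move = next((c for c in (0, 2, 6, 8) if board[c] == " "), None)
    if move is None:                               # fallback (not on the slide)
        move = next((i for i, cell in enumerate(board) if cell == " "), None)
    return move


if __name__ == "__main__":
    print(rule_move(list("XX OO    "), "X"))  # 2: R1 wins at once
    print(rule_move(list("X  OO    "), "X"))  # 5: R2 blocks the middle row
```

Because the strategy is deterministic, a player who knows it can predict every reply, which is the situation the slide describes.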

  34. Opponent-Model search Iida, vd Herik, Uiterwijk, Herschberg (1993) Carmel and Markovitch (1993) • Opponent’s evaluation function is known (unilateral: the opponent uses minimax) • This is the opponent model • It is used to predict opponent’s moves • Best response is determined, using the own evaluation function

  35. OM Search • Procedure: • At opponent’s nodes: use minimax (alpha-beta) to predict the opponent’s move. Return the own value of the chosen move • At own nodes: Return (the move with) the highest value • At leaves: Return the own evaluation • Implementation: one-pass or probing
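A minimal one-pass sketch of this procedure, under stated assumptions: the tree is a nested list, each leaf is a pair `(V0, Vop)` of own and opponent evaluations, MAX is to move at the root, and ties at opponent nodes go to the first child. The example tree uses the leaf values as read from the example on slide 37; the function names are illustrative.

```python
def om_search(node, max_to_move=True):
    """Return (v0, vop) for `node`: our OM-search value and the opponent's minimax value."""
    if isinstance(node, tuple):
        return node  # leaf: (V0, Vop)
    children = [om_search(child, not max_to_move) for child in node]
    if max_to_move:
        # Own node: we take the highest own value; the opponent's model
        # assumes plain minimax, so vop is maximised independently.
        return max(v0 for v0, _ in children), max(vop for _, vop in children)
    # Opponent node: the opponent picks the child with the lowest vop,
    # and we receive our own value of that child.
    return min(children, key=lambda pair: pair[1])


def minimax_own(node, max_to_move=True):
    """Plain minimax on our own leaf values V0, for comparison."""
    if isinstance(node, tuple):
        return node[0]
    values = [minimax_own(child, not max_to_move) for child in node]
    return max(values) if max_to_move else min(values)


if __name__ == "__main__":
    tree = [[(8, 9), (7, 6)], [(8, 6), (6, 7)]]
    print(om_search(tree))    # (8, 6)
    print(minimax_own(tree))  # 7
```

On this tree OM search expects 8 where minimax settles for 7, because the opponent is predicted to prefer the leaf it rates 6 over the one it rates 7.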

  36. OM Search Equations
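The equations themselves did not survive in the transcript. The following is a reconstruction from the procedure on slide 35, not the slide's own typesetting. It assumes that P_j ranges over the children of position P, that V_0 and V_op are the static evaluation functions, and that v_0 and v_op are the backed-up values.

```latex
v_{op}(P) =
\begin{cases}
  \max_j \, v_{op}(P_j) & \text{if } P \text{ is a MAX node} \\
  \min_j \, v_{op}(P_j) & \text{if } P \text{ is a MIN node} \\
  V_{op}(P)             & \text{if } P \text{ is a leaf}
\end{cases}
\qquad
v_0(P) =
\begin{cases}
  \max_j \, v_0(P_j)                              & \text{if } P \text{ is a MAX node} \\
  v_0(P_k), \quad k = \arg\min_j \, v_{op}(P_j)   & \text{if } P \text{ is a MIN node} \\
  V_0(P)                                          & \text{if } P \text{ is a leaf}
\end{cases}
```

The first equation is ordinary minimax on the opponent's evaluation; the second differs from minimax only at MIN nodes, where the opponent's predicted choice replaces the minimum of our own values.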

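  37. OM Search Example [Figure: a two-ply tree. Leaves (V0, Vop): (8, 9) and (7, 6) under the left MIN node; (8, 6) and (6, 7) under the right MIN node. Left MIN node: v0 = 7, vop = 6, minimax value 7. Right MIN node: v0 = 8, vop = 6, minimax value 6. Root: v0 = 8, vop = 6, minimax value 7.]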

  38. OM Search Algorithm (probing)
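The algorithm on this slide is not preserved in the transcript. The sketch below is one possible probing implementation, not the authors' code. It assumes a nested-list tree whose leaves are pairs `(V0, Vop)`, MAX to move at the root, non-empty move lists, and ties resolved in favour of the first child. At each MIN node it probes every child with α-β on the opponent's values, then continues OM search only below the predicted move.

```python
import math


def alphabeta_op(node, alpha, beta, max_to_move):
    """Alpha-beta on the opponent's leaf values Vop (second element of each leaf)."""
    if isinstance(node, tuple):
        return node[1]
    if max_to_move:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta_op(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta_op(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value


def om_probe(node, max_to_move=True):
    """Our own OM-search value v0 of `node`; only this one value is backed up."""
    if isinstance(node, tuple):
        return node[0]
    if max_to_move:
        return max(om_probe(child, False) for child in node)
    # MIN node: probe each child to find the move the opponent is predicted to play.
    best_child, best_vop = None, math.inf
    for child in node:
        # The current best vop serves as the beta bound, so worse children fail high early.
        vop = alphabeta_op(child, -math.inf, best_vop, True)
        if vop < best_vop:
            best_child, best_vop = child, vop
    return om_probe(best_child, True)


if __name__ == "__main__":
    print(om_probe([[(8, 9), (7, 6)], [(8, 6), (6, 7)]]))  # 8
```

Probing may visit a node several times (once per probe above it and once in the main search), which is the price paid for being able to use ordinary α-β inside the probes.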

  39. Risks and Gains in OM search • Gain: difference between the predicted move and the minimax move • Risk: difference between the move we expect and the move the opponent really selects • Prediction of the opponent is important, and depends on the quality of the opponent model

  40. Additional risk • Even if the prediction is correct, there are traps for OM search • Let P be the set of positions that the opponent selects, v0 be our evaluation function and vop the opponent’s function.

  41. Four types of error • V0 overestimates a position in P (bad) • V0 underestimates a position in P • Vop underestimates a position that enters P (good for us) • Vop overestimates a position in P

  42-47. [Six figure slides sharing one legend] Vo: our evaluation • Vop: opponent’s evaluation • Mmx: minimax value • Obt: obtained value

  48. Pruning in OM Search • Pruning at MAX nodes is not possible in OM search, only pruning at MIN nodes can take place (Iida et al, 1993). • Analogous to α-β search, the pruning version is called β-pruning OM search

  49. Pruning Example [Figure: a game tree with nodes labelled a to y, each annotated with value bounds, illustrating which subtrees β-pruning OM search can skip.]

  50. Two Implementations • One-pass: • visit every node at most once • back-up both your own and the opponent’s value • Probing: • At MIN nodes use α-β search to predict the opponent’s move • back-up only one value
