G5AIAI: Introduction to AI Game Playing. Garry Kasparov and Deep Blue. © 1997, GM Gabriel Schwartzman's Chess Camera, courtesy IBM. Instructor: Ho Sooi Hock
Game Playing • Game playing was one of the earliest researched AI problems because • games seem to require intelligence (heuristics) • game states are easy to represent • only a restricted number of actions are allowed, and outcomes are defined by precise rules • It has been studied for a long time • Babbage (tic-tac-toe) • Turing (chess) • The ultimate purpose is to solve a game, meaning that the entire game tree can be built and the outcome of the game determined even before the game is started.
Classes of Games • Two-player vs. multi-player • Chess vs. Monopoly, Poker • Deterministic vs. non-deterministic • Chess vs. Backgammon • Perfect information vs. imperfect information • Checkers vs. Bridge
Game Playing - Chess • Shannon - March 9th 1949 - New York • size of search space: about 10^120 (an average of 40 moves per game) • 1957 - Newell and Simon predicted that a computer would be chess champion within ten years • Simon: "I was a little far-sighted with chess, but there was no way to do it with machines that were as slow as the ones way back then" • 1958 - the first computer to play chess was an IBM 704 - about one millionth the capacity of Deep Blue
Game Playing - Chess • 1967 : Mac Hack competed successfully in human tournaments • 1983 : “Belle” obtained expert status from the United States Chess Federation • Mid 80’s : Scientists at Carnegie Mellon University started work on what was to become Deep Blue. • Project moved to IBM in 1989
Game Playing - Chess • May 11th 1997: Garry Kasparov lost a six-game match to Deep Blue • 3.5 to 2.5 • Two wins for Deep Blue, one win for Kasparov and three draws • IBM Research: (http://www.research.ibm.com/deepblue/meet/html/d.3.html)
Game Playing - Checkers • Arthur Samuel - 1952 • Written for an IBM 701 • 1954 - re-written for an IBM 704 • 10,000 words of main memory • Added a learning mechanism that learnt its own evaluation function by playing against itself • After a few days it could beat its creator • ... and compete on equal terms with strong human players
Game Playing - Checkers • Jonathan Schaeffer - 1996 • A testament to Samuel that little further work was done until the early nineties • Developed Chinook • Uses alpha-beta search • Plays a perfect end game by means of an endgame database • In 1992 Chinook won the US Open • ... and challenged for the World Championship
Game Playing - Checkers • Dr Marion Tinsley • Had been world champion for over 40 years • ... losing only three games in all that time • Against Chinook he suffered his fourth and fifth defeats • ... but ultimately won 21.5 to 18.5 • In August 1994 there was a re-match, but Tinsley withdrew for health reasons • Chinook became the official world champion
Game Playing - Checkers • Schaeffer claimed Chinook was rated at 2814 • The best human players are rated at 2632 and 2625 • Chinook did not include any learning mechanism • Chellapilla and Fogel - 2000 • "Learnt" how to play a good game of checkers • Anaconda, also known as Blondie24 • The program evolved a population of players, with the best competing for survival
Game Playing - Checkers • Learning was done using a neural network, with the synaptic weights changed by an evolutionary strategy • The best program beat a commercial application 6-0 • The program was presented at CEC 2000 (San Diego) and remained undefeated
Components of a Game Problem • A game can be defined as a kind of search problem with the following components: • The initial state: the board position and an indication of whose move it is • A set of operators: define the legal moves that a player can make • A terminal test: determines when the game is over (terminal states) • A utility (payoff) function: gives a numeric value for the outcome of a game (e.g. +1, -1, 0 for a win, loss, or draw)
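The four components above can be sketched concretely. This is a minimal, illustrative sketch using the Nim variant from later slides (divide one pile into two unequal, non-empty piles); the function names and the choice of MIN moving first are assumptions for demonstration, not part of any standard library.

```python
def initial_state():
    # Board position plus an indication of whose move it is (MIN moves first here).
    return ((7,), 'MIN')

def legal_moves(state):
    # Operators: every way to divide one pile into two unequal, non-empty piles.
    piles, player = state
    opponent = 'MAX' if player == 'MIN' else 'MIN'
    moves = []
    for i, p in enumerate(piles):
        for a in range(1, (p - 1) // 2 + 1):   # a < p - a guarantees unequal piles
            new_piles = tuple(sorted(piles[:i] + (a, p - a) + piles[i + 1:]))
            moves.append((new_piles, opponent))
    return moves

def terminal_test(state):
    # The game ends when the player to move cannot split any pile.
    return not legal_moves(state)

def utility(state):
    # Convention from the Nim slides: 0 = a win for MIN, 1 = a win for MAX.
    # The player who cannot move loses, so the other player has won.
    _, player = state
    return 1 if player == 'MIN' else 0
```

With this encoding, the whole game tree for a 7-token pile is small enough to enumerate exhaustively, which is what makes Nim a convenient teaching example.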
Minimax • Game playing • An opponent tries to thwart your every move • 1944 - John von Neumann outlined a search method (Minimax) that maximises your position whilst minimising your opponent's • Minimax searches the state space using the following assumptions • your opponent is 'as clever' as you • if your opponent can make things worse for you, they will take that move • your opponent won't make mistakes
Minimax - Example • [Figure: a game tree with root A (MAX), intermediate MIN and MAX levels at nodes B-G, and terminal positions holding the values 4, 1, 2, -3, 4, -5, -5, 1, -7, 2, -3, -8; backing the values up level by level gives the root a value of 1. Legend: terminal position, agent (MAX), opponent (MIN)]
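The backing-up procedure can be sketched as a short recursion over an explicit tree. The tree literal below is illustrative (not the exact tree from the slide figure): nested lists are internal nodes, integers are terminal values.

```python
def minimax(node, maximizing):
    """Back values up the tree: MAX takes the largest child value, MIN the smallest."""
    if isinstance(node, int):          # terminal position: return its utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Three-ply example with MAX to move at the root (levels alternate MAX/MIN/MAX).
tree = [[[4, 1], [2, -3]], [[4, -5], [-5, 1]], [[-7, 2], [-3, -8]]]
print(minimax(tree, True))   # → 2
```

Each MIN node assumes the opponent picks the move that is worst for us, which is exactly the "opponent is as clever as you" assumption from the slide.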
Minimax - Nim • Nim • Start with a pile of tokens, e.g. 7 • At each move the player must divide one pile of tokens into two non-empty, unequal piles
Nim's Search Tree (states level by level): 7 • 6-1, 5-2, 4-3 • 5-1-1, 4-2-1, 3-2-2, 3-3-1 • 4-1-1-1, 3-2-1-1, 2-2-2-1 • 3-1-1-1-1, 2-2-1-1-1 • 2-1-1-1-1-1
Minimax - Nim • Draw the complete search tree • Assuming MIN plays first, complete the MIN/MAX tree • Assume a utility function of: • 0 = a win for MIN • 1 = a win for MAX
[Figure: the complete MIN/MAX tree for 7-token Nim with MIN moving first; the levels alternate MIN (7; 5-1-1, 4-2-1, 3-2-2, 3-3-1; 3-1-1-1-1, 2-2-1-1-1) and MAX (6-1, 5-2, 4-3; 4-1-1-1, 3-2-1-1, 2-2-2-1; 2-1-1-1-1-1), with the utility values 0 and 1 backed up from the terminal positions]
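The exercise above can be checked mechanically. This sketch solves 7-token Nim by exhaustive minimax under the slide's convention (0 = a win for MIN, 1 = a win for MAX); the player who cannot split any pile loses. Function names are illustrative.

```python
def splits(piles):
    """All positions reachable by dividing one pile into two unequal non-empty piles."""
    result = []
    for i, p in enumerate(piles):
        for a in range(1, (p - 1) // 2 + 1):   # a < p - a, both piles non-empty
            result.append(tuple(sorted(piles[:i] + (a, p - a) + piles[i + 1:])))
    return result

def value(piles, max_to_move):
    """Back up 0/1 utilities through the complete game tree."""
    children = splits(piles)
    if not children:
        # The player to move has lost; utility 1 means a win for MAX.
        return 0 if max_to_move else 1
    child_values = [value(c, not max_to_move) for c in children]
    return max(child_values) if max_to_move else min(child_values)

# MIN plays first from the single pile of 7 tokens.
print(value((7,), max_to_move=False))   # → 1, i.e. MAX wins with perfect play
```

The full tree here has only a few dozen positions, so brute-force minimax terminates instantly; that is what "solving" the game means on the earlier slide.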
Minimax to a Fixed Ply-Depth • Usually not possible to expand a game to end-game status • have to choose a ply-depth that is achievable with reasonable time and resources • absolute ‘win-lose’ values become heuristic scores • heuristics are devised according to knowledge of the game
Evaluation Functions • e.g. Chess, a possible heuristic: • A weighted sum of lots of different factors, such as: • piece capture (with different values for each piece) • position of king • freedom of movement in attack ... • h' = A * total piece value captured + B * some eval. of king position + C * some other function ... etc.
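The weighted sum h' can be sketched directly. The piece values follow the conventional chess scale, but the weights A, B, C and the king-safety/mobility features are illustrative assumptions, not tuned values.

```python
# Conventional material values for chess pieces (pawn=1 ... queen=9).
PIECE_VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9}

def material(captured):
    """Total value of opponent pieces captured so far."""
    return sum(PIECE_VALUES[p] for p in captured)

def evaluate(captured, king_safety, mobility, A=1.0, B=0.5, C=0.1):
    # h' = A * total piece value captured
    #    + B * some evaluation of king position
    #    + C * some measure of freedom of movement, etc.
    return A * material(captured) + B * king_safety + C * mobility

# A pawn and a knight captured, a modest king-safety score, 12 legal moves.
print(evaluate(['P', 'N'], king_safety=2.0, mobility=12))   # → 6.2
```

In a depth-limited minimax this heuristic score replaces the absolute win/lose utility at the ply cutoff, as the previous slide describes.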
Quiescence Search • Always searching to a limited depth leaves the program blind to states just beyond this boundary • Might choose a path on the basis that it terminates in a good heuristic score ... but the next move could be catastrophic • For example, might choose a path based on the possibility of taking a bishop; the move after this could be the loss of your queen, and having started down this path could mean losing the game • Overcome by quiescence search: at unstable positions (e.g. with favorable captures pending), extend the search depth until a quiescent position is reached before applying the evaluation function
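The bishop/queen example can be sketched with a toy quiescence routine: at the depth cutoff, instead of trusting the static evaluation immediately, keep searching "noisy" moves (here, captures) until a quiet position is reached. The node encoding, the stand-pat rule, and the helper names are all illustrative assumptions.

```python
# Toy node encoding: (static_evaluation, [positions reachable by captures]).
def evaluate(node):
    return node[0]

def noisy_children(node):
    return node[1]

def quiescence(node, maximizing):
    """Extend the search past the cutoff while captures are pending."""
    stand_pat = evaluate(node)
    children = noisy_children(node)
    if not children:                     # quiescent position: safe to evaluate
        return stand_pat
    values = [quiescence(c, not maximizing) for c in children]
    best = max(values) if maximizing else min(values)
    # The side to move may also decline the captures and "stand pat".
    return max(stand_pat, best) if maximizing else min(stand_pat, best)

# After we take the bishop the static score looks like +3, but the opponent
# (the minimizer, to move) has a pending recapture of our queen worth -9.
after_capture = (3, [(-9, [])])
print(quiescence(after_capture, maximizing=False))   # → -9, not the naive +3
```

A plain fixed-depth cutoff would have returned +3 here; extending through the capture sequence exposes the true cost of the line.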
Horizon Effect • When inevitable bad moves are put off beyond the cutoff on the depth of lookahead, e.g. stalling moves that push an unfavorable outcome "over the search horizon" to a place where it cannot be detected • The unavoidable damaging move is thus not dealt with • Mitigated by singular extensions, where one "clearly better" move is searched beyond the normal depth limit at little extra cost
Alpha-Beta Pruning • Fixed-depth Minimax searches entire space down to a certain level, in a breadth-first fashion. Then backs values up. Some of this is wasteful search, as we shall see here • Alpha-Beta pruning identifies paths which need not be explored any further
Alpha-Beta Pruning • Traverse the search tree in depth-first order • At each MAX node n, alpha(n) = maximum value found so far • At each MIN node n, beta(n) = minimum value found so far • Note: The alpha values start at -infinity and only increase, while beta values start at +infinity and only decrease. • Beta cutoff: Given a MAX node n, cut off the search below n (i.e., don’t generate or examine any more of n’s children) if alpha(n) >= beta(i) for some MIN node ancestor i of n. • Alpha cutoff: stop searching below MIN node n if beta(n) <= alpha(i) for some MAX node ancestor i of n.
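The cutoff rules above can be sketched as a depth-first recursion over an explicit tree (the tree literal is illustrative): alpha starts at -infinity and only increases, beta starts at +infinity and only decreases, and search below a node stops as soon as alpha >= beta.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    if isinstance(node, int):              # terminal position
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:              # beta cutoff: a MIN ancestor avoids this line
                break
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if beta <= alpha:              # alpha cutoff: a MAX ancestor avoids this line
                break
        return value

# MAX at the root; the result matches plain minimax but examines fewer nodes.
tree = [[[6, 5], [8]], [[2, 1], [3, 0]]]
print(alphabeta(tree, True))   # → 6
```

On this tree the right MIN subtree is abandoned as soon as its value falls to 2, below the alpha of 6 already established on the left, which is exactly the alpha cutoff described above.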
Alpha-Beta Pruning - Example • [Figures: a step-by-step alpha-beta trace on a three-ply tree with MAX root A, MIN nodes B-E, and MAX nodes H-M above terminal values 6, 5, 8 (left) and 2, 1 (right). The left subtree backs up 6 to the root; a beta cutoff prunes below the node already worth >= 8, and an alpha cutoff abandons the right subtree once its MIN value reaches 2, leaving the root value 6. Legend: agent (MAX), opponent (MIN)]
Summary • Game playing • Chess • Checkers • Both minimax and alpha-beta pruning assume perfect play from the opposition • Increase in processing power will not always make exhaustive search of game tree possible – need pruning techniques to increase the search depth with the same computing resources
Disclaimer • Most of the lecture slides are adapted from the same module taught at the Nottingham, UK campus by Dr. Graham Kendall