Understanding Minimax and Alpha-Beta Pruning in Game Theory and AI

Game Playing

Outline • Perfect Play • Resource Limits • Alpha-Beta pruning • Games of Chance

Games vs Search problems • Unpredictable opponent: the solution is a contingency plan • Time limits: only an approximate solution is feasible • Approaches: algorithm for perfect play; finite horizon with approximate evaluation; pruning to reduce costs

Types of Games

Tic-Tac-Toe Example

Structure of a Game • Initial State: starting configuration and whose move it is • Operators: legal moves • Terminal Test: checks if the game is over • Utility Function: evaluates who has won and by how much
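The four components on this slide (initial state, operators, terminal test, utility function) can be sketched for the Tic-Tac-Toe example. This is a minimal illustration, not code from the slides: the board encoding as a 9-tuple, the function names, and the +1/0/-1 utility scale are all assumptions made here.

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]  # assumed encoding: 9 cells, each "X", "O" or " "

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals


def initial_state() -> Tuple[Board, str]:
    """Initial state: empty board, and X (taken to be MAX) moves first."""
    return (" ",) * 9, "X"


def operators(board: Board) -> List[int]:
    """Operators: the legal moves are the indices of the empty cells."""
    return [i for i, cell in enumerate(board) if cell == " "]


def apply_move(board: Board, player: str, move: int) -> Tuple[Board, str]:
    """Place the player's mark and hand the turn to the other player."""
    new_board = board[:move] + (player,) + board[move + 1:]
    return new_board, ("O" if player == "X" else "X")


def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player has three in a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def terminal_test(board: Board) -> bool:
    """Terminal test: someone has won, or no empty cell remains."""
    return winner(board) is not None or not operators(board)


def utility(board: Board) -> int:
    """Utility from X's point of view: +1 X wins, -1 O wins, 0 otherwise."""
    return {"X": 1, "O": -1}.get(winner(board), 0)
```

Keeping the state immutable (a tuple) means a search can branch on a position without undoing moves afterwards.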

Minimax

Minimax Algorithm

function MINIMAX-DECISION(game) returns an operator
    for each op in OPERATORS[game] do
        VALUE[op] <- MINIMAX-VALUE(APPLY(op, game), game)
    end
    return the op with the highest VALUE[op]

function MINIMAX-VALUE(state, game) returns a utility value
    if TERMINAL-TEST[game](state) then
        return UTILITY[game](state)
    else if MAX is to move in state then
        return the highest MINIMAX-VALUE of SUCCESSORS(state)
    else
        return the lowest MINIMAX-VALUE of SUCCESSORS(state)
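The minimax pseudocode can be tried out on a small explicit tree. The sketch below is an assumption-laden illustration: the game tree is encoded as nested Python lists with integer leaves standing for utilities, and the leaf values are made up for the example rather than taken from the slides.

```python
from typing import List, Union

# Assumed encoding: a leaf is an int utility, an inner node is a list of children.
Tree = Union[int, List["Tree"]]


def minimax_value(node: Tree, max_to_move: bool) -> int:
    """Return the minimax value of a node (plays the role of MINIMAX-VALUE)."""
    if isinstance(node, int):          # terminal test: leaves carry the utility
        return node
    values = [minimax_value(child, not max_to_move) for child in node]
    return max(values) if max_to_move else min(values)


def minimax_decision(root: List[Tree]) -> int:
    """Return the index of MAX's best move at the root (MINIMAX-DECISION)."""
    values = [minimax_value(child, max_to_move=False) for child in root]
    return max(range(len(values)), key=values.__getitem__)


# Two-ply example: MAX chooses one of three MIN nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_value(tree, max_to_move=True))  # 3
print(minimax_decision(tree))                 # 0, the first move
```

MIN reduces the three subtrees to 3, 2 and 2, so MAX picks the first move with value 3.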

Properties of MiniMax • Complete? Yes (if tree finite) • Optimal? Yes, against an optimal opponent • Time Complexity? O(b^m) • Space Complexity? O(bm) (depth-first exploration)

Resource Limits • Time complexity means that, under time limits, the full game tree cannot be searched, so the choice of solution is limited • Approach: cutoff test (depth limit) plus evaluation function (estimate of desirability of position)

Evaluation Functions • Normally a weighted linear sum of features • Eval(s) = w_1 f_1(s) + w_2 f_2(s) + ... + w_n f_n(s)
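A weighted linear evaluation function is short to write down. In the sketch below the features (piece-count differences) and the weights 1, 3, 9 are hypothetical choices for illustration only; the slides do not specify any particular features or weights.

```python
from typing import Callable, Dict, Sequence

State = Dict[str, int]            # assumed: a state is a dict of raw counts
Feature = Callable[[State], float]


def make_eval(weights: Sequence[float],
              features: Sequence[Feature]) -> Callable[[State], float]:
    """Build Eval(s) = w_1*f_1(s) + ... + w_n*f_n(s)."""
    if len(weights) != len(features):
        raise ValueError("need exactly one weight per feature")

    def evaluate(state: State) -> float:
        return sum(w * f(state) for w, f in zip(weights, features))

    return evaluate


# Hypothetical chess-like features: own piece count minus opponent's.
features = [
    lambda s: s["pawn_diff"],
    lambda s: s["knight_diff"],
    lambda s: s["queen_diff"],
]
weights = [1, 3, 9]               # illustrative material weights

evaluate = make_eval(weights, features)
state = {"pawn_diff": 2, "knight_diff": -1, "queen_diff": 0}
print(evaluate(state))            # 1*2 + 3*(-1) + 9*0 = -1
```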

Cutting off search • MinimaxCutoff is identical to MinimaxValue except • TERMINAL? is replaced by CUTOFF? • UTILITY is replaced by EVAL • In practice, if b^m = 10^6 and b = 35, then m is about 4 • 4-ply: human novice • 8-ply: typical PC or human master • 12-ply: Deep Blue, grand master
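The two substitutions (a cutoff test for the terminal test, EVAL for UTILITY) can be sketched on a toy game. Everything concrete here is an assumption for illustration: the game (take 1 or 2 counters from a pile, whoever takes the last counter wins), the function names, and the neutral evaluation that returns 0 for any unfinished position.

```python
def evaluate(pile: int, max_to_move: bool) -> int:
    """Hypothetical EVAL: no knowledge about unfinished positions, so 0."""
    return 0


def minimax_cutoff(pile: int, max_to_move: bool, depth: int) -> int:
    """Depth-limited minimax value from MAX's point of view."""
    if pile == 0:
        # True terminal state: the player who just moved took the last counter.
        return -1 if max_to_move else 1
    if depth == 0:
        # Cutoff test: depth limit reached, fall back on the estimate.
        return evaluate(pile, max_to_move)
    values = [minimax_cutoff(pile - take, not max_to_move, depth - 1)
              for take in (1, 2) if take <= pile]
    return max(values) if max_to_move else min(values)


print(minimax_cutoff(4, True, 10))  # 1: deep enough to prove a win for MAX
print(minimax_cutoff(3, True, 10))  # -1: MAX to move from 3 loses
print(minimax_cutoff(4, True, 1))   # 0: cut off before any terminal state
```

With a deep enough limit the result matches full minimax; with a shallow limit the answer is only as good as the evaluation function.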

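Alpha-Beta pruning Example • [Figure: two-ply game tree. MAX root with value 3 and moves A1, A2, A3, each leading to a MIN node; leaves reached by A11 to A13, A21 to A23 and A31 to A33. Bounds <=2, <=14 and <=5 are shown at MIN nodes during the search; branches marked X are pruned.]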

Properties of Alpha-Beta • Pruning does not affect the final result • Good move ordering improves the efficiency of pruning • With “perfect ordering”, time complexity is O(b^(m/2)) • this doubles the depth of search • In practice, time complexity is O(b^(3m/4))
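The effect of these bounds on search depth can be checked with a little arithmetic, reading them as b^m for plain minimax, b^(m/2) for perfect ordering and b^(3m/4) in practice. The sketch below reuses the budget of 10^6 nodes and b = 35 from the cutoff slide; the helper name and its `exponent_scale` parameter are assumptions introduced here.

```python
import math


def reachable_depth(node_budget: float, branching: float,
                    exponent_scale: float = 1.0) -> float:
    """Solve branching ** (exponent_scale * m) == node_budget for m."""
    return math.log(node_budget) / (exponent_scale * math.log(branching))


budget, b = 1e6, 35
plain = reachable_depth(budget, b)            # minimax, O(b^m)
perfect = reachable_depth(budget, b, 0.5)     # perfect ordering, O(b^(m/2))
practical = reachable_depth(budget, b, 0.75)  # in practice, O(b^(3m/4))

# Roughly 3.9, 7.8 and 5.2 ply respectively.
print(f"{plain:.1f} {perfect:.1f} {practical:.1f}")
```

Halving the exponent is exactly what doubles the reachable depth for a fixed node budget.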

Alpha-Beta Algorithm

function MAX-VALUE(state, game, alpha, beta) returns the minimax value of a state
    inputs: state, current state in game
            game, game description
            alpha, the best score for MAX along the path to state
            beta, the best score for MIN along the path to state
    if CUT-OFF(state) then return EVAL(state)
    for each s in SUCCESSORS(state) do
        alpha <- MAX(alpha, MIN-VALUE(s, game, alpha, beta))
        if alpha >= beta then return beta
    end
    return alpha

Alpha-Beta cont.

function MIN-VALUE(state, game, alpha, beta) returns the minimax value of a state
    if CUT-OFF(state) then return EVAL(state)
    for each s in SUCCESSORS(state) do
        beta <- MIN(beta, MAX-VALUE(s, game, alpha, beta))
        if alpha >= beta then return alpha
    end
    return beta
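MAX-VALUE and MIN-VALUE can be folded into one Python function for experimentation. This is a sketch under assumptions: the tree is a nested list with integer leaves (made-up values), leaves play the role of the cutoff test, and the `visited` list is an addition used only to show which leaves were actually evaluated.

```python
import math
from typing import List, Union

# Assumed encoding: a leaf is an int, an inner node is a list of children.
Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, alpha: float, beta: float,
              max_to_move: bool, visited: List[int]) -> float:
    """Alpha-beta value of node; records every evaluated leaf in visited."""
    if isinstance(node, int):              # cutoff: leaves return their value
        visited.append(node)
        return node
    if max_to_move:                        # corresponds to MAX-VALUE
        for child in node:
            alpha = max(alpha, alphabeta(child, alpha, beta, False, visited))
            if alpha >= beta:
                return beta                # MIN would never allow this line
        return alpha
    for child in node:                     # corresponds to MIN-VALUE
        beta = min(beta, alphabeta(child, alpha, beta, True, visited))
        if alpha >= beta:
            return alpha                   # MAX already has something better
    return beta


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited: List[int] = []
value = alphabeta(tree, -math.inf, math.inf, True, visited)
print(value)    # 3, the same value plain minimax gives
print(visited)  # [3, 12, 8, 2, 14, 5, 2]: leaves 4 and 6 were pruned
```

In this run the second MIN node is abandoned after its first leaf, because a value of at most 2 cannot beat the 3 that MAX already has.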

Deterministic Games • Checkers: Chinook, 1994 • Chess: Deep Blue, 1997 • Othello: computers too good • Go: computers too bad

Non-deterministic Games • Chance adds difficulty: dice roll, deal of cards, flip of coin • ExpectiMax: like MiniMax, with additional chance nodes

Backgammon

ExpectiMiniMax • expectimax(C) = sum_i p(d_i) * max_s utility(s) • C: chance node • d_i: dice roll • p(d_i): probability of roll occurring • max_s utility(s): max utility possible after dice roll • Time Complexity O(b^m n^m) makes problems even harder to solve
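The value of a chance node, read as the probability-weighted sum over outcomes of the best utility reachable after each outcome, can be computed directly. The sketch below is illustrative: the data layout (a list of probability and utility-list pairs), the function name, and the coin-flip numbers are all assumptions, not taken from the slides.

```python
from fractions import Fraction
from typing import List, Tuple

# Assumed layout: one (probability, utilities of reachable states) pair per outcome.
Outcome = Tuple[Fraction, List[int]]


def expectimax_chance(outcomes: List[Outcome]) -> Fraction:
    """Return sum_i p(d_i) * max_s utility(s) for one chance node."""
    if sum(p for p, _ in outcomes) != 1:
        raise ValueError("outcome probabilities must sum to 1")
    return sum((p * max(utilities) for p, utilities in outcomes), Fraction(0))


# Hypothetical coin flip: after heads MAX can reach utilities 3 or 5,
# after tails only 1 or 2.
coin = [(Fraction(1, 2), [3, 5]), (Fraction(1, 2), [1, 2])]
print(expectimax_chance(coin))  # 7/2, i.e. 1/2 * 5 + 1/2 * 2
```

Exact fractions are used so the probabilities can be checked to sum to 1 without floating-point tolerance.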
