Minimax search algorithm

For now, we assume that exhaustive search is possible.

  • Generate the entire game tree. Assume it has depth d.

  • For each terminal state, apply the payoff function to get its score.

  • Back up the scores at level d to assign a score to each node at level d-1. If the node at level d-1 is MAX's move, select the best (highest) score among its children. If it is MIN's move, select the worst (lowest) score among its children.

  • Continue backing up the scores all the way up the tree, until the root node chooses the maximum score among its children. This is the minimax decision, which determines the best move to make.
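
The back-up procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-list tree encoding, the function names `minimax_value` and `minimax_decision`, and the sample payoffs are all invented for the example and do not come from the slides.

```python
def minimax_value(node, is_max):
    """Back up payoffs through a fully generated game tree.

    A node is either a number (terminal payoff) or a list of child nodes.
    This encoding is a hypothetical choice for illustration.
    """
    if not isinstance(node, list):
        return node  # terminal state: payoff already known
    child_values = [minimax_value(child, not is_max) for child in node]
    return max(child_values) if is_max else min(child_values)


def minimax_decision(root):
    """Return (index of best move, backed-up value) for MAX at the root."""
    values = [minimax_value(child, False) for child in root]
    best = max(range(len(values)), key=values.__getitem__)
    return best, values[best]


# Hypothetical depth-2 tree: MAX at the root, MIN below, payoffs at the leaves.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
best_move, value = minimax_decision(tree)
```

Here the three MIN nodes back up 3, 2 and 2, so MAX picks the first move.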


The Evaluation Function

  • If we do not reach the end of the game, how do we evaluate the payoff of the leaf states?

  • Use a static evaluation function.

    • A heuristic function that estimates the utility of board positions.

    • Desirable properties

      • Must agree with the utility function

      • Must not take too long to evaluate

      • Must accurately reflect the chance of winning

  • An ideal evaluation function can be applied directly to the board position.

  • It is better to apply it as many levels down in the game tree as time permits


Evaluation Function for Chess

  • Relative material value

    • Pawn = 1, knight = 3, bishop = 3, rook = 5, queen = 9

  • Good pawn structure

  • King safety
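
A minimal sketch of how these features might be combined into a static evaluation function. The material values are the ones listed above; everything else (the string encoding of a position, the function names, and the 0.5 feature weights) is a made-up assumption for illustration, not a tuned chess evaluator.

```python
# Material values from the slide; kings are not counted.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def material_score(pieces):
    """Sum piece values: uppercase letters are MAX's pieces, lowercase MIN's."""
    score = 0
    for piece in pieces:
        value = PIECE_VALUES.get(piece.upper(), 0)
        score += value if piece.isupper() else -value
    return score


def evaluate(pieces, pawn_structure=0.0, king_safety=0.0,
             w_pawn=0.5, w_king=0.5):
    """Weighted sum of features; the weights are placeholder assumptions."""
    return (material_score(pieces)
            + w_pawn * pawn_structure
            + w_king * king_safety)


# MAX has a queen and two pawns (11); MIN has rook, knight and pawn (9).
position = "QPPKrnpk"
score = evaluate(position)
```

A weighted linear sum like this is cheap to compute, which matches the requirement that the evaluation must not take too long.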


Revised Minimax Algorithm

For the MAX player

  • Generate the game tree as deep as time permits

  • Apply the evaluation function to the leaf states

  • Back-up values

    • At MIN ply assign minimum payoff move

    • At MAX ply assign maximum payoff move

  • At root, MAX chooses the operator that led to the highest payoff


Minimax Procedure

  • minimax(board, depth, type)

    • If depth = 0 return Eval-Fn(board)

    • else if type = max

      • cur-max = -inf

      • loop for b in succ(board)

        • b-val = minimax(b,depth-1,min)

        • cur-max = max(b-val,cur-max)

      • return cur-max

    • else (type = min)

      • cur-min = inf

      • loop for b in succ(board)

        • b-val = minimax(b,depth-1,max)

        • cur-min = min(b-val,cur-min)

      • return cur-min
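
The procedure above translates almost line for line into Python. The sketch below is one possible rendering, not the presentation's own code: the `succ` and `eval_fn` callables are hypothetical interfaces, the extra check for positions with no successors is an addition, and the toy "halving" game exists only to give the function something to search.

```python
import math


def minimax(board, depth, is_max, succ, eval_fn):
    """Depth-limited minimax mirroring the pseudocode above.

    `succ(board)` returns the successor boards and `eval_fn(board)` the
    static evaluation; both are caller-supplied (hypothetical interfaces).
    """
    children = succ(board)
    if depth == 0 or not children:  # also stop at positions with no moves
        return eval_fn(board)
    if is_max:
        cur_max = -math.inf
        for b in children:
            cur_max = max(cur_max, minimax(b, depth - 1, False, succ, eval_fn))
        return cur_max
    cur_min = math.inf
    for b in children:
        cur_min = min(cur_min, minimax(b, depth - 1, True, succ, eval_fn))
    return cur_min


# Toy game: a board is a tuple of values; a move keeps the left or right half.
def halves(board):
    if len(board) == 1:
        return []
    mid = len(board) // 2
    return [board[:mid], board[mid:]]


def first_value(board):
    return board[0]


leaves = (10, 9, 14, 13, 2, 1, 3, 24)
root_value = minimax(leaves, 3, True, halves, first_value)
```

With these eight values at the bottom level and max, min, max levels above them, the root value is 10.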


Minimax

[Figure: example game tree with alternating max, min, max, min levels. Backed-up values by level, left to right:]

  • max (root): 10

  • min: 10, 2

  • max: 10, 14, 2, 24

  • min: 10, 9, 14, 13, 2, 1, 3, 24


Note: exact values do not matter


Problems with fixed depth search

Most interesting games cannot be searched exhaustively, so a fixed depth cutoff must be applied. But this can cause problems ...

Quiescence: If you arbitrarily apply the evaluation function at a fixed depth, you might miss a huge swing that is about to happen. The evaluation function should only be applied to quiescent (stable) positions. (Requires game knowledge!)

The Horizon Effect: Search has to stop somewhere, but a huge change might be lurking just over the horizon. There is no general fix but heuristics can sometimes help.


Pruning

  • Suppose your program can search 1000 positions/second.

  • In chess, you get roughly 150 seconds per move so you can search 150,000 positions.

  • Since chess has a branching factor of about 35, your program can only search 3-4 ply!

  • An average human plans 6-8 moves ahead so your program will act like a novice.

  • Fortunately, we can often avoid searching parts of the game tree by keeping track of the best and worst alternatives at each point. This is called pruning the search tree.
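
The depth estimate in this list can be checked with a short calculation. The sketch below assumes a uniform tree in which depth d costs about b^d positions and ignores interior nodes; the variable names are illustrative.

```python
import math

positions_per_second = 1000
seconds_per_move = 150
branching_factor = 35

budget = positions_per_second * seconds_per_move  # 150,000 positions

# A full tree of depth d has about b**d leaves, so solve b**d = budget for d.
depth = math.log(budget) / math.log(branching_factor)

full_plies = math.floor(depth)  # deepest ply that fits entirely in the budget
```

Since 35^3 = 42,875 and 35^4 = 1,500,625, the budget of 150,000 positions falls between three and four ply.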


Alpha-Beta Pruning

  • Alpha-beta pruning is used on top of minimax search to detect paths that do not need to be explored. The intuition is:

  • The MAX player is always trying to maximize the score. Call this α.

  • The MIN player is always trying to minimize the score. Call this β.

  • When a MIN node has β <= the α of its MAX ancestors, then this path will never be taken. (MAX has a better option.) This is called an α-cutoff.

  • When a MAX node has α >= the β of its MIN ancestors, then this path will never be taken. (MIN has a better option.) This is called a β-cutoff.


Bounding Search

The minimax procedure explores every path of length depth. Can we do less work?

[Figure: MAX root A with MIN children B, C, D. B has leaves E, F, G; C has leaves H, I; D has leaves J, K, L. Successive frames fill in the values as follows:]

  • E (3), F (12), G (8) give B (3), so A is at least 3.

  • H (-5) gives C (<= -5); I is left unevaluated.

  • J (15), K (5), L (2) give D (2), so A (3).

2. α-β pruning: search cutoff

  • Pruning: eliminating a branch of the search tree from consideration without exhaustive examination of each node

  • α-β pruning: the basic idea is to prune portions of the search tree that cannot improve the utility value of the max or min node, by just considering the values of nodes seen so far.

  • Does it work? Yes, with good move ordering it roughly cuts the effective branching factor from b to √b, allowing a look-ahead about twice as deep as pure minimax.


α-β pruning: example

[Figure: MAX root with three MIN children. Successive frames show the following values:]

  • First MIN child: leaves 6, 12, 8 give value 6, so the MAX root is ≥ 6.

  • Second MIN child: its first leaf is 2, so its value is ≤ 2.

  • Third MIN child: its first leaf is 5, so its value is ≤ 5.

  • Selected move: the first MIN child, with value 6.

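α-β pruning: general principle

[Figure: a path through alternating Player and Opponent levels. Node m is a Player choice higher in the tree; node n, with value v, lies deeper on the same path.]

If α > v then MAX will choose m, so prune the tree under n.

Similarly for β for MIN.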


Properties of Alpha-Beta Pruning

  • Alpha-beta pruning is guaranteed to find the same best move as the minimax algorithm by itself, but can drastically reduce the number of nodes that need to be explored.

  • The order in which successors are explored can make a dramatic difference!

  • In the optimal situation, alpha-beta pruning only needs to explore O(b^(d/2)) nodes.

  • Minimax search explores O(b^d) nodes, so alpha-beta pruning can afford to double the search depth!

  • If successors are explored randomly, alpha-beta explores about O(b^(3d/4)) nodes.

  • In practice, heuristics often allow performance to be closer to the best-case scenario.
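
To get a feel for these growth rates, the following sketch plugs numbers into the three bounds. It treats each bound as an exact count with constant factor 1, which is an assumption made purely for comparison; b = 35 and d = 8 are arbitrary example values.

```python
def nodes(b, d, exponent_factor=1.0):
    """Approximate node count b**(exponent_factor * d), ignoring constants."""
    return b ** (exponent_factor * d)


b, d = 35, 8  # chess-like branching factor; the depth is an arbitrary choice

plain_minimax = nodes(b, d)        # O(b^d)
random_order = nodes(b, d, 0.75)   # O(b^(3d/4))
best_case = nodes(b, d, 0.5)       # O(b^(d/2))

# In the best case, searching to depth 2d costs what minimax pays for depth d.
doubled_depth_best_case = nodes(b, 2 * d, 0.5)
```

The last line is the sense in which alpha-beta "can afford to double the search depth".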


AlphaBeta Pruning Algorithm

procedure alpha-beta-max(node, α, β)

if leaf node(node) then return evaluation(node);

foreach (successor s of node)

    α := max(α, alpha-beta-min(s, α, β));

    if α >= β then return β;

return α;

procedure alpha-beta-min(node, α, β)

if leaf node(node) then return evaluation(node);

foreach (successor s of node)

    β := min(β, alpha-beta-max(s, α, β));

    if β <= α then return α;

return β;

To begin, we invoke: alpha-beta-max(node, -∞, +∞)


Alpha-beta pruning

  • Pruning does not affect final result

  • Alpha-beta pruning

    • Asymptotic time complexity

      • O((b/log b)^d)

    • With “perfect ordering,” time complexity

      • O(b^(d/2))

      • means we go from an effective branching factor of b to sqrt(b) (e.g. 35 -> 6).


α-β Procedure

  • minimax-α-β(board, depth, type, α, β)

    • If depth = 0 return Eval-Fn(board)

    • else if type = max

      • cur-max = -inf

      • loop for b in succ(board)

        • b-val = minimax-α-β(b, depth-1, min, α, β)

        • cur-max = max(b-val, cur-max)

        • α = max(cur-max, α)

        • if cur-max >= β finish loop

      • return cur-max

    • else type = min

      • cur-min = inf

      • loop for b in succ(board)

        • b-val = minimax-α-β(b, depth-1, max, α, β)

        • cur-min = min(b-val, cur-min)

        • β = min(cur-min, β)

        • if cur-min <= α finish loop

      • return cur-min
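
A runnable sketch of the same procedure, under some assumptions: the tree is an explicit nested list rather than a board with a depth limit, the `visited` list is an addition that records which leaves are actually evaluated, and the value 0 for leaf I is a made-up placeholder because the Bounding Search slides never assign it one.

```python
import math


def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, visited=None):
    """Alpha-beta over a nested-list tree; leaves are numbers.

    `visited`, if given, collects each leaf that is actually evaluated.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if is_max:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if best >= beta:  # MIN above will never allow this node
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if best <= alpha:  # MAX above already has a better option
            break
    return best


# Tree from the Bounding Search slides: A -> B, C, D.
# Leaf I (second child of C) gets the placeholder value 0.
tree = [[3, 12, 8], [-5, 0], [15, 5, 2]]
seen = []
value = alphabeta(tree, True, visited=seen)
```

The root value is 3, and `seen` contains seven of the eight leaves: once H returns -5, the search under C stops without looking at I.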


α-β Pruning Example

[Figure: α-β pruning on a four-level tree with levels max, min, max, min. Values shown by level: max 10; min 10, 4; max 4, 10, 14; min 10, 2, 9, 14, 4.]

Now, you do it!

[Figure: exercise tree with levels Max, Min, Max, Min]


Move Ordering Heuristics

Good move ordering improves effectiveness of pruning

[Figure: two copies of the tree with MAX root A and MIN children B, C, D.]

  • Original Ordering: A (3); B (3) with E (3), F (12), G (8); C (<= -5) with H (-5), I; D (2) with J (15), K (5), L (2).

  • Better Ordering: A (3); B (3) with E (3), F (12), G (8); C (<= -5) with H (-5), I; D (<= 2) with L (2), K (5), J (15).
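
The effect of the two orderings can be measured by counting evaluated leaves. The sketch below is an illustration under assumptions: it uses a compact alpha-beta over nested lists, the helper names are invented, and leaf I is given the placeholder value 0 because the slides never evaluate it.

```python
import math


def alphabeta(node, is_max, alpha, beta, counter):
    """Alpha-beta on nested lists; counter[0] counts evaluated leaves."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    best = -math.inf if is_max else math.inf
    for child in node:
        value = alphabeta(child, not is_max, alpha, beta, counter)
        if is_max:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:  # remaining children cannot change the result
            break
    return best


def search(tree):
    """Return (root value, number of leaves evaluated)."""
    counter = [0]
    value = alphabeta(tree, True, -math.inf, math.inf, counter)
    return value, counter[0]


# Leaf I is never evaluated in the slides; 0 is a placeholder value.
original = [[3, 12, 8], [-5, 0], [15, 5, 2]]   # D's children as J, K, L
better = [[3, 12, 8], [-5, 0], [2, 5, 15]]     # D's children as L, K, J

original_result = search(original)
better_result = search(better)
```

Both searches return 3 at the root, but the original ordering evaluates 7 leaves and the better ordering only 5, because examining L first immediately shows that D cannot beat B.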


Using Book Moves

  • Use catalogue of “solved” positions to extract the correct move.

  • For complicated games, such catalogues are not available for all positions

  • Often, sections of the game are well-understood and catalogued

    • E.g. openings and endings in chess

  • Combine knowledge (book moves) with search (minimax) to produce better results.


  • http://www.gametheory.net/applets/


Games with Chance

  • How to include chance: add chance nodes to the game tree


Decision Making in Game of Chance

  • Chance nodes

    • Branches leading from each chance node denote the possible dice rolls

    • Labeled with the roll and the chance that it will occur

  • Replace MAX/MIN nodes in minimax with expected MAX/MIN payoff

    • Expectimax value of C

    • Expectimin value
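
One way to read these bullets: the value of a chance node is the probability-weighted average of the values of its children, while MAX and MIN nodes behave as in ordinary minimax. The sketch below implements that reading; the tuple-based node encoding, the function name, and the sample tree are assumptions made for illustration.

```python
def expectiminimax(node):
    """Value of a game tree with MAX, MIN and chance nodes.

    Node encoding (a hypothetical choice for this sketch):
      number                          -> leaf payoff
      ("max", [child, ...])           -> MAX to move
      ("min", [child, ...])           -> MIN to move
      ("chance", [(prob, child), ...]) -> random event with given probabilities
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind!r}")


# MAX picks a move, then a fair coin decides which MIN node is reached.
tree = ("max", [
    ("chance", [(0.5, ("min", [2, 4])), (0.5, ("min", [6, 8]))]),
    ("chance", [(0.5, ("min", [0, 10])), (0.5, ("min", [5, 7]))]),
])
value = expectiminimax(tree)
```

The first move has expected value 0.5 × 2 + 0.5 × 6 = 4.0 and the second 0.5 × 0 + 0.5 × 5 = 2.5, so the root value is 4.0.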


Position evaluation in games with chance nodes

  • For minimax, any order-preserving transformation of the leaf values does not affect the choice of move

  • With chance nodes, some order-preserving transformations of the leaf values do affect the choice of move


Position evaluation in games with chance nodes (cont’d)

  • The behavior of the algorithm is preserved only under positive linear transformations of the evaluation function; other order-preserving transformations can change the chosen move.
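
A small numeric sketch of how leaf transformations interact with chance nodes. The position, its probabilities, and the helper names are invented for illustration. Because expectation is linear, a positive linear rescaling of the leaves leaves the preferred move unchanged, whereas a nonlinear order-preserving transformation (here, raising to the fourth power) can flip it.

```python
def best_move(moves):
    """Index of the move with the highest expected leaf value.

    Each move is a chance node given as a list of (probability, leaf) pairs.
    """
    expected = [sum(p * v for p, v in outcomes) for outcomes in moves]
    return max(range(len(expected)), key=expected.__getitem__)


def transform(moves, f):
    """Apply f to every leaf value, leaving probabilities unchanged."""
    return [[(p, f(v)) for p, v in outcomes] for outcomes in moves]


# Hypothetical position: two moves, each followed by a 0.9 / 0.1 chance event.
moves = [[(0.9, 2), (0.1, 3)],   # expected value 2.1
         [(0.9, 1), (0.1, 4)]]   # expected value 1.3

original_choice = best_move(moves)
linear_choice = best_move(transform(moves, lambda v: 10 * v + 7))
monotone_choice = best_move(transform(moves, lambda v: v ** 4))
```

The leaf ordering 1 < 2 < 3 < 4 is the same in all three cases, yet the fourth-power version prefers the second move (26.5 against 22.5).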


Another Example of expectimax


Complexity of expectiminimax

  • The expectiminimax considers all the possible dice-roll sequences

    • It takes O(b^m n^m)

      where n is the number of distinct rolls

    • Whereas, minimax takes O(b^m)

  • Problems

    • The extra cost compared to minimax is very high

    • Alpha-beta pruning is more difficult to apply


Games in real life


Context

Terrorists do a lot of different things.

The U.S. can try to anticipate many kinds of attacks and prepare defenses against them.

If the U.S. fails to invest wisely, then we lose important battles.


A Smallpox Exercise

The U.S. government is concerned about the possibility of smallpox bioterrorism.

Terrorists could make no smallpox attack, a small attack on a single city, or coordinated attacks on multiple cities (or do other things).


The U.S. has four defense strategies:

  • Stockpiling vaccine

  • Stockpiling and increasing bio-surveillance

  • Stockpiling and inoculating first responders and/or key personnel

  • Inoculating all consenting people with healthy immune systems.


Using Game Theory to make a decision

  • Classical game theory uses a matrix of costs to determine optimal play.

  • Optimal play is usually defined as a minimax strategy, but sometimes one can minimize expected loss instead.

  • Both methods are unreliable guides to human behavior.


Game Theory Matrix


Minimax Strategy

The U.S. should choose the defense with smallest row-wise max cost.

The terrorist should choose the attack with largest column-wise min cost.

If these are not equal then a randomized strategy is better.
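
The two rules above can be sketched on a small cost matrix. The matrix below is entirely invented for illustration, since the transcript does not reproduce the actual game theory matrix; rows stand for defenses and columns for attacks.

```python
# Hypothetical cost matrix (rows: U.S. defenses, columns: attacks).
# The numbers are invented; they are not the costs used in the slides.
costs = [
    [2, 6, 9],   # defense 0
    [3, 5, 7],   # defense 1
    [8, 8, 8],   # defense 2
]

# Defender: smallest row-wise maximum cost.
row_max = [max(row) for row in costs]
best_defense = min(range(len(costs)), key=row_max.__getitem__)
minimax_cost = row_max[best_defense]

# Attacker: largest column-wise minimum cost.
col_min = [min(col) for col in zip(*costs)]
best_attack = max(range(len(col_min)), key=col_min.__getitem__)
maximin_cost = col_min[best_attack]

# Equal values mean a saddle point: neither side gains by randomizing.
has_saddle_point = minimax_cost == maximin_cost
```

In this made-up matrix both values equal 7, so defense 1 and attack 2 form a saddle point. When the two values differ, no pure strategy pair is stable, which is the case where randomizing helps.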


Extensive-form game theory invites decision theory criteria based upon minimum expected loss.

In our smallpox exercise, we shall implement this by assuming that the U.S. decisions are known to the terrorists, and that this affects their probabilities of using certain kinds of attacks.


Game Theory Critique

  • Game theory does not take account of resource limitations.

  • It assumes that both players have the same cost matrix.

  • It assumes both players act in synchrony (or in strict alternation).

  • It assumes all costs are measured without error.


Adding Risk Analysis

Statistical risk analysis makes probabilistic statements about specific kinds of threats.

It also treats the costs associated with threats as random variables. The total random cost is developed by analysis of component costs.


Cost Example

To illustrate a key idea, consider the problem of estimating the cost C11 in the game theory matrix. This is the cost associated with stockpiling vaccine when no smallpox attack occurs.

Some components of the cost are fixed, others are random.


C11 = cost to test diluted Dryvax
    + cost to test Aventis vaccine
    + cost to make 209 × 10^6 doses
    + cost to produce VIG
    + logistic/storage/device costs.


The other costs in the matrix are also random variables, and their distributions can be estimated in similar ways.

Note that different matrix costs are not independent; they often have components in common across rows and columns.


More examples…

Cost to treat one smallpox case; this is normal with mean $200,000 and s.d. $50,000.

Cost to inoculate 25,000 people; this is normal with mean $60,000 and s.d. $10,000.

Economic costs of a single attack; this is gamma with mean $5 billion and s.d. $10 billion.
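
A sketch of how such cost components might be sampled, using Python's standard `random` module. The means and standard deviations are the ones stated above; the function names, the seed, and the number of draws are assumptions. For a gamma distribution with shape k and scale θ, the mean is kθ and the variance is kθ², so k = (mean/s.d.)² and θ = s.d.²/mean.

```python
import random


def gamma_params(mean, sd):
    """Shape and scale of a gamma distribution with the given mean and s.d."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale


def sample_costs(rng):
    """Draw one value for each of the three cost components listed above."""
    treat_one_case = rng.gauss(200_000, 50_000)
    inoculate_25k = rng.gauss(60_000, 10_000)
    shape, scale = gamma_params(5e9, 10e9)
    attack_economic = rng.gammavariate(shape, scale)
    return treat_one_case, inoculate_25k, attack_economic


rng = random.Random(0)  # fixed seed so the sketch is repeatable
draws = [sample_costs(rng) for _ in range(1000)]
shape, scale = gamma_params(5e9, 10e9)
```

For the economic cost this gives shape 0.25 and scale $20 billion, a heavily right-skewed distribution: most draws fall well below the $5 billion mean and a few are far above it.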


Games + Risk

Game theory and statistical risk analysis can be combined to give arguably useful guidance in threat management.

We generate many random tables, according to the risk analysis, and find which defenses are best.


Minimum Expected Loss

The table in the lower right shows the elicited probabilities of each kind of attack given that the corresponding defense has been adopted.

These probabilities are used to weight the costs in calculating the expected loss.
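
The weighting step can be sketched as follows. Every number here is an invented placeholder, not an elicited probability or cost from the study, and only three of the defenses are shown; the point is the mechanics of comparing minimum expected loss with the minimax rule.

```python
# All numbers below are invented placeholders, not values from the slides.
# Rows: defenses; columns: attacks (none, single city, multiple cities).
costs = [
    [1.0, 10.0, 50.0],   # stockpile only
    [3.0, 8.0, 30.0],    # stockpile + inoculate key personnel
    [9.0, 9.5, 12.0],    # universal inoculation
]

# attack_probs[i][j]: probability of attack j given that defense i is adopted.
attack_probs = [
    [0.90, 0.08, 0.02],
    [0.92, 0.06, 0.02],
    [0.97, 0.02, 0.01],
]


def expected_losses(costs, probs):
    """Probability-weighted cost of each defense."""
    return [sum(p * c for p, c in zip(prob_row, cost_row))
            for cost_row, prob_row in zip(costs, probs)]


losses = expected_losses(costs, attack_probs)
min_expected_loss_defense = min(range(len(losses)), key=losses.__getitem__)
minimax_defense = min(range(len(costs)), key=lambda i: max(costs[i]))
```

With these placeholder numbers the two criteria disagree: expected loss is lowest for the first row, while the smallest worst-case cost belongs to the last row.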


Conclusions

For our rough risk analysis, minimax favors universal inoculation, while minimum expected loss favors stockpiling.

This accords with the public and federal thinking on threat preparedness.

