
Minimax search algorithm


For now, we assume that exhaustive search is possible.

- Generate the entire game tree. Assume it has depth d.
- For each terminal state, apply the payoff function to get its score.
- Back up the scores at level d to assign a score to each node at level d-1. If the node at level d-1 is MAX's move, select the best (highest) score among its children. If it is MIN's move, select the worst (lowest) score among its children.
- Back up the scores all the way up the tree, until the root node gets to choose the maximum score among its children. This is the minimax decision that determines the best move to make!

- If we do not reach the end of the game, how do we evaluate the payoff of the leaf states?
- Use a static evaluation function.
- A heuristic function that estimates the utility of board positions.
- Desirable properties
- Must agree with the utility function on terminal states
- Must not take too long to evaluate
- Must accurately reflect the chance of winning

- An ideal evaluation function could be applied directly to the current board position, with no search needed.
- In practice, it is better to apply it as many levels down in the game tree as time permits.

- Relative material value
- Pawn = 1, knight = 3, bishop = 3, rook = 5, queen = 9

- Good pawn structure
- King safety
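The material term of such an evaluation function can be sketched in a few lines. This is only an illustration: the flat string encoding of the pieces and the name `material_score` are assumptions made for the example, and pawn structure and king safety are left out.

```python
# Piece values from the list above. The king has no entry because it is
# never captured, so it contributes nothing to the material balance.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def material_score(pieces: str) -> int:
    """Material balance from White's point of view.

    `pieces` is a hypothetical flat encoding of the board: uppercase
    letters are White's pieces, lowercase letters are Black's, and any
    other character (including kings) scores zero.
    """
    score = 0
    for ch in pieces:
        value = PIECE_VALUES.get(ch.upper(), 0)
        score += value if ch.isupper() else -value
    return score
```

A positive result means White is ahead in material; a full evaluation function would add weighted terms for the positional features as well.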

For the MAX player

- Generate the game tree as deep as time permits
- Apply the evaluation function to the leaf states
- Back-up values
- At MIN ply assign minimum payoff move
- At MAX ply assign maximum payoff move

- At root, MAX chooses the operator that led to the highest payoff

- minimax(board, depth, type)
  - If depth = 0 return Eval-Fn(board)
  - else if type = max
    - cur-max = -inf
    - loop for b in succ(board)
      - b-val = minimax(b, depth-1, min)
      - cur-max = max(b-val, cur-max)
    - return cur-max
  - else (type = min)
    - cur-min = inf
    - loop for b in succ(board)
      - b-val = minimax(b, depth-1, max)
      - cur-min = min(b-val, cur-min)
    - return cur-min
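The pseudocode above might be written in Python roughly as follows. This is a sketch under stated assumptions: the `successors` and `evaluate` callables, and the toy tree encoded as nested lists, are introduced here for illustration and are not part of the original slides.

```python
def minimax(state, depth, maximizing, successors, evaluate):
    """Depth-limited minimax value of `state`.

    `successors(state)` yields child states and `evaluate(state)` scores
    a state; both are supplied by the caller (hypothetical interface).
    """
    children = list(successors(state))
    if depth == 0 or not children:
        return evaluate(state)
    values = (
        minimax(child, depth - 1, not maximizing, successors, evaluate)
        for child in children
    )
    return max(values) if maximizing else min(values)


# Toy game tree: inner lists are positions, numbers are leaf payoffs.
TREE = [[[10, 9], [14, 13]], [[2, 1], [3, 24]]]


def toy_successors(state):
    return state if isinstance(state, list) else []


def toy_evaluate(state):
    # Only meaningful on leaves, so call minimax with depth >= tree height.
    return state


root_value = minimax(TREE, 3, True, toy_successors, toy_evaluate)
```

With MAX to move at the root, the leaf values back up to 10, 14, 2, 24, then 10 and 2, so `root_value` is 10.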

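[Figure: minimax example tree with levels max, min, max, min from the root down. Leaf values 10, 9, 14, 13, 2, 1, 3, 24 back up to 10, 14, 2, 24 at the max level, then to 10 and 2 at the min level, and finally to 10 at the root.]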

Most interesting games cannot be searched exhaustively, so a fixed depth cutoff must be applied. But this can cause problems ...

Quiescence: If you arbitrarily apply the evaluation function at a fixed depth, you might miss a huge swing that is about to happen. The evaluation function should only be applied to quiescent (stable) positions. (Requires game knowledge!)

The Horizon Effect: Search has to stop somewhere, but a huge change might be lurking just over the horizon. There is no general fix but heuristics can sometimes help.

- Suppose your program can search 1000 positions/second.
- In chess, you get roughly 150 seconds per move so you can search 150,000 positions.
- Since chess has a branching factor of about 35, your program can only search 3 to 4 ply!
- An average human plans 6 to 8 moves ahead, so your program will act like a novice.
- Fortunately, we can often avoid searching parts of the game tree by keeping track of the best and worst alternatives at each point. This is called pruning the search tree.
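The depth figure follows from solving b^d = N for d, with N positions and branching factor b. A small sketch of that arithmetic, using the numbers quoted above (the function name is made up for the example):

```python
import math


def searchable_plies(positions_per_second, seconds_per_move, branching_factor):
    """Depth d such that branching_factor ** d equals the position budget."""
    budget = positions_per_second * seconds_per_move
    return math.log(budget) / math.log(branching_factor)


plies = searchable_plies(1000, 150, 35)
# 35**3 = 42,875 fits in the 150,000-position budget; 35**4 = 1,500,625 does not.
```

The result lies between 3 and 4, which is why an exhaustive search at this speed cannot complete a fourth ply.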

- Alpha-beta pruning is used on top of minimax search to detect paths that do not need to be explored. The intuition is:
- The MAX player is always trying to maximize the score. Call the best score MAX has found so far α.
- The MIN player is always trying to minimize the score. Call the best score MIN has found so far β.
- When a MIN node has β <= the α of its MAX ancestors, then this path will never be taken. (MAX has a better option.) This is called an α-cutoff.
- When a MAX node has α >= the β of its MIN ancestors, then this path will never be taken. (MIN has a better option.) This is called a β-cutoff.

The minimax procedure explores every path of length depth. Can we do less work?

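[Figure: alpha-beta example, shown in four steps. MAX node A has MIN children B, C, and D. B has leaves E, F, G; C has leaves H, I; D has leaves J, K, L.
Step 1: E = 3 is evaluated.
Step 2: F = 12 and G = 8 are evaluated, so B = 3.
Step 3: H = -5 gives C <= -5, which is already worse for MAX than B = 3, so I is not evaluated.
Step 4: J = 15, K = 5, and L = 2 give D = 2, so A = 3.]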

- Pruning: eliminating a branch of the search tree from consideration without exhaustive examination of each node
- α-β pruning: the basic idea is to prune portions of the search tree that cannot improve the utility value of the max or min node, by just considering the values of nodes seen so far.
- Does it work? Yes, it roughly cuts the branching factor from b to sqrt(b), resulting in about twice the look-ahead of pure minimax.

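[Figure: alpha-beta example in four steps, with a MAX root over three MIN nodes. Step 1: the first MIN node's leaves 6, 12, 8 back up to 6, so the root is at least 6. Step 2: the second MIN node sees a leaf of 2, so its value is at most 2. Step 3: the third MIN node sees a leaf of 5, so its value is at most 5. Step 4: MAX selects the move to the first MIN node, with value 6.]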

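[Figure: general pruning principle. Levels alternate Player, Opponent, Player, Opponent; node m appears near the upper Player level and node n, with value v, near the lower one.]

If α > v, then MAX will choose m, so prune the tree under n.

Similarly for β at MIN nodes.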

- Alpha-beta pruning is guaranteed to find the same best move as the minimax algorithm by itself, but can drastically reduce the number of nodes that need to be explored.
- The order in which successors are explored can make a dramatic difference!
- In the optimal situation, alpha-beta pruning only needs to explore O(b^(d/2)) nodes.
- Minimax search explores O(b^d) nodes, so alpha-beta pruning can afford to double the search depth!
- If successors are explored randomly, alpha-beta explores about O(b^(3d/4)) nodes.
- In practice, heuristics often allow performance to be closer to the best-case scenario.

procedure alpha-beta-max(node, α, β)
    if leaf node(node) then return evaluation(node);
    foreach (successor s of node)
        α := max(α, alpha-beta-min(s, α, β));
        if α >= β then return β;
    return α;

procedure alpha-beta-min(node, α, β)
    if leaf node(node) then return evaluation(node);
    foreach (successor s of node)
        β := min(β, alpha-beta-max(s, α, β));
        if β <= α then return α;
    return β;

To begin, we invoke: alpha-beta-max(node, -∞, +∞)

- Pruning does not affect final result
- Alpha-beta pruning
  - Asymptotic time complexity
    - O((b/log b)^d)
  - With “perfect ordering,” time complexity
    - O(b^(d/2))
    - means we go from an effective branching factor of b to sqrt(b) (e.g. 35 -> 6).

- minimax-a-b(board, depth, type, a, b)
  - If depth = 0 return Eval-Fn(board)
  - else if type = max
    - cur-max = -inf
    - loop for s in succ(board)
      - s-val = minimax-a-b(s, depth-1, min, a, b)
      - cur-max = max(s-val, cur-max)
      - a = max(cur-max, a)
      - if cur-max >= b finish loop
    - return cur-max
  - else (type = min)
    - cur-min = inf
    - loop for s in succ(board)
      - s-val = minimax-a-b(s, depth-1, max, a, b)
      - cur-min = min(s-val, cur-min)
      - b = min(cur-min, b)
      - if cur-min <= a finish loop
    - return cur-min
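A runnable version of this procedure could look like the sketch below. The `successors` and `evaluate` callables, the nested-list toy tree, and the `demo` helper that records which leaves get evaluated are all assumptions added for illustration.

```python
import math


def alphabeta(state, depth, maximizing, successors, evaluate,
              alpha=-math.inf, beta=math.inf):
    """Depth-limited minimax value of `state` with alpha-beta cutoffs."""
    children = list(successors(state))
    if depth == 0 or not children:
        return evaluate(state)
    if maximizing:
        best = -math.inf
        for child in children:
            best = max(best, alphabeta(child, depth - 1, False,
                                       successors, evaluate, alpha, beta))
            alpha = max(alpha, best)
            if best >= beta:  # MIN above already has a better option
                break
        return best
    best = math.inf
    for child in children:
        best = min(best, alphabeta(child, depth - 1, True,
                                   successors, evaluate, alpha, beta))
        beta = min(beta, best)
        if best <= alpha:  # MAX above already has a better option
            break
    return best


def demo():
    """Search a toy tree and report the value plus the leaves evaluated."""
    tree = [[[10, 9], [14, 13]], [[2, 1], [3, 24]]]
    visited = []

    def successors(state):
        return state if isinstance(state, list) else []

    def evaluate(state):
        visited.append(state)
        return state

    value = alphabeta(tree, 3, True, successors, evaluate)
    return value, visited
```

On this toy tree `demo()` returns the same root value as plain minimax, 10, while evaluating only the leaves 10, 9, 14, 2, 1 out of eight. Reordering the children changes how many leaves are skipped but not the value.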

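[Figure: the earlier minimax example searched with alpha-beta pruning, with levels max, min, max, min from the root down. Values visible in the figure: 10 at the root; 10 and 4 at the min level; 10, 14, and 4 at the max level; leaves 10, 9, 14, 2, and 4. The other leaves are not shown.]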

Good move ordering improves effectiveness of pruning

[Figure: the same tree (MAX node A over MIN nodes B, C, D) under two orderings of D's children. Original ordering: J (15), K (5), L (2), so all three leaves are examined before D = 2 is known. Better ordering: L (2), K (5), J (15), so D <= 2 is known after the first leaf, which is already below B = 3, and K and J can be pruned.]

- Use catalogue of “solved” positions to extract the correct move.
- For complicated games, such catalogues are not available for all positions
- Often, sections of the game are well-understood and catalogued
- E.g. openings and endings in chess

- Combine knowledge (book moves) with search (minimax) to produce better results.

- http://www.gametheory.net/applets/

- How to include chance: add chance nodes

- Chance nodes
- Branches leading from each chance node denote the possible dice rolls
- Labeled with the roll and the chance that it will occur

- Replace MAX/MIN nodes in minimax with expected MAX/MIN payoff
- Expectimax value of C
- Expectimin value
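As a rough sketch of how chance nodes fit into the backup, the function below takes a maximum at MAX nodes, a minimum at MIN nodes, and a probability-weighted average at chance nodes. The tuple encoding of the tree and the coin-flip example values are assumptions made up for this illustration.

```python
def expectiminimax(node):
    """Value of a game tree with MAX, MIN, and chance nodes.

    Hypothetical encoding: a leaf is a number; an inner node is
    ("max", [children]), ("min", [children]), or
    ("chance", [(probability, child), ...]).
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    if kind == "chance":
        return sum(p * expectiminimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind!r}")


# MAX picks a move, a fair coin is flipped, then MIN replies.
COIN_GAME = ("max", [
    ("chance", [(0.5, ("min", [4, 6])), (0.5, ("min", [2, 8]))]),
    ("chance", [(0.5, ("min", [10, 0])), (0.5, ("min", [5, 7]))]),
])

coin_game_value = expectiminimax(COIN_GAME)
```

Here the first move is worth 0.5 * 4 + 0.5 * 2 = 3.0 and the second 0.5 * 0 + 0.5 * 5 = 2.5, so MAX prefers the first and `coin_game_value` is 3.0.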

- For minimax, any order-preserving transformation of the leaf values does not affect the choice of move

- With chance nodes, some order-preserving transformations of the leaf values do affect the choice of move

- The choice of move is guaranteed to be preserved only under positive linear transformations of the evaluation function.
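A small numeric sketch makes the point concrete. The two moves, their probabilities, and the leaf values below are illustrative numbers chosen for the example, not data from the slides.

```python
def expected_value(outcomes):
    """Probability-weighted average of (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)


def best_move(moves):
    """Name of the move with the highest expected value."""
    return max(moves, key=lambda name: expected_value(moves[name]))


# Two moves, each followed by a chance node with outcomes 0.9 / 0.1.
original = {"a1": [(0.9, 2), (0.1, 3)], "a2": [(0.9, 1), (0.1, 4)]}

# Order-preserving but nonlinear relabelling: 1 < 2 < 3 < 4 becomes 1 < 20 < 30 < 400.
rescale = {1: 1, 2: 20, 3: 30, 4: 400}
relabelled = {name: [(p, rescale[v]) for p, v in outs]
              for name, outs in original.items()}

# Positive linear transformation v -> 10 * v + 7.
linear = {name: [(p, 10 * v + 7) for p, v in outs]
          for name, outs in original.items()}
```

With the original values a1 wins (2.1 against 1.3). After the nonlinear relabelling a2 wins (40.9 against 21), even though the ordering of the leaves is unchanged. After the positive linear transformation a1 still wins.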

- The expectiminimax considers all the possible dice-roll sequences
  - It takes O(b^m n^m), where n is the number of distinct rolls
  - Whereas minimax takes O(b^m)
- Problems
  - The extra cost compared to minimax is very high
  - Alpha-beta pruning is more difficult to apply

Games in real life

Terrorists do a lot of different things

The U.S. can try to anticipate many kinds of attacks and defend against them.

If the U.S. fails to invest wisely, then we lose important battles.

The U.S. government is concerned about the possibility of smallpox bioterrorism.

Terrorists could make no smallpox attack, a small attack on a single city, or coordinated attacks on multiple cities (or do other things).

The U.S. has four defense strategies:

- Stockpiling vaccine
- Stockpiling and increasing bio-surveillance
- Stockpiling and inoculating first responders and/or key personnel
- Inoculating all consenting people with healthy immune systems.

- Classical game theory uses a matrix of costs to determine optimal play.
- Optimal play is usually defined as a minimax strategy, but sometimes one can minimize expected loss instead.
- Both methods are unreliable guides to human behavior.

The U.S. should choose the defense with smallest row-wise max cost.

The terrorist should choose the attack with largest column-wise min cost.

If these are not equal then a randomized strategy is better.
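The row-wise and column-wise rules can be sketched as below. Both cost matrices are hypothetical numbers invented for the example (rows are defenses, columns are attacks, entries are costs to the defender); they are not the values from the smallpox exercise.

```python
def minimax_defense(costs):
    """Row (defense) whose worst-case cost is smallest, with that cost."""
    row_max = [max(row) for row in costs]
    i = min(range(len(row_max)), key=row_max.__getitem__)
    return i, row_max[i]


def maximin_attack(costs):
    """Column (attack) whose best-case cost for the defender is largest."""
    col_min = [min(col) for col in zip(*costs)]
    j = max(range(len(col_min)), key=col_min.__getitem__)
    return j, col_min[j]


def has_pure_solution(costs):
    """True when the two values agree, so neither side needs to randomize."""
    return minimax_defense(costs)[1] == maximin_attack(costs)[1]


# Hypothetical costs: 4 defenses (rows) by 3 attacks (columns).
HYPOTHETICAL_COSTS = [
    [1, 8, 20],
    [2, 7, 15],
    [3, 6, 12],
    [6, 7, 9],
]

# Hypothetical 2 x 2 game in which the two values differ.
NO_SADDLE = [
    [0, 2],
    [3, 1],
]
```

For `HYPOTHETICAL_COSTS` both rules give the value 9 (last row, last column), so a pure strategy is optimal. For `NO_SADDLE` the row-wise value is 2 and the column-wise value is 1, which is the case where a randomized strategy does better.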

Extensive-form game theory invites decision theory criteria based upon minimum expected loss.

In our smallpox exercise, we shall implement this by assuming that the U.S. decisions are known to the terrorists, and that this affects their probabilities of using certain kinds of attacks.

- Game theory does not take account of resource limitations.
- It assumes that both players have the same cost matrix.
- It assumes both players act in synchrony (or in strict alternation).
- It assumes all costs are measured without error.

Statistical risk analysis makes probabilistic statements about specific kinds of threats.

It also treats the costs associated with threats as random variables. The total random cost is developed by analysis of component costs.

To illustrate a key idea, consider the problem of estimating the cost C11 in the game theory matrix. This is the cost associated with stockpiling vaccine when no smallpox attack occurs.

Some components of the cost are fixed, others are random.

C11 = cost to test diluted Dryvax + cost to test Aventis vaccine + cost to make 209 x 10^6 doses + cost to produce VIG + logistic/storage/device costs.

The other costs in the matrix are also random variables, and their distributions can be estimated in similar ways.

Note that different matrix costs are not independent; they often have components in common across rows and columns.

More examples…

Cost to treat one smallpox case; this is normal with mean $200,000 and s.d. $50,000.

Cost to inoculate 25,000 people; this is normal with mean $60,000 and s.d. $10,000.

Economic costs of a single attack; this is gamma with mean $5 billion and s.d. $10 billion.
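Random cost components like these could be sampled as in the sketch below. The function names are made up for the example, and treating the three components as independent draws is a simplifying assumption. The gamma distribution is parameterized by shape and scale, which are derived from the stated mean and standard deviation.

```python
import random


def gamma_params(mean, sd):
    """Shape and scale of a gamma distribution with the given mean and s.d.

    Uses mean = shape * scale and variance = shape * scale ** 2.
    """
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale


def sample_costs(rng):
    """One random draw of the three cost components, in dollars."""
    treat_one_case = rng.gauss(200_000, 50_000)
    inoculate_25k = rng.gauss(60_000, 10_000)
    shape, scale = gamma_params(5e9, 10e9)  # shape 0.25, scale 2e10
    single_attack = rng.gammavariate(shape, scale)
    return treat_one_case, inoculate_25k, single_attack


rng = random.Random(0)  # fixed seed so the draws are reproducible
draws = [sample_costs(rng) for _ in range(50_000)]
mean_attack_cost = sum(d[2] for d in draws) / len(draws)
```

With a shape of 0.25 the attack cost is heavily skewed: most draws are far below the $5 billion mean and a few are far above it, so worst-case and expected-loss criteria can rank defenses differently.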

Game theory and statistical risk analysis can be combined to give arguably useful guidance in threat management.

We generate many random tables, according to the risk analysis, and find which defenses are best.

The table in the lower right shows the elicited probabilities of each kind of attack given that the corresponding defense has been adopted.

These probabilities are used to weight the costs in calculating the expected loss.

For our rough risk analysis, minimax favors universal inoculation, while minimum expected loss favors stockpiling.

This accords with the public and federal thinking on threat preparedness.