Adversarial Search: Games

Chapter 6

Feb 21 2007

What are, why study games?
  • Games are a form of multi-agent environment
    • What do other agents do and how do they affect our success?
    • Competitive multi-agent environments give rise to adversarial search (games)
  • Why study games?
    • Fun; historically entertaining
    • Interesting because they are hard
    • Easy to represent; agents are restricted to a small number of actions
Games vs. Search
  • Search – no adversary
    • Solution is (heuristic) method for finding goal
    • Heuristics and CSP techniques can find optimal solution
    • Evaluation function: estimate of cost from start to goal through given node
    • Examples: path planning, scheduling activities
  • Games – adversary
    • Solution is a strategy: specifies move for every possible opponent reply
    • Time limits force an approximate solution
    • Evaluation function: “goodness” of game position
    • Examples: chess, checkers, Othello, backgammon
Historical Contributions
  • computer considers possible lines of play
    • Babbage, 1846
  • algorithm for perfect play
    • Zermelo, 1912, Von Neumann, 1944
  • finite horizon, approximate evaluation
    • Zuse, 1945, Wiener, 1948, Shannon, 1950
  • first chess program
    • Turing, 1951
  • machine learning to improve evaluation accuracy
    • Samuel, 1952-57
  • pruning to allow deeper search
    • McCarthy, 1956
Types of Games

Games can be classified along two dimensions: deterministic vs. chance, and perfect vs. imperfect information. For example, chess is deterministic with perfect information; backgammon adds chance (the dice) but keeps perfect information; card games such as poker combine chance with imperfect information.
Game setup
  • Two players: MAX and MIN
  • MAX moves first; the players take turns until the game is over
  • The winner receives a reward, the loser a penalty
  • Games as search:
    • Initial state: e.g. board configuration of chess
    • Successor function: list of (move,state) pairs specifying legal moves.
    • Terminal test: Is the game finished?
    • Utility function: numerical value of terminal states, e.g. win (+1), lose (−1) and draw (0) in tic-tac-toe
  • MAX uses search tree to determine next move
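As a concrete illustration of this games-as-search setup, here is a minimal sketch in Python using tic-tac-toe. All names (`winner`, `successors`, and so on) are illustrative choices, not from any particular library.

```python
# Tic-tac-toe as a search problem: initial state, successor function,
# terminal test, and utility, as in the slide above. A state is a
# 9-character string (flattened 3x3 board); MAX plays 'X', MIN plays 'O'.

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    # Return 'X' or 'O' if a line is completed, else None.
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def terminal_test(board):
    # The game is finished when someone has won or the board is full.
    return winner(board) is not None or ' ' not in board

def utility(board):
    # +1 if MAX ('X') wins, -1 if MIN ('O') wins, 0 for a draw.
    return {'X': 1, 'O': -1, None: 0}[winner(board)]

def successors(board, player):
    # Generate the legal (move, state) pairs for the given player.
    for i, cell in enumerate(board):
        if cell == ' ':
            yield i, board[:i] + player + board[i + 1:]

initial_state = ' ' * 9  # empty board

print(terminal_test(initial_state))          # False: the game is not over
print(utility('XXX' + 'OO' + ' ' * 4))       # 1: MAX has completed a row
```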
Example Game Tree

(Figure: a game tree with plies and moves labeled.)

ply = half-move = one player’s turn; a full move consists of two plies.

Optimal strategies
  • Find the contingent strategy for MAX assuming an infallible MIN opponent.
  • Assumption: both players play optimally!
  • Given a game tree, the optimal strategy can be determined from the minimax value of each node:

MINIMAX-VALUE(n) =
  UTILITY(n)                                   if n is a terminal node
  max_{s ∈ successors(n)} MINIMAX-VALUE(s)     if n is a MAX node
  min_{s ∈ successors(n)} MINIMAX-VALUE(s)     if n is a MIN node

Two-Ply Game Tree

(Figure: the minimax decision in a two-ply game tree.)

Minimax maximizes the worst-case outcome for MAX.

What if MIN does not play optimally?
  • Definition of optimal play for MAX assumes MIN plays optimally: maximizes worst-case outcome for MAX.
  • But if MIN does not play optimally, MAX will do at least as well, and possibly better. [This can be proven.]
Minimax Algorithm

function MINIMAX-DECISION(state) returns an action
  inputs: state, current state in game
  v ← MAX-VALUE(state)
  return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← −∞
  for a, s in SUCCESSORS(state) do
    v ← MAX(v, MIN-VALUE(s))
  return v

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for a, s in SUCCESSORS(state) do
    v ← MIN(v, MAX-VALUE(s))
  return v
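The pseudocode above can be sketched in Python on an explicit game tree, where a leaf is a utility value and an internal node is a list of child subtrees. This tree encoding, and the hand-built example tree, are illustrative assumptions.

```python
# Minimax on an explicit game tree: leaves are utilities, internal
# nodes are lists of children. MAX and MIN levels alternate.

def max_value(node):
    if isinstance(node, (int, float)):   # terminal test
        return node                      # utility
    return max(min_value(child) for child in node)

def min_value(node):
    if isinstance(node, (int, float)):
        return node
    return min(max_value(child) for child in node)

def minimax_decision(node):
    # Return the index of the root move whose MIN-value is largest.
    values = [min_value(child) for child in node]
    return max(range(len(values)), key=lambda i: values[i])

# A two-ply example: MAX chooses among three MIN nodes.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(max_value(tree))         # 3: the minimax value of the root
print(minimax_decision(tree))  # 0: the leftmost move is best
```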

Properties of minimax
  • Complete? Yes (if tree is finite)
  • Optimal? Yes (against an optimal opponent)
  • Time complexity? O(b^m)
  • Space complexity? O(bm) (depth-first)
  • For chess, b ≈ 35 and m ≈ 100 for "reasonable" games, so an exact solution is completely infeasible
Multiplayer games
  • Games allow more than two players
  • Single minimax values become vectors
Pruning
  • Minimax problem: the number of game states is exponential in the number of moves.
  • Alpha-beta pruning
    • Remove branches that don’t influence final decision

Alpha-Beta Example

(Figure: before any leaves are examined, the range of possible values at each node is [−∞, +∞].)
Alpha-Beta Example (continued)

(Figure: the first MIN node settles at [3, 3], so the root's range becomes [3, +∞]. The second MIN node's range falls to [−∞, 2]; this node is worse for MAX, so its remaining subtree is pruned, having already been determined a worse choice than the left subtree.)

Alpha-Beta Example (continued)

(Figure: the third MIN node's range is [−∞, 14] after its first leaf, so the root's range narrows to [3, 14]; the other nodes remain at [3, 3] and [−∞, 2].)

Alpha-Beta Example (continued)

(Figure: after the next leaf, the third MIN node narrows to [−∞, 5] and the root to [3, 5].)

Alpha-Beta Example (continued)

(Figure: the final leaf fixes the third MIN node at [2, 2], so the root's minimax value is [3, 3].)


Alpha-Beta Algorithm

function ALPHA-BETA-SEARCH(state) returns an action
  inputs: state, current state in game
  v ← MAX-VALUE(state, −∞, +∞)
  return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← −∞
  for a, s in SUCCESSORS(state) do
    v ← MAX(v, MIN-VALUE(s, α, β))
    if v ≥ β then return v
    α ← MAX(α, v)
  return v

Alpha-Beta Algorithm

function MIN-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for a, s in SUCCESSORS(state) do
    v ← MIN(v, MAX-VALUE(s, α, β))
    if v ≤ α then return v
    β ← MIN(β, v)
  return v
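A runnable sketch of the two procedures above, again on an explicit game tree (leaves are utilities, internal nodes are lists of children; the encoding is an illustrative assumption):

```python
# Alpha-beta pruning: alpha is the best value MAX can guarantee so far,
# beta the best MIN can guarantee. Branches outside [alpha, beta] are cut.

import math

def max_value(node, alpha, beta):
    if isinstance(node, (int, float)):
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:           # MIN above would never let play reach here
            return v
        alpha = max(alpha, v)
    return v

def min_value(node, alpha, beta):
    if isinstance(node, (int, float)):
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:          # MAX above would never let play reach here
            return v
        beta = min(beta, v)
    return v

# Same two-ply example as for plain minimax; the middle MIN node is
# pruned after its first leaf (2 <= alpha = 3).
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(max_value(tree, -math.inf, math.inf))  # 3, same answer as minimax
```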

General alpha-beta pruning
  • Consider a node n somewhere in the tree
  • If a player has a better choice at
    • the parent node of n
    • or any choice point further up
  • then n will never be reached in actual play.
  • Hence, once enough is known about n, it can be pruned.
Why is it called α-β?
  • α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for MAX
  • If v is worse than α, MAX will avoid it → prune that branch
  • β is defined similarly for MIN
Final Comments about Alpha-Beta Pruning
  • Pruning does not affect final results
  • Entire subtrees can be pruned
  • Good move ordering improves effectiveness of pruning
  • With “perfect ordering,” time complexity is O(b^(m/2))
    • Effective branching factor of √b!
    • Alpha-beta pruning can look ahead twice as far as minimax in the same amount of time
  • Repeated states are again possible.
    • Store them in a table in memory (a transposition table)
  • A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)
Resource Limits
  • Minimax and alpha-beta pruning require searching all the way down to terminal states.
  • This may be impractical within a reasonable amount of time.
  • Shannon (1950) proposed:
    • Cut off search earlier (replace TERMINAL-TEST by CUTOFF-TEST)
    • Apply a heuristic evaluation function EVAL (replacing the utility function of alpha-beta)
Cutting off search
  • Change:
    • if TERMINAL-TEST(state) then return UTILITY(state)

into

    • if CUTOFF-TEST(state,depth) then return EVAL(state)
  • This introduces a fixed depth limit
    • chosen so that the time used will not exceed what the rules of the game allow.
  • When the cutoff occurs, the heuristic evaluation is performed.
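This change can be sketched on an explicit game tree. The toy EVAL below (average of reachable leaf utilities) is purely an illustrative stand-in for a real position-evaluation function.

```python
# Depth-limited minimax: CUTOFF-TEST replaces TERMINAL-TEST, and EVAL
# replaces UTILITY at the cutoff. Leaves are utilities, internal nodes
# are lists of children.

def eval_fn(node):
    # Toy heuristic: the average of the leaf utilities below this node
    # (an illustrative stand-in, not a serious evaluation function).
    if isinstance(node, (int, float)):
        return node
    return sum(eval_fn(c) for c in node) / len(node)

def h_minimax(node, depth, limit, maximizing):
    if isinstance(node, (int, float)):   # true terminal node
        return node                      # UTILITY
    if depth >= limit:                   # CUTOFF-TEST fires
        return eval_fn(node)             # EVAL replaces UTILITY
    pick = max if maximizing else min
    return pick(h_minimax(c, depth + 1, limit, not maximizing)
                for c in node)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(h_minimax(tree, 0, 2, True))   # 3: full depth, exact minimax value
print(h_minimax(tree, 0, 1, True))   # cutoff at depth 1: EVAL estimates
```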
Cutting off search

Does it work in practice?

Suppose we can search b^m = 10^6 nodes; with b = 35 this gives m ≈ 4

4-ply lookahead is a hopeless chess player!

  • 4-ply ≈ human novice
  • 8-ply ≈ typical PC, human master
  • 12-ply ≈ Deep Blue, Kasparov
Heuristic EVAL
  • Idea: produce an estimate of the expected utility of the game from a given position
  • Performance depends on quality of EVAL
  • Requirements:
    • EVAL should order terminal-nodes in the same way as UTILITY
    • Computation must not take too long
    • For non-terminal states the EVAL should be strongly correlated with the actual chance of winning
  • Only useful for quiescent states (no wild swings in value in the near future)
Heuristic EVAL example

For chess, EVAL is typically a linear weighted sum of features:

Eval(s) = w1·f1(s) + w2·f2(s) + … + wn·fn(s)

e.g., w1 = 9 with f1(s) = (number of white queens) − (number of black queens)

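A minimal sketch of such a weighted sum for chess material; the weights and the piece-count representation below are illustrative assumptions, not a tuned evaluation.

```python
# Linear weighted-sum evaluation over material features: each feature
# f_i(s) is (white count - black count) for one piece type, weighted by
# the classic material values (queen 9, rook 5, bishop/knight 3, pawn 1).

WEIGHTS = {'Q': 9, 'R': 5, 'B': 3, 'N': 3, 'P': 1}

def eval_position(white_counts, black_counts):
    # Eval(s) = sum_i w_i * f_i(s)
    return sum(w * (white_counts.get(p, 0) - black_counts.get(p, 0))
               for p, w in WEIGHTS.items())

# Hypothetical material counts: White has an extra queen and pawn,
# Black has an extra rook.
white = {'Q': 2, 'R': 2, 'B': 2, 'N': 2, 'P': 8}
black = {'Q': 1, 'R': 3, 'B': 2, 'N': 2, 'P': 7}
print(eval_position(white, black))  # 9 - 5 + 1 = 5: advantage to White
```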
Heuristic difficulties

(Figure: a heuristic that merely counts pieces won can assign the same value to positions with very different outcomes.)

Horizon effect

(Figure: a fixed-depth search thinks it can avoid the queening move, which is merely pushed beyond the search horizon.)

Deterministic games in practice
  • Checkers: Chinook ended 40-year-reign of human world champion Marion Tinsley in 1994. Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.
  • Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.
  • Othello: human champions refuse to compete against computers, who are too good.
  • Go: human champions refuse to compete against computers, who are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.
Games that include chance

(Figure: a backgammon game tree with chance nodes for the dice rolls.)

  • Possible moves: (5-10,5-11), (5-11,19-24), (5-10,10-16) and (5-11,11-16)
  • The rolls [1,1] and [6,6] each have probability 1/36; all other rolls have probability 1/18
  • We cannot calculate a definite minimax value, only an expected value
Expected minimax value

EXPECTED-MINIMAX-VALUE(n) =
  UTILITY(n)                                               if n is a terminal node
  max_{s ∈ successors(n)} EXPECTED-MINIMAX-VALUE(s)        if n is a MAX node
  min_{s ∈ successors(n)} EXPECTED-MINIMAX-VALUE(s)        if n is a MIN node
  Σ_{s ∈ successors(n)} P(s) · EXPECTED-MINIMAX-VALUE(s)   if n is a chance node

These equations can be backed-up recursively all the way to the root of the game tree.
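The recursion can be sketched in Python on an explicit tree; the encoding (chance nodes as lists of (probability, subtree) pairs, move nodes as plain lists) is an illustrative assumption.

```python
# Expectiminimax: MAX and MIN nodes behave as in minimax; a chance node
# returns the probability-weighted average of its children's values.

def expectiminimax(node, maximizing):
    if isinstance(node, (int, float)):        # terminal: utility
        return node
    kind, children = node
    if kind == 'chance':
        # Weighted average over outcomes; probabilities should sum to 1.
        return sum(p * expectiminimax(c, maximizing) for p, c in children)
    # 'move' node: MAX and MIN alternate as usual.
    pick = max if maximizing else min
    return pick(expectiminimax(c, not maximizing) for c in children)

# MAX chooses between two dice-like gambles.
tree = ('move', [
    ('chance', [(0.5, 3), (0.5, 5)]),   # expected value 4.0
    ('chance', [(0.9, 1), (0.1, 30)]),  # expected value 3.9
])
print(expectiminimax(tree, True))  # 4.0: the first gamble is better
```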

Position evaluation with chance nodes

(Figure: two trees with the same leaf ordering but different scaling; on the left A1 wins, on the right A2 wins.)

  • The outcome of the evaluation function must not change when values are scaled differently.
  • Behavior is preserved only by a positive linear transformation of EVAL.
(Figure, card-play example: MIN wins a trick twice, then MAX wins the remaining tricks, giving an even score.)

(Figure: the same result is reached with a different card.)

(Figure: with an uncertain card, at this point MAX has a 50% chance of losing.)

Summary
  • Games illustrate several important points about AI
    • Perfection is unattainable → we must approximate
    • It is a good idea to think about what to think about
    • Uncertainty constrains the assignment of values to states
  • Games are to AI as Grand Prix racing is to automobile design