
Adversarial Search

Chapter 6

Section 1 – 4



Search in an Adversarial Environment

  • Iterative deepening and A* are useful for single-agent search problems

  • What if there are TWO agents?

  • Goals in conflict:

    • Adversarial Search

  • Especially common in AI:

    • Goals in direct conflict

    • i.e., GAMES.


Games vs search problems l.jpg

Games vs. search problems

  • "Unpredictable" opponent  specifying a move for every possible opponent reply

  • Time limits  unlikely to find goal, must approximate

  • Efficiency matters a lot

  • HARD.

  • In AI, typically "zero sum": one player wins exactly as much as the other player loses.



Types of games

                    Deterministic               Chance
  Perfect Info      Chess, Checkers,            Monopoly, Backgammon
                    Othello, Tic-Tac-Toe
  Imperfect Info                                Bridge, Poker, Scrabble



Tic-Tac-Toe

  • Tic Tac Toe is one of the classic AI examples. Let's play some.

  • Tic Tac Toe version 1.

  • http://www.ourvirtualmall.com/tictac.htm

  • Tic Tac Toe version 2.

  • http://thinks.com/java/tic-tac-toe/tic-tac-toe.htm

  • Try them both, at various levels of difficulty.

    • What kind of strategy are you using?

    • What kind does the computer seem to be using?

    • Did you win? Lose?



Problem Definition

  • Formally, we define a two-person game as:

  • Two players, called MAX and MIN.

    • They alternate moves.

    • At the end of the game, the winner is rewarded and the loser penalized.

  • Game has

    • Initial State: board position and player to go first

    • Successor Function: returns (move, state) pairs

      • All legal moves from the current state

      • Resulting state

    • Terminal Test

    • Utility function for terminal states.

  • The initial state plus the legal moves define the game tree (a minimal sketch of this interface follows).


Tic Tac Toe Game tree

(Diagram of the tic-tac-toe game tree; not reproduced in the transcript.)


Optimal Strategies

  • An optimal strategy is a sequence of moves leading to a desired goal state.

  • MAX's strategy is affected by MIN's play.

  • So MAX needs the strategy that gives the best possible payoff, assuming optimal play on MIN's part.

  • This is determined by computing the MINIMAX value of each node in the game tree.



Minimax

  • Perfect play for deterministic games

  • Idea: choose the move to the position with the highest minimax value = best achievable payoff against best play

  • E.g., a 2-ply game (tree diagram on slide):


Minimax algorithm l.jpg

Minimax algorithm



Properties of minimax

  • Complete? Yes (if tree is finite)

  • Optimal? Yes (against an optimal opponent)

  • Time complexity? O(b^m)

  • Space complexity? O(bm) (depth-first exploration)

  • For chess, b ≈ 35, m ≈ 100 for "reasonable" games → exact solution completely infeasible

  • Even tic-tac-toe is much too complex to diagram here, although it's small enough to implement.



Pruning the Search

  • “If you have an idea that is surely bad, don't take the time to see how truly awful it is.” -- Pat Winston

  • Minimax is exponential in the number of moves, so it is not feasible in real life

  • But we can PRUNE some branches.

  • Alpha-Beta pruning

    • If it is clear that a branch can't improve on the value we already have, stop analysis.


α-β pruning example

(A sequence of five slides steps through α-β pruning on an example tree; the diagrams are not reproduced in the transcript.)


Properties of α-β

  • Pruning does not affect final result

  • Good move ordering improves effectiveness of pruning

  • With "perfect ordering," time complexity = O(bm/2)

    doubles depth of search which can be carried out for a given level of resources.

  • A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)


Why is it called α-β?

  • α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for MAX

  • If v is worse than α, MAX will avoid it → prune that branch

  • Define β similarly for MIN



The α-β algorithm





"Informed" Search

  • Alpha-Beta is still not feasible for large game spaces.

  • Can we improve on performance with domain knowledge?

  • Yes -- if we have a useful heuristic for evaluating game states.

  • Conceptually analogous to A* for single-agent search.



Resource limits

Suppose we have 100 secs and explore 10^4 nodes/sec → 10^6 nodes per move

Standard approach:

  • cutoff test:

    e.g., depth limit (perhaps add quiescence search)

  • evaluation function

    = estimated desirability of position



Evaluation function

  • Evaluation function or static evaluator is used to evaluate the “goodness” of a game position.

    • Contrast with heuristic search, where the evaluation function was a non-negative estimate of the cost from the start node to a goal, passing through the given node

  • The zero-sum assumption allows us to use a single evaluation function to describe the goodness of a board with respect to both players.

    • f(n) >> 0: position n good for me and bad for you

    • f(n) << 0: position n bad for me and good for you

    • f(n) near 0: position n is a neutral position

    • f(n) = +infinity: win for me

    • f(n) = -infinity: win for you

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt



Evaluation function examples

  • Example of an evaluation function for Tic-Tac-Toe:

    f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for you]

    where a 3-length is a complete row, column, or diagonal

  • Alan Turing’s function for chess

    • f(n) = w(n)/b(n) where w(n) = sum of the point value of white’s pieces and b(n) = sum of black’s

  • Most evaluation functions are specified as a weighted sum of position features:

    f(n) = w1*feat1(n) + w2*feat2(n) + ... + wk*featk(n)

  • Example features for chess are piece count, piece placement, squares controlled, etc.

  • Deep Blue (which beat Garry Kasparov in 1997) had over 8000 features in its evaluation function

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt



Cutting off search

MinimaxCutoff is identical to MinimaxValue except

  • Terminal? is replaced by Cutoff?

  • Utility is replaced by Eval

    Does it work in practice?

    For chess: b^m = 10^6 with b = 35 → m ≈ 4

    4-ply lookahead is a hopeless chess player!

  • 4-ply ≈ human novice

  • 8-ply ≈ typical PC, human master

  • 12-ply ≈ Deep Blue, Kasparov



Deterministic games in practice

  • Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.

  • Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.

  • Othello: human champions refuse to compete against computers, who are too good.

  • Go: human champions refuse to compete against computers, who are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.



Games of chance

  • Backgammon is a two-player game with uncertainty.

  • Players roll dice to determine what moves to make.

  • White has just rolled 5 and 6 and (in the board position shown on the slide) has four legal moves:

    • 5-10, 5-11

    • 5-11, 19-24

    • 5-10, 10-16

    • 5-11, 11-16

  • Such games are good for exploring decision making in adversarial problems involving skill and luck.

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt



Decision-Making in Non-Deterministic Games

  • The resulting state tree depends on chance as well as on the moves chosen

  • Add "chance" nodes to the max and min nodes.

  • Compute expected values for chance nodes.



Game Trees with Chance Nodes

  • Chance nodes (shown as circles) represent random events

  • For a random event with N outcomes, each chance node has N distinct children; a probability is associated with each

  • (For 2 dice, there are 21 distinct outcomes)

  • Use minimax to compute values for MAX and MIN nodes

  • Use expected values for chance nodes

  • For chance nodes over a max node, as in C:

    expectimax(C) = Σ_i P(d_i) · maxvalue(i)

  • For chance nodes over a min node:

    expectimin(C) = Σ_i P(d_i) · minvalue(i)

(Diagram: a backgammon game tree whose levels alternate MIN, chance "rolls" nodes, MAX, and chance "rolls" nodes.)

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt



Meaning of the evaluation function

(Diagram: the same game tree evaluated with two sets of leaf values; each chance node has 2 outcomes with probabilities {.9, .1}. With the first set of values A1 is the best move; with the rescaled values A2 is the best move.)

  • Dealing with probabilities and expected values means we have to be careful about the “meaning” of values returned by the static evaluator.

  • Note that a “relative-order preserving” change of the values would not change the decision of minimax, but could change the decision with chance nodes.

  • Positive linear transformations are OK

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt



Summary

  • Games are fun to work on!

  • They illustrate several important points about AI:

    • perfection is unattainable → must approximate

    • it is a good idea to think about what to think about

