
Adversarial Search

Chapter 6

Section 1 – 4


Search in an adversarial environment l.jpg
Search in an Adversarial Environment

  • Iterative deepening and A* useful for single-agent search problems

  • What if there are TWO agents?

  • Goals in conflict:

    • Adversarial Search

  • Especially common in AI:

    • Goals in direct conflict

    • I.e.: GAMES.


Games vs. search problems

  • "Unpredictable" opponent  specifying a move for every possible opponent reply

  • Time limits  unlikely to find goal, must approximate

  • Efficiency matters a lot

  • HARD.

  • In AI, typically "zero sum": one player wins exactly as much as other player loses.


Types of games

                    Deterministic             Chance

  Perfect Info      Chess, Checkers,          Monopoly, Backgammon
                    Othello, Tic-Tac-Toe

  Imperfect Info                              Bridge, Poker, Scrabble


Tic-Tac-Toe

  • Tic Tac Toe is one of the classic AI examples. Let's play some.

  • Tic Tac Toe version 1.

  • http://www.ourvirtualmall.com/tictac.htm

  • Tic Tac Toe version 2.

  • http://thinks.com/java/tic-tac-toe/tic-tac-toe.htm

  • Try them both, at various levels of difficulty.

    • What kind of strategy are you using?

    • What kind does the computer seem to be using?

    • Did you win? Lose?


Problem Definition

  • Formally define a two-person game as:

  • Two players, called MAX and MIN.

    • Alternate moves

    • At end of game winner is rewarded and loser penalized.

  • Game has

    • Initial State: board position and player to go first

    • Successor Function: returns (move, state) pairs

      • All legal moves from the current state

      • Resulting state

    • Terminal Test

    • Utility function for terminal states.

  • The initial state plus the legal moves define the game tree (a minimal interface sketch follows below).
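As a concrete illustration, here is a minimal Python sketch of that formal definition. The interface below (names like successors and is_terminal) is an assumption for illustration, not code from the slides.

    class Game:
        """Two-player game: MAX and MIN alternate moves."""

        def initial_state(self):
            """Board position and player to move first."""
            raise NotImplementedError

        def successors(self, state):
            """Yield (move, resulting_state) pairs for all legal moves."""
            raise NotImplementedError

        def is_terminal(self, state):
            """Terminal test: is the game over in this state?"""
            raise NotImplementedError

        def utility(self, state):
            """Value of a terminal state, from MAX's point of view."""
            raise NotImplementedError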



Optimal Strategies

  • In a single-agent problem, an optimal strategy is simply a sequence of moves leading to the goal state.

  • In a game, MAX's strategy is affected by MIN's play.

  • So MAX needs the strategy that yields the best possible payoff, assuming optimal play on MIN's part.

  • Determined by looking at MINIMAX value for each node in game tree.


Minimax

  • Perfect play for deterministic games

  • Idea: choose move to position with highest minimax value = best achievable payoff against best play

  • E.g., a 2-ply game: MAX moves, MIN replies, and values are backed up from the leaves (see the sketch below).
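A minimal minimax sketch in Python, assuming the hypothetical Game interface sketched earlier; values are computed by depth-first recursion down to the terminal states.

    def minimax_value(game, state, maximizing):
        """Minimax value of `state`; utilities are from MAX's viewpoint."""
        if game.is_terminal(state):
            return game.utility(state)
        values = [minimax_value(game, s, not maximizing)
                  for _, s in game.successors(state)]
        return max(values) if maximizing else min(values)

    def minimax_decision(game, state):
        """MAX's move: the successor whose minimax value is highest."""
        move, _ = max(game.successors(state),
                      key=lambda ms: minimax_value(game, ms[1], False))
        return move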



Properties of minimax

  • Complete? Yes (if tree is finite)

  • Optimal? Yes (against an optimal opponent)

  • Time complexity? O(b^m)

  • Space complexity? O(bm) (depth-first exploration)

  • For chess, b ≈ 35, m ≈ 100 for "reasonable" games → exact solution completely infeasible

  • Even tic-tac-toe is much too complex to diagram here, although it's small enough to implement.


Pruning the Search

  • “If you have an idea that is surely bad, don't take the time to see how truly awful it is.” -- Pat Winston

  • Minimax is exponential in the number of moves, so it is not feasible in real life

  • But we can PRUNE some branches.

  • Alpha-Beta pruning

    • If it is clear that a branch can't improve on the value we already have, stop analysis.







Properties of α-β

  • Pruning does not affect final result

  • Good move ordering improves effectiveness of pruning

  • With "perfect ordering," time complexity = O(bm/2)

    doubles depth of search which can be carried out for a given level of resources.

  • A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)


Why is it called α-β?

  • α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for MAX

  • If v is worse than α, MAX will avoid it → prune that branch

  • Define β similarly for MIN


The α-β algorithm
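The slide's pseudocode image is not reproduced here; below is a hedged Python sketch of α-β search over the same hypothetical Game interface. The initial call is alphabeta(game, state, -math.inf, math.inf, True) with MAX to move.

    import math

    def alphabeta(game, state, alpha, beta, maximizing):
        """Minimax value of `state`, pruning branches that cannot
        affect the final decision."""
        if game.is_terminal(state):
            return game.utility(state)
        if maximizing:
            v = -math.inf
            for _, s in game.successors(state):
                v = max(v, alphabeta(game, s, alpha, beta, False))
                if v >= beta:          # MIN above will never allow this
                    return v           # β-cutoff: prune remaining branches
                alpha = max(alpha, v)  # best choice found so far for MAX
            return v
        else:
            v = math.inf
            for _, s in game.successors(state):
                v = min(v, alphabeta(game, s, alpha, beta, True))
                if v <= alpha:         # MAX above will never allow this
                    return v           # α-cutoff: prune remaining branches
                beta = min(beta, v)    # best choice found so far for MIN
            return v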


"Informed" Search

  • Alpha-Beta still not feasible for large game spaces.

  • Can we improve on performance with domain knowledge?

  • Yes -- if we have a useful heuristic for evaluating game states.

  • Conceptually analogous to A* for single-agent search.


Resource limits

Suppose we have 100 secs and can explore 10^4 nodes/sec → 10^6 nodes per move

Standard approach:

  • cutoff test:

    e.g., depth limit (perhaps add quiescence search)

  • evaluation function

    = estimated desirability of position


Evaluation function

  • Evaluation function or static evaluator is used to evaluate the “goodness” of a game position.

    • Contrast with heuristic search, where the evaluation function was a non-negative estimate of the cost from the start node to a goal passing through the given node

  • The zero-sum assumption allows us to use a single evaluation function to describe the goodness of a board with respect to both players.

    • f(n) >> 0: position n good for me and bad for you

    • f(n) << 0: position n bad for me and good for you

    • f(n) near 0: position n is a neutral position

    • f(n) = +infinity: win for me

    • f(n) = -infinity: win for you

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt


Evaluation function examples

  • Example of an evaluation function for Tic-Tac-Toe:

    f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for you]

    where a 3-length is a complete row, column, or diagonal

  • Alan Turing’s function for chess

    • f(n) = w(n)/b(n) where w(n) = sum of the point value of white’s pieces and b(n) = sum of black’s

  • Most evaluation functions are specified as a weighted sum of position features (a small illustrative sketch follows this slide):

    f(n) = w1*feat1(n) + w2*feat2(n) + ... + wk*featk(n)

  • Example features for chess are piece count, piece placement, squares controlled, etc.

  • Deep Blue (which beat Garry Kasparov in 1997) had over 8000 features in its evaluation function

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt
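As a concrete, purely illustrative example of such a weighted sum, here is a tiny Python evaluator. The feature functions, state fields, and weights are assumptions made up for this sketch, not Deep Blue's actual features.

    def piece_count(state):
        """Material difference: my pieces minus yours."""
        return state["my_pieces"] - state["your_pieces"]

    def mobility(state):
        """Mobility difference: my legal moves minus yours."""
        return state["my_moves"] - state["your_moves"]

    # (weight, feature) pairs: f(n) = w1*feat1(n) + ... + wk*featk(n)
    FEATURES = [(1.0, piece_count), (0.1, mobility)]

    def evaluate(state):
        """Positive = good for me, negative = good for you."""
        return sum(w * feat(state) for w, feat in FEATURES)

    # Example: up one piece, down two moves of mobility
    print(evaluate({"my_pieces": 8, "your_pieces": 7,
                    "my_moves": 20, "your_moves": 22}))  # 1.0 - 0.2 = 0.8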


Cutting off search

MinimaxCutoff is identical to MinimaxValue except

  • Terminal? is replaced by Cutoff?

  • Utility is replaced by Eval (a sketch follows at the end of this slide)

    Does it work in practice?

    For chess: b^m = 10^6 and b = 35 → m ≈ 4

    4-ply lookahead is a hopeless chess player!

  • 4-ply ≈ human novice

  • 8-ply ≈ typical PC, human master

  • 12-ply ≈ Deep Blue, Kasparov
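A sketch of that substitution in Python, reusing the earlier hypothetical Game interface and an evaluate heuristic like the weighted sum above:

    def minimax_cutoff(game, state, depth, maximizing, evaluate):
        """MinimaxValue with Terminal? -> Cutoff? and Utility -> Eval."""
        if game.is_terminal(state):
            return game.utility(state)   # true utilities at real endings
        if depth == 0:                   # Cutoff? (here: simple depth limit)
            return evaluate(state)       # Eval replaces Utility
        values = [minimax_cutoff(game, s, depth - 1, not maximizing, evaluate)
                  for _, s in game.successors(state)]
        return max(values) if maximizing else min(values)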


Deterministic games in practice

  • Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.

  • Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.

  • Othello: human champions refuse to compete against computers, who are too good.

  • Go: human champions refuse to compete against computers, who are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.


Games of chance

  • Backgammon is a two-player game with uncertainty.

  • Players roll dice to determine what moves to make.

  • White has just rolled 5 and 6 and has four legal moves:

    • 5-10, 5-11

    • 5-11, 19-24

    • 5-10, 10-16

    • 5-11, 11-16

  • Such games are good for exploring decision making in adversarial problems involving skill and luck.

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt


Decision-Making in Non-Deterministic Games

  • The state tree depends on chance events as well as on the moves chosen

  • Add "chance" nodes to the tree, in addition to the max and min nodes.

  • Compute expected values for chance nodes.


Game Trees with Chance Nodes

  • Chance nodes (shown as circles) represent random events

  • For a random event with N outcomes, each chance node has N distinct children; a probability is associated with each

  • (For 2 dice, there are 21 distinct outcomes)

  • Use minimax to compute values for MAX and MIN nodes

  • Use expected values for chance nodes

  • For chance nodes over a max node, as in C:

    expectimax(C) = ∑i P(di) · maxvalue(i)

  • For chance nodes over a min node:

    expectimin(C) = ∑i P(di) · minvalue(i) (a code sketch follows this slide)

[Figure: backgammon game tree with alternating layers — MAX, dice rolls (chance), MIN, dice rolls (chance)]

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt
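A hedged sketch of these backup rules in Python. The node_type and chance_outcomes methods are assumptions added for illustration; they are not part of the slides' definitions.

    def expectiminimax(game, state):
        """Back up values: max at MAX nodes, min at MIN nodes,
        probability-weighted average at chance nodes."""
        if game.is_terminal(state):
            return game.utility(state)
        kind = game.node_type(state)                    # "max", "min", or "chance"
        if kind == "chance":
            return sum(p * expectiminimax(game, child)  # expected value
                       for p, child in game.chance_outcomes(state))
        values = [expectiminimax(game, s) for _, s in game.successors(state)]
        return max(values) if kind == "max" else min(values)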


Meaning of the evaluation function

[Figure: two game trees with chance nodes, each chance node having 2 outcomes with probabilities {.9, .1}; A1 is the best move in one tree, A2 in the other]

  • Dealing with probabilities and expected values means we have to be careful about the “meaning” of values returned by the static evaluator.

  • Note that a “relative-order preserving” change of the values would not change the decision of minimax, but could change the decision with chance nodes.

  • Positive linear transformations are OK (a small worked example follows below)

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt
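To see why, here is a small illustrative check (the leaf values are assumed for the example): an order-preserving but non-linear rescaling leaves every max/min comparison unchanged, yet flips which move has the higher expected value.

    p = [0.9, 0.1]                          # outcome probabilities {.9, .1}
    a1_leaves, a2_leaves = [2, 3], [1, 4]
    rescale = {1: 1, 2: 20, 3: 30, 4: 400}  # preserves relative order only

    def expected(leaves):
        return sum(pi * v for pi, v in zip(p, leaves))

    print(expected(a1_leaves), expected(a2_leaves))   # 2.1 1.3  -> A1 best
    print(expected([rescale[v] for v in a1_leaves]),
          expected([rescale[v] for v in a2_leaves]))  # 21.0 40.9 -> A2 best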


Summary

  • Games are fun to work on!

  • They illustrate several important points about AI

  • perfection is unattainable → must approximate

  • good idea to think about what to think about

