- By
**Ava** - Follow User

- 548 Views
- Uploaded on

Download Presentation
## PowerPoint Slideshow about 'Adversarial Search' - Ava

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Search in an Adversarial Environment

- Iterative deepening and A* useful for single-agent search problems
- What if there are TWO agents?
- Goals in conflict:
- Adversarial Search

- Especially common in AI:
- Goals in direct conflict
- IE: GAMES.

Games vs. search problems

- "Unpredictable" opponent specifying a move for every possible opponent reply
- Time limits unlikely to find goal, must approximate
- Efficiency matters a lot
- HARD.
- In AI, typically "zero sum": one player wins exactly as much as other player loses.

Types of games

Deterministic Chance

Perfect Info Chess, Monopoly

Checkers Backgammon

Othello

Tic-Tac-Toe

Imperfect Info Bridge

Poker

Scrabble

Tic-Tac-Toe

- Tic Tac Toe is one of the classic AI examples. Let's play some.
- Tic Tac Toe version 1.
- http://www.ourvirtualmall.com/tictac.htm
- Tic Tac Toe version 2.
- http://thinks.com/java/tic-tac-toe/tic-tac-toe.htm
- Try them both, at various levels of difficulty.
- What kind of strategy are you using?
- What kind does the computer seem to be using?
- Did you win? Lose?

Problem Definition

- Formally define a two-person game as:
- Two players, called MAX and MIN.
- Alternate moves
- At end of game winner is rewarded and loser penalized.

- Game has
- Initial State: board position and player to go first
- Successor Function: returns (move, state) pairs
- All legal moves from the current state
- Resulting state

- Terminal Test
- Utility function for terminal states.

- Initial state plus legal moves define game tree.

Optimal Strategies

- Optimal strategy is sequence of moves leading to desired goal state.
- MAX's strategy is affected by MIN's play.
- So MAX needs a strategy which is the best possible payoff, assuming optimal play on MIN's part.
- Determined by looking at MINIMAX value for each node in game tree.

Minimax

- Perfect play for deterministic games
- Idea: choose move to position with highest minimax value = best achievable payoff against best play
- E.g., 2-ply game:

Properties of minimax

- Complete? Yes (if tree is finite)
- Optimal? Yes (against an optimal opponent)
- Time complexity? O(bm)
- Space complexity? O(bm) (depth-first exploration)
- For chess, b ≈ 35, m ≈100 for "reasonable" games exact solution completely infeasible
- Even tic-tac-toe is much too complex to diagram here, although it's small enough to implement.

Pruning the Search

- “If you have an idea that is surely bad, don't take the time to see how truly awful it is.” -- Pat Winston
- Minimax exponential with # of moves; not feasible in real-life
- But we can PRUNE some branches.
- Alpha-Beta pruning
- If it is clear that a branch can't improve on the value we already have, stop analysis.

Properties of α-β

- Pruning does not affect final result
- Good move ordering improves effectiveness of pruning
- With "perfect ordering," time complexity = O(bm/2)
doubles depth of search which can be carried out for a given level of resources.

- A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)

α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max

If v is worse than α, max will avoid it

prune that branch

Define β similarly for min

Why is it called α-β?The α-β algorithm found so far at any choice point along the path for

The α-β algorithm found so far at any choice point along the path for

"Informed" Search found so far at any choice point along the path for

- Alpha-Beta still not feasible for large game spaces.
- Can we improve on performance with domain knowledge?
- Yes -- if we have a useful heuristic for evaluating game states.
- Conceptually analogous to A* for single-agent search.

Resource limits found so far at any choice point along the path for

Suppose we have 100 secs, explore 104 nodes/sec106nodes per move

Standard approach:

- cutoff test:
e.g., depth limit (perhaps add quiescence search)

- evaluation function
= estimated desirability of position

Evaluation function found so far at any choice point along the path for

- Evaluation function or static evaluator is used to evaluate the “goodness” of a game position.
- Contrast with heuristic search where the evaluation function was a non-negative estimate of the cost from the start node to a goal and passing through the given node

- The zero-sum assumption allows us to use a single evaluation function to describe the goodness of a board with respect to both players.
- f(n) >> 0: position n good for me and bad for you
- f(n) << 0: position n bad for me and good for you
- f(n) near 0: position n is a neutral position
- f(n) = +infinity: win for me
- f(n) = -infinity: win for you

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt

Evaluation function examples found so far at any choice point along the path for

- Example of an evaluation function for Tic-Tac-Toe:
f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for you]

where a 3-length is a complete row, column, or diagonal

- Alan Turing’s function for chess
- f(n) = w(n)/b(n) where w(n) = sum of the point value of white’s pieces and b(n) = sum of black’s

- Most evaluation functions are specified as a weighted sum of position features:
f(n) = w1*feat1(n) + w2*feat2(n) + ... + wn*featk(n)

- Example features for chess are piece count, piece placement, squares controlled, etc.
- Deep Blue (which beat Gary Kasparov in 1997) had over 8000 features in its evaluation function

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt

Cutting off search found so far at any choice point along the path for

MinimaxCutoff is identical to MinimaxValue except

- Terminal? is replaced by Cutoff?
- Utility is replaced by Eval
Does it work in practice?

For chess: bm = 106, b=35 m=4

4-ply lookahead is a hopeless chess player!

- 4-ply ≈ human novice
- 8-ply ≈ typical PC, human master
- 12-ply ≈ Deep Blue, Kasparov

Deterministic games in practice found so far at any choice point along the path for

- Checkers: Chinook ended 40-year-reign of human world champion Marion Tinsley in 1994. Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.
- Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.
- Othello: human champions refuse to compete against computers, who are too good.
- Go: human champions refuse to compete against computers, who are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.

Games of chance found so far at any choice point along the path for

- Backgammon is a two-player game with uncertainty.
- Players roll dice to determine what moves to make.
- White has just rolled 5 and 6 and has four legal moves:
- 5-10, 5-11
- 5-11, 19-24
- 5-10, 10-16
- 5-11, 11-16

- Such games are good for exploring decision making in adversarial problems involving skill and luck.

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt

Decision-Making in Non-Deterministic Games found so far at any choice point along the path for

- Probable state tree will depend on chance as well as moves chosen
- Add "chance" notes to the max and min nodes.
- Compute expected values for chance nodes.

Game Trees with Chance Nodes found so far at any choice point along the path for

- Chance nodes (shown as circles) represent random events
- For a random event with N outcomes, each chance node has N distinct children; a probability is associated with each
- (For 2 dice, there are 21 distinct outcomes)
- Use minimax to compute values for MAX and MIN nodes
- Use expected values for chance nodes
- For chance nodes over a max node, as in C:
- expectimax(C) = ∑i(P(di) * maxvalue(i))
- For chance nodes over a min node:
- expectimin(C) = ∑i(P(di) * minvalue(i))

Min

Rolls

Max

Rolls

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt

Meaning of the evaluation function found so far at any choice point along the path for

A1 is best move

A2 is best move

2 outcomes with prob {.9, .1}

- Dealing with probabilities and expected values means we have to be careful about the “meaning” of values returned by the static evaluator.
- Note that a “relative-order preserving” change of the values would not change the decision of minimax, but could change the decision with chance nodes.
- Linear transformations are OK

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt

Summary found so far at any choice point along the path for

- Games are fun to work on!
- They illustrate several important points about AI
- perfection is unattainable must approximate
- good idea to think about what to think about

Download Presentation

Connecting to Server..