Finding equilibria in large sequential games of imperfect information

Andrew Gilpin and Tuomas Sandholm

Carnegie Mellon University

Computer Science Department

Motivation: Poker

- Poker is a wildly popular card game
  - This year’s World Series of Poker prize pool surpassed $103 million, including $56 million for the World Championship event
  - ESPN is broadcasting parts of the tournament
- Poker presents several challenges for AI
  - Imperfect information
  - Risk assessment and management
  - Deception (bluffing, slow-playing)
  - Counter-deception (calling a bluff)

The Deal

Round 1

Round 2

Round 3

Showdown

Sneak preview of results: Solving Rhode Island Hold’em poker

- Rhode Island Hold’em poker invented as a testbed for AI research [Shi & Littman 2001]
- Game tree has more than 3.1 billion nodes
- Previously, the best techniques did not scale to games this large
- Using our algorithm we have computed optimal strategies for this game
- This is the largest poker game solved to date by over four orders of magnitude

Outline of this talk

- Game-theoretic foundations: Equilibrium
- Model: Ordered games
- Abstraction mechanism: Information filters
- Strategic equivalence: Game isomorphisms
- Algorithm: GameShrink
- Solving Rhode Island Hold’em

Game Theory

- In multi-agent systems, an agent’s outcome depends on the actions of the other agents
- Consequently, an agent’s optimal action depends on the actions of the other agents
- Game theory provides guidance as to how an agent should act
- A game-theoretic equilibrium specifies a strategy for each agent such that no agent wishes to deviate
- Such an equilibrium always exists [Nash 1950]
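
The "no agent wishes to deviate" condition can be checked mechanically. The sketch below is illustrative only: it tests pure-strategy profiles in a small payoff table, whereas the existence result cited above concerns mixed strategies, so a game may have no pure equilibrium at all. The function name `is_pure_nash` and the Prisoner's Dilemma payoffs are assumptions made for this example, not part of the talk.

```python
from itertools import product


def is_pure_nash(payoffs, profile):
    """Return True if no player gains by unilaterally deviating from `profile`.

    `payoffs` maps each action profile (a tuple, one action per player)
    to a tuple of utilities, one per player.
    """
    n_players = len(profile)
    # Recover each player's action set from the keys of the payoff table.
    action_sets = [sorted({p[i] for p in payoffs}) for i in range(n_players)]
    for i in range(n_players):
        current = payoffs[profile][i]
        for alt in action_sets[i]:
            deviation = profile[:i] + (alt,) + profile[i + 1:]
            if payoffs[deviation][i] > current:
                return False  # player i strictly prefers to deviate
    return True


# Hypothetical example: Prisoner's Dilemma (C = cooperate, D = defect).
PD = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3),
    ("D", "D"): (-2, -2),
}

equilibria = [p for p in product("CD", repeat=2) if is_pure_nash(PD, p)]
print(equilibria)
```

In this toy game mutual defection is the only profile that survives the check.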

Complexity of computing equilibria

- Finding a Nash equilibrium is “A most fundamental computational problem whose complexity is wide open [and] together with factoring … the most important concrete open question on the boundary of P today” [Papadimitriou 2001]
- Even for games with only two players
- There are algorithms (requiring exponential time in the worst case) for computing Nash equilibria
- Good news: Two-person zero-sum matrix games can be solved in poly-time using linear programming
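
For reference, one standard way to write the linear program behind the "good news" bullet is sketched below. The notation is an assumption introduced here, not taken from the slides: $A \in \mathbb{R}^{m \times n}$ is the row player's payoff matrix, $x$ is the row player's mixed strategy, and $v$ is the payoff the row player can guarantee.

$$
\begin{aligned}
\max_{x,\,v}\quad & v \\
\text{s.t.}\quad & \sum_{i=1}^{m} x_i A_{ij} \ge v \quad \text{for all } j = 1,\dots,n, \\
& \sum_{i=1}^{m} x_i = 1, \qquad x_i \ge 0 \quad \text{for all } i = 1,\dots,m.
\end{aligned}
$$

The program has $m + 1$ variables and $n + 1$ constraints besides non-negativity, so its size is polynomial in the size of the matrix.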

What about sequential games?

- Sequential games involve turn-taking, moves of chance, and imperfect information
- Every sequential game can be converted into a simultaneous-move game
- Basic idea: Make one strategy in the simultaneous-move game for every possible action in every possible situation in the sequential game
- This approach leads to an exponential blowup in the number of strategies
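
The blowup can be made concrete with a small count. A pure strategy in the converted game picks one action at every decision situation, so the number of pure strategies is the product of the action counts. The sketch below is a toy illustration; the situation names (`holding_J` and so on) and the helper functions are hypothetical.

```python
from itertools import product
from math import prod


def count_pure_strategies(actions_per_situation):
    """Number of pure strategies: one action chosen at every situation."""
    return prod(actions_per_situation)


def enumerate_pure_strategies(situations):
    """Yield every pure strategy as a dict {situation: chosen action}."""
    names = list(situations)
    for choice in product(*(situations[name] for name in names)):
        yield dict(zip(names, choice))


# Hypothetical toy player: three situations, two actions each.
toy = {
    "holding_J": ["fold", "bet"],
    "holding_Q": ["fold", "bet"],
    "holding_K": ["fold", "bet"],
}
strategies = list(enumerate_pure_strategies(toy))
print(len(strategies))                    # 2 * 2 * 2 strategies

# With 40 binary situations, enumeration is already out of reach.
print(count_pure_strategies([2] * 40))    # 2 ** 40
```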

Sequence form representation

- The sequence form is an alternative representation that is more compact [Koller, Megiddo, von Stengel, Romanovskii]
- Using the sequence form, two-player zero-sum games with perfect recall can be solved in time polynomial in the size of the game tree
- But Texas Hold’em has 10^18 nodes

Our approach

- Rather than developing an equilibrium-finding algorithm per se, we introduce an automated abstraction technique that results in a smaller, equivalent game
- We prove that a Nash equilibrium in the smaller game corresponds to a Nash equilibrium in the original game
- Our technique applies to n-player sequential games with observed actions and ordered signals

Game with ordered signals (a.k.a. ordered game)

- Players I = {1,…,n}
- Stage games G = G_1,…,G_r
- Player label L
- Game-ending nodes ω
- Signal alphabet Θ (e.g. Θ = {2♠,…,A♦})
- Signal quantities κ = κ_1,…,κ_r and γ = γ_1,…,γ_r (e.g. κ = (0,1,1), γ = (1,0,0))
- Signal probability distribution p (e.g. uniform)
- Partial ordering ≥ of subsets of Θ (e.g. hand rank)
- Utility function u (increasing in private signals)

Information filters

- Observation: We can make games smaller by filtering the information a player receives
- Instead of observing a specific signal exactly, a player observes a filtered set of signals
- E.g. receiving the signal {A♠,A♣,A♥,A♦} instead of A♠
- Combining an ordered game and a valid information filter yields a filtered ordered game
- Prop. A filtered ordered game is a finite sequential game with perfect recall
- Corollary. If the filtered ordered game is two-person zero-sum, we can solve it in poly-time using linear programming
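
The {A♠,A♣,A♥,A♦} example can be sketched as a function from a signal to the set of signals the player is told it belongs to. This is a minimal illustration, assuming a filter that reveals only a card's rank. The names `rank_filter` and `is_partition` are hypothetical, and the check below only confirms that the filter partitions the deck, not that it is valid in the sense used in the talk.

```python
RANKS = "23456789TJQKA"
SUITS = "♠♣♥♦"
DECK = [rank + suit for rank in RANKS for suit in SUITS]


def rank_filter(card):
    """Hypothetical filter: the player learns only the rank of the card,
    i.e. observes the set of all cards sharing that rank."""
    return frozenset(c for c in DECK if c[0] == card[0])


def is_partition(filter_fn, signals):
    """Check that the filtered sets contain their own signal and together
    cover every signal exactly once."""
    blocks = {filter_fn(s) for s in signals}
    covered = [c for block in blocks for c in block]
    contains_self = all(s in filter_fn(s) for s in signals)
    return contains_self and sorted(covered) == sorted(signals)


print(sorted(rank_filter("A♠")))          # the four aces
print(len({rank_filter(c) for c in DECK}))  # number of distinct filtered signals
print(is_partition(rank_filter, DECK))
```

Under this filter the 52 exact signals collapse to 13 filtered ones, which is the sense in which filtering makes the game smaller.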

Filtered signal trees

- Every filtered ordered game has a corresponding filtered signal tree
- Each edge corresponds to the revelation of some signal
- Each path corresponds to the revelation of a set of signals
- Our algorithms operate directly on the filtered signal tree
- We never load the full game representation into memory

Ordered game isomorphic relation

- The ordered game isomorphic relation captures the notion of strategic symmetry between nodes
- We define the relationship recursively:
  - Two leaves are ordered game isomorphic if the payoffs to all players are the same at each leaf, for all action histories
  - Two internal nodes are ordered game isomorphic if they are siblings and there is a bijection between their children such that only ordered game isomorphic nodes are matched
- We can compute this relationship efficiently using dynamic programming and perfect matching computations in a bipartite graph
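
The recursive definition above can be sketched with memoization (the dynamic-programming part) and an augmenting-path bipartite matching (the bijection part). This is a simplified illustration under stated assumptions: the `Node` class is hypothetical, a leaf's `payoffs` tuple stands in for "payoffs for all action histories", and the requirement that the two nodes be siblings is not checked.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    payoffs: tuple = ()                       # meaningful only at leaves
    children: list = field(default_factory=list)


def has_perfect_matching(left, right, compatible):
    """Augmenting-path (Kuhn) matching: True if every element of `left`
    can be paired with a distinct compatible element of `right`."""
    if len(left) != len(right):
        return False
    match_of_right = [None] * len(right)      # index into `left`, or None

    def try_assign(i, seen):
        for j in range(len(right)):
            if j not in seen and compatible(left[i], right[j]):
                seen.add(j)
                if match_of_right[j] is None or try_assign(match_of_right[j], seen):
                    match_of_right[j] = i
                    return True
        return False

    return all(try_assign(i, set()) for i in range(len(left)))


def isomorphic(a, b, memo=None):
    """Recursive check; `memo` caches results for pairs of nodes."""
    memo = {} if memo is None else memo
    key = (id(a), id(b))
    if key not in memo:
        if not a.children and not b.children:      # two leaves
            memo[key] = a.payoffs == b.payoffs
        elif not a.children or not b.children:     # leaf vs internal node
            memo[key] = False
        else:                                      # two internal nodes
            memo[key] = has_perfect_matching(
                a.children, b.children,
                lambda x, y: isomorphic(x, y, memo))
    return memo[key]


# Hypothetical toy trees: `a` and `b` have the same children in a different order.
a = Node(children=[Node(payoffs=(1, -1)), Node(payoffs=(-1, 1))])
b = Node(children=[Node(payoffs=(-1, 1)), Node(payoffs=(1, -1))])
c = Node(children=[Node(payoffs=(1, -1)), Node(payoffs=(1, -1))])
print(isomorphic(a, b), isomorphic(a, c))
```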

Ordered game isomorphic abstraction transformation

- This operation transforms an existing information filter into a new filter that merges two ordered game isomorphic nodes
- The new filter yields a smaller, abstracted game
- Thm. If a strategy profile is a Nash equilibrium in the smaller, abstracted game, then it is a Nash equilibrium in the original game

GameShrink: Efficiently computing ordered game isomorphic abstraction transformations

- Recall: we have a dynamic program for determining if two nodes of the filtered signal tree are ordered game isomorphic
- Algorithm: Starting from the top of the filtered signal tree, perform the transformation where applicable
- Approximation algorithm: instead of requiring a perfect matching, require a matching with a penalty below some threshold

GameShrink: Efficiently computing ordered game isomorphic abstraction transformations

- The Union-Find data structure provides an efficient representation of the information filter
- Linear memory and almost linear time
- Can eliminate certain perfect matching computations by using easy-to-check necessary conditions
- Compact histogram databases for storing win/loss frequencies to speed up the checks
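
A textbook Union-Find with path compression and union by rank is sketched below to show why it suits this job: merging two signals into one filtered signal is a single `union`, and asking which filtered signal a card belongs to is a single `find`. This is a generic version of the data structure, not the authors' implementation, and the card labels are illustrative.

```python
class UnionFind:
    """Disjoint sets with path compression and union by rank."""

    def __init__(self, items):
        self.parent = {x: x for x in items}
        self.rank = {x: 0 for x in items}

    def find(self, x):
        """Return the representative of the set containing x."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return the new representative."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra                   # attach the shallower tree
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return ra


# Hypothetical usage: merge the four aces into one filtered signal.
uf = UnionFind(["A♠", "A♣", "A♥", "A♦", "K♠"])
uf.union("A♠", "A♣")
uf.union("A♥", "A♦")
uf.union("A♠", "A♦")
print(uf.find("A♥") == uf.find("A♣"))   # same filtered signal
print(uf.find("K♠") == uf.find("A♠"))   # still separate
```

Storage is one parent entry and one rank entry per signal, which matches the linear-memory claim above.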

Solving Rhode Island Hold’em poker

- GameShrink computes all ordered game isomorphic abstraction transformations in under one second
- Without abstraction, the linear program has 91,224,226 rows and columns
- After applying GameShrink, the linear program has only 1,237,238 rows and columns
- By solving the resulting linear program, we are able to compute optimal min-max strategies for this game
- CPLEX Barrier method takes 7 days, 17 hours and 25 GB RAM to solve
- This is the largest poker game solved to date by over four orders of magnitude
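
The size of the reduction follows directly from the figures above. The short calculation below only restates those numbers; the variable names are chosen for this example.

```python
rows_before = 91_224_226   # LP rows and columns without abstraction
rows_after = 1_237_238     # LP rows and columns after GameShrink

reduction = rows_before / rows_after
solve_hours = 7 * 24 + 17  # 7 days, 17 hours

print(f"about {reduction:.1f}x fewer rows and columns")
print(f"solve time: {solve_hours} hours")
```

The abstraction shrinks each dimension of the linear program by a factor of roughly 74.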

Comparison to previous research

- Rule-based
  - Limited success in even small poker games
- Simulation/Learning
  - Do not take multi-agent aspect into account
- Game-theoretic
  - Manual abstraction
    - “Approximating Game-Theoretic Optimal Strategies for Full-scale Poker”, Billings, Burch, Davidson, Holte, Schaeffer, Schauenberg, Szafron, IJCAI-03. Distinguished Paper Award.
  - Automated abstraction

Directions for future work

- Computing strategies for larger games
  - Requires approximation of solutions
- Tournament poker
- More than two players
- Other types of abstraction

Summary

- Introduced an automatic method for performing abstractions in a broad class of games
- Introduced information filters as a technique for working with games with imperfect information
- Developed an equilibrium-preserving abstraction transformation, along with an efficient algorithm
- Described a simple extension that yields an approximation algorithm for tackling even larger games
- Solved the largest poker game to date
- Playable on-line at http://www.cs.cmu.edu/~gilpin/gsi.html

Thank you very much for your interest
