1 / 35

Temporal Action-Graph Games: A New Representation for Dynamic Games

Temporal Action-Graph Games: A New Representation for Dynamic Games. Albert Xin Jiang University of British Columbia. Kevin Leyton-Brown University of British Columbia. Avi Pfeffer Charles River Analytics. Game Theory In One Slide . A game:

gaylord
Download Presentation

Temporal Action-Graph Games: A New Representation for Dynamic Games

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Temporal Action-Graph Games:A New Representation for Dynamic Games Albert Xin Jiang University of British Columbia Kevin Leyton-Brown University of British Columbia Avi Pfeffer Charles River Analytics

  2. Game Theory In One Slide  • A game: • an interaction between two or more self-interested agents • each agent independently chooses a strategy • each agent derives utility from the resulting strategy profile • Strategies: • simultaneous-move games: choosing from a set of actions • dynamic games: choosing actions at multiple points in time; conditioned on observations • can randomize over actions • Reasoning about games: • Often involves computation of solution concepts e.g. Nash equilibrium

  3. Representations of Games

  4. Representations of Games

  5. Representations of Games

  6. Representations of Games

  7. Overview • AGGs • TAGGs • Computation • Experiments

  8. AGGs • played on a set of action nodes • each agent chooses an action node from a subset of action nodes • for each action node, an action count is tallied • utility dependence expressed by the action graph • utility of an agent depends only on • action chosen by the agent • action counts on the neighbors of the chosen action (configuration) • representation is compact when action graph has small in-degrees

  9. Example: simultaneous-move tollbooth game L1 L2 L3 • tollbooth with 3 lanes • 5 cars arrive • cars choose lanes simultaneously • utility depends on number of cars choosing same lane • context-specific independence (CSI): different independencies under diff context (player’s own action) • anonymity: utility depends on the numbers of agents taking certain actions, not their identities

  10. Example: Dynamic Tollbooth Game 0 2 3 0 0 0 3 4 3 • tollbooth with 3 lanes • 20 cars arrive in 4 waves of 5 cars each • in each wave, cars choose lanes simultaneously • driver can observe number of cars in each lane • utility depends on number of cars choosing same lane, either before him or at the same time • Extending AGGs to multiple time steps: • Action counts accumulate over time • Need to be able to specify agents’ observations • Model uncertainty using chance variables

  11. Defining TAGGs • A TAGG is a tuple • N: set of players • T: duration • set of actions • set of chance variables • set of decisions • set of utility functions

  12. TAGGs • A decision D • Action set: a subset of • Observation set O[D]: of actions, decisions and chance vars • Set of payoff times pt(D) • Utility function • One for each action at each time step • Set of parents • Utility depends only on its parents’ instantiation at time • Evaluated at payoff times of decisions

  13. Strategies • A behavior strategy at decision D is a mapping from an instantiation of O[D] at time t(D)-1 to a distribution over its action set. • A behavior strategy for player i is a tuple consisting of a behavior strategy for each of her decisions • Behavior strategy profile: tuple of behavior strategies for all players

  14. Induced Bayes Net • Induced BN of a TAGG given a strategy profile • Formally describes how the TAGG is played out • Decisions, chance variables and utilities correspond to random variables in the BN • Action counts are time-dependent: we have a separate action count variable for each action at each time step • Decision-payoff variable as utility of decision D received at payoff time • Expected utility of a player is the sum of the expected values of her decisions’ decision-payoff variables. • Can similarly define induced MAID of a TAGG

  15. Induced BN / MAID: tollbooth example

  16. TAGGs and MAIDs • Any TAGG is utility-equivalent to its induced MAID • However, induced MAID (and BN) has large in-degree; exponentially larger representation size • For the other direction, any MAID can be efficiently encoded as a TAGG with same space complexity

  17. Overview • AGGs • TAGGs • Computation • Experiments

  18. Computing Expected Utility • Computing expected utility of a player, given a behavior strategy profile • An essential step in many game-theoretic computations • Can be cast as inference problem on the induced BN: compute expected values of • Can apply standard BN inference algorithm • TAGGs have additional structure; can be exploited to speedup computation

  19. Exploiting anonymity • Induced BN has large in-degrees for action-count variables • Their CPDs are counting functions with causal independence structure [Heckerman&Breese 1996] • Can reduce in-degree by transforming the BN • Create nodes representing intermediate counts

  20. Exploiting Temporal Structure • A network satisfies the Markov property if parents of variables at time t are at t or t-1. • Parts of the transformed BN (action counts) satisfy MP • Can transform it to one satisfying MP by duplicating variables • Adapt the interface algorithm for Dynamic BNs • interface: set of variables in time t that have children in time t+1 • d-separates “past” from “future” • algorithm eliminates variables in the temporal order; keeping distribution over interface variables

  21. Tollbooth example • interface at t: the action count variables at time t

  22. Computation, Ctd • Further exploiting the structure of transformed BN within the same time step • We can exploit CSI for further speedup • Our algorithm computes EU in poly time if • the number of interface variables at each time are bounded • inference over the chance variables can be done efficiently • Our methods an be applied to speedup computation of Nash equilibria

  23. Overview • AGGs • TAGGs • Computation • Experiments

  24. Experiments • Compute EU for tollbooth games (3 lanes) • Approach 1: standard clique tree algorithm on induced BN • Approach 2: same clique tree algorithm on transformed BN • Approach 3: our algorithm

  25. Experiments: Tollbooth (5 cars per time step, varying T)

  26. Experiments: Tollbooth(T=3, varying # cars per time step)

  27. Conclusions • Temporal Action-Graph Games (TAGGs) • novel compact representation for dynamic games • extends AGGs to dynamic setting • compactly express wider variety of utility structure, including CSI and anonymity • exploit such structure for efficient computation • expected utility • Nash equilibrium

  28. 2 3 0 3 4 3

  29. Game Theory In One Slide  • A game: • an interaction between two or more self-interested agents • each agent independently chooses a strategy • each agent derives utility from the resulting strategy profile • Strategies: • simultaneous-move games: choosing from a set of actions • dynamic games: choosing actions at multiple points in time; conditioned on observations • can randomize over actions • Best Response: • I play a strategy that maximizes my own utility, given a particular strategy profile for the other agents • Nash Equilibrium: • a strategy profile with the property that everyagent’s strategy is a best response to the strategies of the others

  30. Representations of Games • Goal: modeling real-world systems • Standard Representations • normal form for simultaneous-move games: specify utilities for every action profile • extensive form (game trees) for dynamic games: specify utilities for every possible sequence of actions • such representations ignore structure in the games’ utilities • sizes grow exponentially in parameters of the games • CompactRepresentations • simultaneous-move: Graphical Games [Kearns, Littman & Singh 2001], Action-Graph Games (AGGs) [Bhat & Leyton-Brown 2004][Jiang & Leyton-Brown 2006] • dynamic: Multi-Agent Influence Diagrams (MAIDs) [Koller & Milch 2001]

  31. Representing Dynamic Games • Multi-Agent Influence Diagrams(MAIDs): directed acyclic graph over • Chance nodes • Decision nodes: a player chooses an action, observing instantiation of parents • Utility nodes: utility for a player, function of instantiation of parents • MAIDs are compact when utility functions exhibit strict independencies • Does not capture other kinds of structure

  32. Temporal Action-Graph Games • TAGGs: compact representation for dynamic games • Extension of AGGs to dynamic setting • Can efficiently encode arbitrary MAIDs • Can compactly represent dynamic games with CSI and/or anonymity structure • Efficient computation

  33. TAGGs at a high level • Played over a series of time steps 1 … T • At each time step, an AGG is played by a set of players, on a set of action nodes • Action counts on the action nodes are accumulated over time • Model uncertainty using chance variables • Decisions: player at each decision may observe a set of previous decisions, chance variables and action counts • Utility functions: action-specific and time-specific

  34. TAGG: tollbooth example • N correspond to the cars • T=4 • One action node for each lane • At each time step, we have 5 decisions • Action set is the entire set • Payoff time same as decision time: pt(D) = {t(D)} • Utility function has A as its only parent • Size of representation

More Related