
Influence Diagrams for Robust Decision Making in Multiagent Settings

Prashant Doshi, University of Georgia, USA (http://thinc.cs.uga.edu). Yifeng Zeng, Reader, Teesside Univ. (previously Assoc. Prof., Aalborg Univ.). Yingke Chen, postdoctoral student. Muthu Chandrasekaran, doctoral student.



Presentation Transcript


  1. Influence Diagrams for Robust Decision Making in Multiagent Settings

  2. Prashant Doshi University of Georgia, USA

  3. http://thinc.cs.uga.edu

  4. Yifeng Zeng Reader, Teesside Univ. Previously: Assoc Prof., Aalborg Univ. Yingke Chen Post doctoral student Muthu Chandrasekaran Doctoral student

  5. Influence diagram

  6. ID for decision making where the state may be partially observable. (Figure: chance nodes S and Oi, decision node Ai, utility node Ri.)

  7. How do we generalize IDs to multiagent settings?

  8. Adversarial tiger problem

  9. Multiagent influence diagram (MAID) (Koller & Milch 2001). (Figure: chance nodes Growli, Growlj, and Tiger loc; decision nodes Open or Listeni and Open or Listenj; utility nodes Ri and Rj.) MAIDs offer a richer representation for a game and may be transformed into a normal- or extensive-form game. A strategy of an agent is an assignment of a decision rule to every decision node of that agent.

  10. (Figure: the same MAID for the adversarial tiger problem.) A strategy profile is in Nash equilibrium if each agent's strategy in the profile is optimal given the others' strategies. The expected utility of a strategy profile to agent i is the sum of the expected utilities at each of i's decision nodes.
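The expected-utility computation on this slide can be sketched for a single decision node. The payoff numbers below are illustrative assumptions (typical tiger-problem values), not taken from the slides, and j's decision rule is held fixed so that i's expectation depends only on the tiger location:

```python
# Illustrative sketch: expected utility of a decision rule for agent i
# in the adversarial tiger problem, holding j's decision rule fixed.
# With one decision node per agent, the profile's EU to i reduces to a
# single expectation; payoff numbers are assumed, not from the slides.

tiger_prior = {"left": 0.5, "right": 0.5}
payoff_i = {  # (tiger location, i's action) -> reward to i
    ("left", "open_left"): -100, ("left", "open_right"): 10,
    ("right", "open_left"): 10,  ("right", "open_right"): -100,
    ("left", "listen"): -1,      ("right", "listen"): -1,
}

def expected_utility(action_i):
    """EU of a pure decision rule for i, marginalizing the tiger location."""
    return sum(p * payoff_i[(loc, action_i)] for loc, p in tiger_prior.items())

eu = {a: expected_utility(a) for a in ("open_left", "open_right", "listen")}
best_response = max(eu, key=eu.get)  # listening avoids the -45 expected loss
```

With multiple decision nodes (e.g., over time), the profile's expected utility to i is the sum of such terms, one per decision node of i.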

  11. Strategic relevance. Consider two strategy profiles that differ only in the decision rule at D'. A decision node D strategically relies on another decision node, D', if D's decision rule does not remain optimal in both profiles.

  12. Is there a way of finding all decision nodes that are strategically relevant to D using the graphical structure? Yes: s-reachability, which is analogous to d-separation for determining conditional independence in BNs.

  13. (Figure: the adversarial tiger MAID.) Evaluating whether a decision rule at D is optimal in a given strategy profile involves removing the decision nodes that are not s-relevant to D and transforming the remaining decision and utility nodes into chance nodes.

  14. What if the agents are using differing models of the same game to make decisions, or are uncertain about the mental models others are using?

  15. (Figure: the adversarial tiger MAID.) Let agent i believe with probability p that j will listen, and with probability 1-p that j will play the best response decision. Analogously, j believes that i will open a door with probability q and otherwise play the best response.

  16. Network of IDs (NID) (Gal & Pfeffer 2008). (Figure: a top-level block with edges weighted p and q leading to Block L (Listen) and Block O (Open).) Let agent i believe with probability p that j will likely listen and with 1-p that j will play the best response decision. Analogously, j believes that i will mostly open a door with probability q and otherwise play the best response.

  17. Top-level block of the NID: a MAID. (Figure: the adversarial tiger MAID.) Let agent i believe with probability p that j will likely listen and with 1-p that j will play the best response decision. Analogously, j believes that i will mostly open a door with probability q and otherwise play the best response.

  18. MAID representation for the NID. (Figure: chance nodes GrowlTLi, GrowlTLj, Tiger locTL, Mod[j;Di], Mod[i;Dj], BR[j]TL, BR[i]TL, OpenO, and ListenL; decision nodes Open or ListenTLi and Open or ListenTLj; utility nodes RTLi and RTLj.)

  19. MAIDs and NIDs: rich languages for games based on IDs that model problem structure by exploiting conditional independence.

  20. MAIDs and NIDs: the focus is on computing equilibria, which does not allow for a best response to a distribution of non-equilibrium behaviors. They also do not model dynamic games.

  21. Generalize IDs to dynamic interactions in multiagent settings

  22. Challenge: Other agents could be updating beliefs and changing strategies

  23. Level l I-ID. (Figure: chance nodes Tiger loci, Growli, and Open or Listenj; decision node Open or Listeni; utility node Ri; and model node Mj,l-1.) Model node Mj,l-1 contains the models of agent j at level l-1. The policy link (dashed arrow) gives the distribution over the other agent's actions given its models. Belief on Mj,l-1: Pr(Mj,l-1|s).

  24. Members of the model node. (Figure: model node Mj,l-1 containing models mj,l-1^1 and mj,l-1^2 with action nodes Aj1 and Aj2, connected to chance nodes S and Mod[Mj].) The different chance nodes Aj1 and Aj2 are solutions of the models mj,l-1^1 and mj,l-1^2; Mod[Mj] represents the different models of agent j. The models mj,l-1^1 and mj,l-1^2 could be I-IDs, IDs, or simple distributions.

  25. The CPT of the chance node Aj is a multiplexer: it assumes the distribution of each of the action nodes (Aj1, Aj2) depending on the value of Mod[Mj]. (Figure: model node Mj,l-1 with Aj, S, Mod[Mj], and the action nodes Aj1 and Aj2 of models mj,l-1^1 and mj,l-1^2.)
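As a sketch of the multiplexer semantics (the model names and action distributions below are invented for illustration): conditioned on the value of Mod[Mj], the node Aj simply copies the distribution of the corresponding action node, and marginalizing Mod[Mj] out gives i's prediction of j's behavior.

```python
# Sketch of the multiplexer CPT for chance node A_j: given Mod[M_j] = k,
# A_j takes the distribution of action node A_j^k, the solution of model
# m_{j,l-1}^k. Model names and distributions are illustrative assumptions.

action_dists = {
    "m1": {"listen": 0.9, "open": 0.1},  # solution of model m_j^1
    "m2": {"listen": 0.2, "open": 0.8},  # solution of model m_j^2
}

def a_j_cpt(mod):
    """P(A_j | Mod[M_j] = mod): the multiplexer selects one distribution."""
    return action_dists[mod]

def predicted_action_dist(belief_over_models):
    """Marginalize Mod[M_j] out to get i's prediction of j's action."""
    pred = {}
    for mod, p in belief_over_models.items():
        for a, q in a_j_cpt(mod).items():
            pred[a] = pred.get(a, 0.0) + p * q
    return pred

pred = predicted_action_dist({"m1": 0.7, "m2": 0.3})
# listen: 0.7*0.9 + 0.3*0.2 = 0.69; open: 0.7*0.1 + 0.3*0.8 = 0.31
```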

  26. Could I-IDs be extended over time? We must address the challenge

  27. Model update link. (Figure: two time slices. Each slice t has chance nodes St, Oit, and Ajt, model node Mj,l-1t, decision node Ait, and utility node Ri; the model update link connects Mj,l-1t to Mj,l-1t+1.)

  28. Interactive dynamic influence diagram (I-DID)

  29. How do we implement the model update link?

  30. (Figure: implementing the model update link. At time t, model node Mj,l-1t contains models mj,l-1t,1 and mj,l-1t,2 with action nodes Aj1 and Aj2 and observation nodes Oj1 and Oj2. At time t+1, model node Mj,l-1t+1 contains the updated models mj,l-1t+1,1 through mj,l-1t+1,4 with action nodes Aj1 through Aj4, selected via Mod[Mjt+1] given st.)

  31. (Figure: the same model update as on the previous slide.) These models differ in their initial beliefs, each of which is the result of j updating its beliefs due to its actions and possible observations.
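A minimal sketch of what the update computes, assuming j's level l-1 models are beliefs over the tiger location and assuming standard tiger-problem transition and observation numbers (0.85 growl accuracy; all values illustrative, not from the slides):

```python
# Sketch of the model update link: each level l-1 model of j carries a
# belief over the tiger location; after j acts and observes, Bayesian
# belief update yields one candidate model at t+1 per (action, observation)
# pair. Transition/observation numbers are assumed, not from the slides.

STATES = ("left", "right")

def transition(s, a):
    """Opening a door resets the tiger uniformly; listening leaves it put."""
    if a.startswith("open"):
        return {t: 0.5 for t in STATES}
    return {t: 1.0 if t == s else 0.0 for t in STATES}

def obs_prob(o, s, a):
    """Growl comes from the tiger's side 85% of the time after listening."""
    if a != "listen":
        return 0.5  # growls are uninformative after opening
    correct = "growl_left" if s == "left" else "growl_right"
    return 0.85 if o == correct else 0.15

def update_belief(b, a, o):
    """Standard POMDP belief update SE(b, a, o); None if Pr(o | b, a) = 0."""
    new_b = {}
    for s_next in STATES:
        pred = sum(b[s] * transition(s, a)[s_next] for s in STATES)
        new_b[s_next] = obs_prob(o, s_next, a) * pred
    z = sum(new_b.values())
    if z == 0:
        return None
    return {s: p / z for s, p in new_b.items()}

# One model at t expands into one updated model per (action, observation).
b0 = {"left": 0.5, "right": 0.5}
models_t1 = {(a, o): update_belief(b0, a, o)
             for a in ("listen", "open_left")
             for o in ("growl_left", "growl_right")}
```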

  32. Recap

  33. References: Prashant Doshi, Yifeng Zeng, and Qiongyu Chen, "Graphical Models for Interactive POMDPs: Representations and Solutions", Journal of AAMAS, 18(3):376-416, 2009. Daphne Koller and Brian Milch, "Multi-Agent Influence Diagrams for Representing and Solving Games", Games and Economic Behavior, 45(1):181-221, 2003. Ya'akov Gal and Avi Pfeffer, "Networks of Influence Diagrams: A Formalism for Representing Agents' Beliefs and Decision-Making Processes", Journal of AI Research, 33:109-147, 2008.

  34. How large is the behavioral model space?

  35. How large is the behavioral model space? General definition: a mapping from the agent's history of observations to its actions.

  36. How large is the behavioral model space? The set of mappings from observation histories to distributions over actions, Δ(Aj)^H, is uncountably infinite.
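A brief cardinality sketch of why, assuming the set H of observation histories is countably infinite (unbounded horizons): already the deterministic mappings from histories to actions inject the set of infinite binary sequences, so

```latex
\bigl|\Delta(A_j)^{H}\bigr| \;\ge\; \bigl|A_j^{H}\bigr| \;\ge\; 2^{\aleph_0},
```

which is uncountable.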

  37. How large is the behavioral model space? Let's assume computable models: the space becomes countable. But a very large portion of the model space is not computable!

  38. Daniel Dennett, philosopher and cognitive scientist: the intentional stance. Ascribe beliefs, preferences, and intent to explain others' actions (analogous to theory of mind, ToM).

  39. Organize the mental models: intentional models and subintentional models.

  40. Organize the mental models. Intentional models: e.g., POMDP, θj = ⟨bj, Aj, Tj, Ωj, Oj, Rj, OCj⟩ (using DIDs); BDI; ToM. Subintentional models. Frame (may give rise to recursive modeling).

  41. Organize the mental models. Intentional models: e.g., POMDP, θj = ⟨bj, Aj, Tj, Ωj, Oj, Rj, OCj⟩ (using DIDs); BDI; ToM. Subintentional models: e.g., Δ(Aj), a finite state controller, a plan. Frame.

  42. Finite model space grows as the interaction progresses

  43. Growth in the model space. The other agent may receive any one of |Ωj| observations, so the candidate model set grows as |Mj| → |Mj||Ωj| → |Mj||Ωj|^2 → ... → |Mj||Ωj|^t at times 0, 1, 2, ..., t.
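The recurrence on this slide is simple to state in code (a sketch; the counts assume every updated model remains distinct, i.e., no merging):

```python
# Model-space growth from the slide: |Mj| initial models, each branching
# on one of |Omega_j| possible observations per step, gives
# |Mj| * |Omega_j|**t candidate models after t steps (assuming no merging).

def num_models(m0: int, num_obs: int, t: int) -> int:
    return m0 * num_obs ** t

# E.g., 2 initial models and 2 observations (growl left / growl right):
growth = [num_models(2, 2, t) for t in range(5)]  # 2, 4, 8, 16, 32
```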

  44. Growth in the model space Exponential

  45. General model space is large and grows exponentially as the interaction progresses

  46. It would be great if we could compress this space! Lossless: no loss in value to the modeler. Lossy: flexible loss in value for greater compression.

  47. Model space compression is useful in many areas: • Sequential decision making in multiagent settings using I-DIDs • Bayesian plan recognition • Games of imperfect information

  48. A general and domain-independent approach for compression: establish equivalence relations that partition the model space, and retain representative models from each equivalence class.

  49. Approach #1: Behavioral equivalence (Rathanasabapathy et al. 2006; Pynadath & Marsella 2007). Intentional models whose complete solutions are identical are considered equivalent.
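Behavioral equivalence can be sketched as a partition-and-keep-representatives step. The toy "solver" below, mapping a scalar belief to a tiger-problem action, is invented purely for illustration:

```python
# Sketch of behavioral equivalence: intentional models whose complete
# solutions (policies) are identical fall into the same class; keeping
# one representative per class yields a behaviorally minimal set.
# The models and the toy solver below are illustrative assumptions.

def minimal_set(models, solve):
    """Partition models by their solution; keep one representative each."""
    classes = {}
    for m in models:
        policy = solve(m)              # complete solution of the model
        classes.setdefault(policy, m)  # first model seen is the representative
    return list(classes.values())

# Toy solver: models are belief values; beliefs far enough from certainty
# happen to yield the same "listen" policy in this illustration.
solve = lambda b: "listen" if 0.1 < b < 0.9 else "open"
reps = minimal_set([0.05, 0.3, 0.5, 0.7, 0.95], solve)  # -> [0.05, 0.3]
```

Here five candidate models collapse to two behaviorally distinct representatives, which is the "behaviorally minimal set" of the next slide.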

  50. Approach #1: Behavioral equivalence. (Figure: the behaviorally minimal set of models.)
