
Influence Diagrams for Robust Decision Making in Multiagent Settings

Prashant Doshi, University of Georgia, USA (http://thinc.cs.uga.edu). Yifeng Zeng, Reader, Teesside Univ. (previously Assoc. Prof., Aalborg Univ.). Yingke Chen, postdoctoral student. Muthu Chandrasekaran, doctoral student.



Presentation Transcript


  1. Influence Diagrams for Robust Decision Making in Multiagent Settings

  2. Prashant Doshi University of Georgia, USA

  3. http://thinc.cs.uga.edu

  4. Yifeng Zeng Reader, Teesside Univ. Previously: Assoc Prof., Aalborg Univ. Yingke Chen Post doctoral student Muthu Chandrasekaran Doctoral student

  5. Influence diagram

  6. ID for decision making where the state may be partially observable. (Figure: chance nodes S and Oi, decision node Ai, utility node Ri.)

  7. How do we generalize IDs to multiagent settings?

  8. Adversarial tiger problem

  9. Multiagent influence diagram (MAID) (Koller & Milch 2001). (Figure: chance nodes Growli, Growlj, and Tiger loc; decision nodes Open or Listeni and Open or Listenj; utility nodes Ri and Rj.) MAIDs offer a richer representation for a game and may be transformed into a normal- or extensive-form game. A strategy of an agent is an assignment of a decision rule to every decision node of that agent.

  10. (Figure: the same MAID for the adversarial tiger problem.) A strategy profile is in Nash equilibrium if each agent's strategy in the profile is optimal given the others' strategies. The expected utility of a strategy profile to agent i is the sum of the expected utilities at each of i's decision nodes.
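The expected-utility computation on this slide can be sketched for a single decision node. The payoff numbers below are illustrative assumptions (typical tiger-problem values), not taken from the slides, and j's decision rule is held fixed so that i's expectation depends only on the tiger location:

```python
# Illustrative sketch: expected utility of a decision rule for agent i
# in the adversarial tiger problem, holding j's decision rule fixed.
# With one decision node per agent, the profile's EU to i reduces to a
# single expectation; payoff numbers are assumed, not from the slides.

tiger_prior = {"left": 0.5, "right": 0.5}
payoff_i = {  # (tiger location, i's action) -> reward to i
    ("left", "open_left"): -100, ("left", "open_right"): 10,
    ("right", "open_left"): 10,  ("right", "open_right"): -100,
    ("left", "listen"): -1,      ("right", "listen"): -1,
}

def expected_utility(action_i):
    """EU of a pure decision rule for i, marginalizing the tiger location."""
    return sum(p * payoff_i[(loc, action_i)] for loc, p in tiger_prior.items())

eu = {a: expected_utility(a) for a in ("open_left", "open_right", "listen")}
best_response = max(eu, key=eu.get)  # listening avoids the -45 expected loss
```

With multiple decision nodes (e.g., over time), the profile's expected utility to i is the sum of such terms, one per decision node of i.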

  11. Strategic relevance. Consider two strategy profiles that differ only in the decision rule at D'. A decision node D strategically relies on another decision node, D', if D's decision rule does not remain optimal in both profiles.

  12. Is there a way of finding all decision nodes that are strategically relevant to D using the graphical structure? Yes: s-reachability, which is analogous to d-separation for determining conditional independence in BNs.

  13. (Figure: the adversarial tiger MAID.) Evaluating whether a decision rule at D is optimal in a given strategy profile involves removing the decision nodes that are not s-relevant to D and transforming the remaining decision and utility nodes into chance nodes.

  14. What if the agents are using differing models of the same game to make decisions, or are uncertain about the mental models others are using?

  15. (Figure: the adversarial tiger MAID.) Let agent i believe with probability p that j will listen, and with probability 1-p that j will play the best response decision. Analogously, j believes that i will open a door with probability q and otherwise play the best response.

  16. Network of IDs (NID) (Gal & Pfeffer 2008). (Figure: a top-level block with edges weighted p and q leading to Block L (Listen) and Block O (Open).) Let agent i believe with probability p that j will likely listen and with 1-p that j will play the best response decision. Analogously, j believes that i will mostly open a door with probability q and otherwise play the best response.

  17. Top-level block of the NID: a MAID. (Figure: the adversarial tiger MAID.) Let agent i believe with probability p that j will likely listen and with 1-p that j will play the best response decision. Analogously, j believes that i will mostly open a door with probability q and otherwise play the best response.

  18. MAID representation for the NID. (Figure: chance nodes GrowlTLi, GrowlTLj, Tiger locTL, Mod[j;Di], Mod[i;Dj], BR[j]TL, BR[i]TL, OpenO, and ListenL; decision nodes Open or ListenTLi and Open or ListenTLj; utility nodes RTLi and RTLj.)

  19. MAIDs and NIDs: rich languages for games based on IDs that model problem structure by exploiting conditional independence.

  20. MAIDs and NIDs: the focus is on computing equilibria, which does not allow for a best response to a distribution of non-equilibrium behaviors. They also do not model dynamic games.

  21. Generalize IDs to dynamic interactions in multiagent settings

  22. Challenge: Other agents could be updating beliefs and changing strategies

  23. Level l I-ID. (Figure: chance nodes Tiger loci, Growli, and Open or Listenj; decision node Open or Listeni; utility node Ri; and model node Mj,l-1.) Model node Mj,l-1 contains the models of agent j at level l-1. The policy link (dashed arrow) gives the distribution over the other agent's actions given its models. Belief on Mj,l-1: Pr(Mj,l-1|s).

  24. Members of the model node. (Figure: model node Mj,l-1 containing models mj,l-1^1 and mj,l-1^2 with action nodes Aj1 and Aj2, connected to chance nodes S and Mod[Mj].) The different chance nodes Aj1 and Aj2 are solutions of the models mj,l-1^1 and mj,l-1^2; Mod[Mj] represents the different models of agent j. The models mj,l-1^1 and mj,l-1^2 could be I-IDs, IDs, or simple distributions.

  25. The CPT of the chance node Aj is a multiplexer: it assumes the distribution of each of the action nodes (Aj1, Aj2) depending on the value of Mod[Mj]. (Figure: model node Mj,l-1 with Aj, S, Mod[Mj], and the action nodes Aj1 and Aj2 of models mj,l-1^1 and mj,l-1^2.)
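As a sketch of the multiplexer semantics (the model names and action distributions below are invented for illustration): conditioned on the value of Mod[Mj], the node Aj simply copies the distribution of the corresponding action node, and marginalizing Mod[Mj] out gives i's prediction of j's behavior.

```python
# Sketch of the multiplexer CPT for chance node A_j: given Mod[M_j] = k,
# A_j takes the distribution of action node A_j^k, the solution of model
# m_{j,l-1}^k. Model names and distributions are illustrative assumptions.

action_dists = {
    "m1": {"listen": 0.9, "open": 0.1},  # solution of model m_j^1
    "m2": {"listen": 0.2, "open": 0.8},  # solution of model m_j^2
}

def a_j_cpt(mod):
    """P(A_j | Mod[M_j] = mod): the multiplexer selects one distribution."""
    return action_dists[mod]

def predicted_action_dist(belief_over_models):
    """Marginalize Mod[M_j] out to get i's prediction of j's action."""
    pred = {}
    for mod, p in belief_over_models.items():
        for a, q in a_j_cpt(mod).items():
            pred[a] = pred.get(a, 0.0) + p * q
    return pred

pred = predicted_action_dist({"m1": 0.7, "m2": 0.3})
# listen: 0.7*0.9 + 0.3*0.2 = 0.69; open: 0.7*0.1 + 0.3*0.8 = 0.31
```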

  26. Could I-IDs be extended over time? We must address the challenge

  27. Model update link. (Figure: two time slices. Each slice t has chance nodes St, Oit, and Ajt, model node Mj,l-1t, decision node Ait, and utility node Ri; the model update link connects Mj,l-1t to Mj,l-1t+1.)

  28. Interactive dynamic influence diagram (I-DID)

  29. How do we implement the model update link?

  30. (Figure: implementing the model update link. At time t, model node Mj,l-1t contains models mj,l-1t,1 and mj,l-1t,2 with action nodes Aj1 and Aj2 and observation nodes Oj1 and Oj2. At time t+1, model node Mj,l-1t+1 contains the updated models mj,l-1t+1,1 through mj,l-1t+1,4 with action nodes Aj1 through Aj4, selected via Mod[Mjt+1] given st.)

  31. (Figure: the same model update as on the previous slide.) These models differ in their initial beliefs, each of which is the result of j updating its beliefs due to its actions and possible observations.
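A minimal sketch of what the update computes, assuming j's level l-1 models are beliefs over the tiger location and assuming standard tiger-problem transition and observation numbers (0.85 growl accuracy; all values illustrative, not from the slides):

```python
# Sketch of the model update link: each level l-1 model of j carries a
# belief over the tiger location; after j acts and observes, Bayesian
# belief update yields one candidate model at t+1 per (action, observation)
# pair. Transition/observation numbers are assumed, not from the slides.

STATES = ("left", "right")

def transition(s, a):
    """Opening a door resets the tiger uniformly; listening leaves it put."""
    if a.startswith("open"):
        return {t: 0.5 for t in STATES}
    return {t: 1.0 if t == s else 0.0 for t in STATES}

def obs_prob(o, s, a):
    """Growl comes from the tiger's side 85% of the time after listening."""
    if a != "listen":
        return 0.5  # growls are uninformative after opening
    correct = "growl_left" if s == "left" else "growl_right"
    return 0.85 if o == correct else 0.15

def update_belief(b, a, o):
    """Standard POMDP belief update SE(b, a, o); None if Pr(o | b, a) = 0."""
    new_b = {}
    for s_next in STATES:
        pred = sum(b[s] * transition(s, a)[s_next] for s in STATES)
        new_b[s_next] = obs_prob(o, s_next, a) * pred
    z = sum(new_b.values())
    if z == 0:
        return None
    return {s: p / z for s, p in new_b.items()}

# One model at t expands into one updated model per (action, observation).
b0 = {"left": 0.5, "right": 0.5}
models_t1 = {(a, o): update_belief(b0, a, o)
             for a in ("listen", "open_left")
             for o in ("growl_left", "growl_right")}
```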

  32. Recap

  33. References: Prashant Doshi, Yifeng Zeng, and Qiongyu Chen, "Graphical Models for Interactive POMDPs: Representations and Solutions", Journal of AAMAS, 18(3):376-416, 2009. Daphne Koller and Brian Milch, "Multi-Agent Influence Diagrams for Representing and Solving Games", Games and Economic Behavior, 45(1):181-221, 2003. Ya'akov Gal and Avi Pfeffer, "Networks of Influence Diagrams: A Formalism for Representing Agents' Beliefs and Decision-Making Processes", Journal of AI Research, 33:109-147, 2008.

  34. How large is the behavioral model space?

  35. How large is the behavioral model space? General definition: a mapping from the agent's history of observations to its actions.

  36. How large is the behavioral model space? The set of mappings from observation histories to distributions over actions, Δ(Aj)^H, is uncountably infinite.
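A brief cardinality sketch of why, assuming the set H of observation histories is countably infinite (unbounded horizons): already the deterministic mappings from histories to actions inject the set of infinite binary sequences, so

```latex
\bigl|\Delta(A_j)^{H}\bigr| \;\ge\; \bigl|A_j^{H}\bigr| \;\ge\; 2^{\aleph_0},
```

which is uncountable.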

  37. How large is the behavioral model space? Let's assume computable models: the space becomes countable. But a very large portion of the model space is not computable!

  38. Daniel Dennett, philosopher and cognitive scientist: the intentional stance. Ascribe beliefs, preferences, and intent to explain others' actions (analogous to theory of mind, ToM).

  39. Organize the mental models: intentional models and subintentional models.

  40. Organize the mental models. Intentional models: e.g., POMDP, θj = ⟨bj, Aj, Tj, Ωj, Oj, Rj, OCj⟩ (using DIDs); BDI; ToM. Subintentional models. Frame (may give rise to recursive modeling).

  41. Organize the mental models. Intentional models: e.g., POMDP, θj = ⟨bj, Aj, Tj, Ωj, Oj, Rj, OCj⟩ (using DIDs); BDI; ToM. Subintentional models: e.g., Δ(Aj), a finite state controller, a plan. Frame.

  42. Finite model space grows as the interaction progresses

  43. Growth in the model space. The other agent may receive any one of |Ωj| observations, so the candidate model set grows as |Mj| → |Mj||Ωj| → |Mj||Ωj|^2 → ... → |Mj||Ωj|^t at times 0, 1, 2, ..., t.
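The recurrence on this slide is simple to state in code (a sketch; the counts assume every updated model remains distinct, i.e., no merging):

```python
# Model-space growth from the slide: |Mj| initial models, each branching
# on one of |Omega_j| possible observations per step, gives
# |Mj| * |Omega_j|**t candidate models after t steps (assuming no merging).

def num_models(m0: int, num_obs: int, t: int) -> int:
    return m0 * num_obs ** t

# E.g., 2 initial models and 2 observations (growl left / growl right):
growth = [num_models(2, 2, t) for t in range(5)]  # 2, 4, 8, 16, 32
```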

  44. Growth in the model space Exponential

  45. General model space is large and grows exponentially as the interaction progresses

  46. It would be great if we could compress this space! Lossless: no loss in value to the modeler. Lossy: flexible loss in value for greater compression.

  47. Model space compression is useful in many areas: • Sequential decision making in multiagent settings using I-DIDs • Bayesian plan recognition • Games of imperfect information

  48. A general and domain-independent approach for compression: establish equivalence relations that partition the model space, and retain representative models from each equivalence class.

  49. Approach #1: Behavioral equivalence (Rathanasabapathy et al. 2006; Pynadath & Marsella 2007). Intentional models whose complete solutions are identical are considered equivalent.
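Behavioral equivalence can be sketched as a partition-and-keep-representatives step. The toy "solver" below, mapping a scalar belief to a tiger-problem action, is invented purely for illustration:

```python
# Sketch of behavioral equivalence: intentional models whose complete
# solutions (policies) are identical fall into the same class; keeping
# one representative per class yields a behaviorally minimal set.
# The models and the toy solver below are illustrative assumptions.

def minimal_set(models, solve):
    """Partition models by their solution; keep one representative each."""
    classes = {}
    for m in models:
        policy = solve(m)              # complete solution of the model
        classes.setdefault(policy, m)  # first model seen is the representative
    return list(classes.values())

# Toy solver: models are belief values; beliefs far enough from certainty
# happen to yield the same "listen" policy in this illustration.
solve = lambda b: "listen" if 0.1 < b < 0.9 else "open"
reps = minimal_set([0.05, 0.3, 0.5, 0.7, 0.95], solve)  # -> [0.05, 0.3]
```

Here five candidate models collapse to two behaviorally distinct representatives, which is the "behaviorally minimal set" of the next slide.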

  50. Approach #1: Behavioral equivalence. (Figure: the behaviorally minimal set of models.)
