
General Opponent* Modeling for Improving Agent-Human Interaction

This paper explores the challenges and techniques of general opponent modeling in agent-human interaction, with a focus on negotiation scenarios. It discusses the use of data collection, analysis, and prioritization in developing culture-sensitive agents and highlights the benefits of modeling opponent behavior for improving agent bargaining and decision-making. The study presents the KBAgent as an example and compares its performance to other agents, demonstrating the effectiveness of general opponent modeling in agent negotiations.


Presentation Transcript


  1. General Opponent* Modeling for Improving Agent-Human Interaction • Sarit Kraus, Dept. of Computer Science, Bar-Ilan University • AMEC, May 2010

  2. Motivation • Negotiation is an extremely important form of human interaction

  3. Computers interacting with people • Computer has the control • Computer persuades human • Human has the control


  5. Culture-sensitive agents • Development of a standardized agent to be used in collecting data for studies on culture and negotiation • Buyer/seller agents negotiate well across cultures • PURB agent

  6. Semi-autonomous cars

  7. Medical applications Gertner Institute for Epidemiology and Health Policy Research

  8. Automated care-taker • Agent: I scheduled an appointment for you at the physiotherapist this afternoon. • Person: I will be too tired in the afternoon!!! • (The agent tries to reschedule and fails.) • Agent: The physiotherapist has no other available appointments this week. How about resting before the appointment?

  9. Security applications • Collect • Update • Analyze • Prioritize

  10. People often follow suboptimal decision strategies • Irrationalities attributed to: • sensitivity to context • lack of knowledge of own preferences • the effects of complexity • the interplay between emotion and cognition • the problem of self control • bounded rationality • General opponent* modeling

  11. Challenges of human opponent* modeling • Small number of examples • difficult to collect data on people • Noisy data • people are inconsistent (the same person may act differently) • people are diverse

  12. Agenda • Multi-attribute multi-round bargaining: KBAgent • Revelation + bargaining: SIGAL • Optimization problems: AAT-based learning • Coordination with people: focal-point-based learning

  13. QOAgent [LIN08] • Multi-issue, multi-attribute, with incomplete information • Domain independent • Implemented several tactics and heuristics • qualitative in nature • Non-deterministic behavior, also via means of randomization • Played at least as well as people • Is it possible to improve the QOAgent? Yes, if you have data • Reference: R. Lin, S. Kraus, J. Wilkenfeld, and J. Barry. Negotiating with bounded rational agents in environments with incomplete information using an automated agent. Artificial Intelligence, 172(6-7):823–851, 2008.

  14. KBAgent [OS09] • Multi-issue, multi-attribute, with incomplete information • Domain independent • Implemented several tactics and heuristics • qualitative in nature • Non-deterministic behavior, also via means of randomization • Using data from previous interactions • Reference: Y. Oshrat, R. Lin, and S. Kraus. Facing the challenge of human-agent negotiations via effective general opponent modeling. In AAMAS, 2009.

  15. Example scenario • Employer and job candidate • Objective: reach an agreement over hiring terms after a successful interview

  16. General opponent modeling • Challenge: sparse data from past negotiation sessions between people • Technique: Kernel Density Estimation
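
As a rough illustration of the technique named on this slide, the sketch below fits a one-dimensional Gaussian kernel density estimate to a handful of past observations. The feature (utility of accepted offers), the sample values, and the bandwidth are all assumptions made for the example; the slides do not say which quantities the KBAgent actually smooths.

```python
import math


def make_gaussian_kde(samples, bandwidth):
    """Return a density function estimated from 1-D samples with a Gaussian kernel."""
    if not samples:
        raise ValueError("need at least one sample")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    # Normalizing constant so that the estimate integrates to 1.
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observed sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical data: utility (to the human) of offers accepted in past sessions.
accepted_utilities = [0.55, 0.60, 0.62, 0.70, 0.75, 0.80]
density = make_gaussian_kde(accepted_utilities, bandwidth=0.08)

# Offers near the observed cluster get a higher density than distant ones.
print(density(0.65), density(0.20))
```

With only a few sessions available, the kernel spreads each observation over its neighborhood, which is one way to get a usable estimate from sparse data.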

  17. General opponent modeling • Estimate the likelihood that the other party will: • accept an offer • make an offer • and its expected average utility • The estimation is done separately for each possible agent type • The type of a negotiator is determined using a simple Bayes' classifier • Use the estimation for decision making
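
The type-classification step can be sketched as a plain Bayes update followed by a type-weighted expectation. This is a minimal sketch under stated assumptions: the type names (`tough`, `soft`), the likelihood table, and the acceptance probabilities are hypothetical, and a rejected offer is assumed to be worth 0.

```python
def posterior_over_types(priors, likelihoods, observations):
    """Bayes' rule with observations treated as independent given the type."""
    scores = {}
    for agent_type, prior in priors.items():
        score = prior
        for obs in observations:
            score *= likelihoods[agent_type][obs]
        scores[agent_type] = score
    total = sum(scores.values())
    if total == 0:
        raise ValueError("observations are impossible under every type")
    return {agent_type: s / total for agent_type, s in scores.items()}


def expected_offer_utility(offer_utility, accept_prob_by_type, posterior):
    """Expected utility of an offer, assuming a rejected offer yields 0."""
    p_accept = sum(posterior[t] * accept_prob_by_type[t] for t in posterior)
    return p_accept * offer_utility


# Hypothetical types and per-type likelihoods of observed responses.
priors = {"tough": 0.5, "soft": 0.5}
likelihoods = {
    "tough": {"reject": 0.8, "accept": 0.2},
    "soft": {"reject": 0.4, "accept": 0.6},
}
posterior = posterior_over_types(priors, likelihoods, ["reject", "reject"])
value = expected_offer_utility(100.0, {"tough": 0.3, "soft": 0.9}, posterior)
print(posterior, value)
```

In a fuller version the per-type acceptance probabilities would come from the density estimates learned from past sessions rather than from a fixed table.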

  18. KBAgent as the job candidate • Best result: 20,000; Project manager; with leased car; 20% pension funds; fast promotion; 8 hours • Offers exchanged (Human, KBAgent): • 20,000; Team manager; with leased car; pension 20%; slow promotion; 9 hours • 12,000; Programmer; without leased car; pension 10%; fast promotion; 10 hours • 20,000; Project manager; without leased car; pension 20%; slow promotion; 9 hours

  19. KBAgent as the job candidate • Best agreement: 20,000; Project manager; with leased car; 20% pension funds; fast promotion; 8 hours • Offers exchanged (Human, KBAgent): • 20,000; Team manager; with leased car; pension 20%; slow promotion; 9 hours • 12,000; Programmer; without leased car; pension 10%; fast promotion; 10 hours • Round 7: 20,000; Programmer; with leased car; pension 10%; slow promotion; 9 hours

  20. Experiments • 172 graduate and undergraduate students in Computer Science • People were told they might be playing against a computer agent or a person • Scenarios: • Employer-Employee • Tobacco Convention: England vs. Zimbabwe • Learned from 20 human-human games

  21. Results: Comparing KBAgent to others

  22. Main results • In comparison to the QOAgent: • The KBAgent achieved higher utility values than the QOAgent • More agreements were accepted by people • The sum of utility values (social welfare) was higher when the KBAgent was involved • The KBAgent achieved significantly higher utility values than people • Results demonstrate the proficiency of the negotiation done by the KBAgent • General opponent* modeling improves agent bargaining

  23. Automated care-taker • Agent: I arranged for you to go to the physiotherapist in the afternoon. • Person: I will be too tired in the afternoon!!! • Agent's question: How can I convince him? What argument should I give?

  24. Security applications • How should I convince him to provide me with information?

  25. Argumentation: which information to reveal? • Should I tell him that we are running out of antibiotics? • Should I tell her that my leg hurts? • Should I tell him that I will lose a project if I don’t hire today? • Should I tell him I was fired from my last job? • Build a game that combines information revelation and bargaining

  26. Color Trails (CT) • An infrastructure for agent design, implementation and evaluation for open environments • Designed with Barbara Grosz (AAMAS 2004) • Implemented by Harvard team and BIU team

  27. An experimental test-bed • Interesting for people to play: • analogous to task settings • vivid representation of strategy space (not just a list of outcomes) • Possible for computers to play • Can vary in complexity: • repeated vs. one-shot setting • availability of information • communication protocol

  28. Game description • The game is built from phases: • Revelation phase • First proposal phase • Counter-proposal phase • Joint work with Kobi Gal and Noam Peled

  29. Two boards

  30. Why not equilibrium agents? • Results from the social sciences suggest people do not follow equilibrium strategies: • Equilibrium-based agents that played against people failed. • People rarely design agents to follow equilibrium strategies (Sarne et al., AAMAS 2008). • Equilibrium strategies are usually not cooperative: all lose.

  31. Perfect Equilibrium agent • Solved using backward induction; no strategic signaling • Phase two: • Second proposer: finds the most beneficial proposal for which the responder's benefit remains positive. • Second responder: accepts any proposal that gives it a positive benefit.

  32. Perfect Equilibrium agent • Phase one: • First proposer: proposes the opponent’s counter-proposal. • First responder: accepts any proposal that gives it the same or higher benefit than its own counter-proposal. • In both boards, the PE with goal revelation yields an expected utility lower than or equal to that of the non-revelation PE • Revelation: reveals in half of the games
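
The two-phase rules on the Perfect Equilibrium slides can be sketched as a small backward-induction routine. This is only an illustration: the player labels `A` (first proposer) and `B` (counter-proposer), the dictionary representation of a proposal, and the benefit numbers are hypothetical, and goal revelation is left out.

```python
def second_proposal(proposals, proposer, responder):
    """Phase two: best proposal for the proposer that keeps the responder's benefit positive."""
    feasible = [p for p in proposals if p[responder] > 0]
    if not feasible:
        return None
    return max(feasible, key=lambda p: p[proposer])


def second_responder_accepts(offer, responder):
    """Phase two: accept anything with a positive benefit."""
    return offer[responder] > 0


def first_responder_accepts(offer, own_counter, responder):
    """Phase one: accept if the offer is at least as good as one's own counter-proposal."""
    return offer[responder] >= own_counter[responder]


# Hypothetical benefits of each possible exchange for players A and B.
proposals = [
    {"A": 50, "B": 50},
    {"A": 10, "B": 90},
    {"A": -5, "B": 120},
    {"A": 80, "B": 20},
]

# Backward induction: work out what B would counter-propose in phase two ...
counter = second_proposal(proposals, proposer="B", responder="A")
# ... then A, as first proposer, simply proposes that counter-proposal.
first_offer = counter
print(first_offer, first_responder_accepts(first_offer, counter, responder="B"))
```

The example shows why such an agent tends to make lopsided offers: the first proposal already concedes everything the counter-proposer could extract.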

  33. Asymmetric game

  34. Performance • 140 students

  35. Benefits diversity • Average proposed benefit to players from first and second rounds

  36. Revelation effect • The effect of revelation on performance: only 35% of the games played by humans included revelation • Revelation had a significant effect on human performance but not on agent performance • People were deterred by the strategic machine-generated proposals, which heavily depended on the role of the proposer and the responder.

  37. SIGAL agent • Agent based on general opponent modeling: • Genetic algorithm • Logistic regression

  38. SIGAL Agent: Acceptance • Learns from previous games • Predicts the acceptance probability for each proposal using logistic regression • Features (for both players) relating to proposals: • Benefit • Goal revelations • Player types • Benefit difference between rounds 2 and 1
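
To make the acceptance model concrete, here is a minimal sketch of logistic regression trained by batch gradient descent on a toy set of past proposals. The three features (responder benefit, proposer benefit, whether the responder revealed its goal), the data, and the hyperparameters are assumptions for illustration and are not taken from the SIGAL experiments.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def acceptance_probability(weights, bias, x):
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical past games: (responder benefit, proposer benefit, responder revealed goal).
rows = [
    (0.9, 0.2, 1), (0.7, 0.4, 0), (0.6, 0.5, 1), (0.5, 0.5, 0),
    (0.3, 0.8, 0), (0.2, 0.9, 1), (0.1, 0.9, 0), (0.4, 0.7, 1),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = proposal was accepted

weights, bias = train_logistic(rows, labels)
p_generous = acceptance_probability(weights, bias, (0.8, 0.3, 0))
p_stingy = acceptance_probability(weights, bias, (0.2, 0.9, 0))
print(p_generous, p_stingy)
```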

  39. SIGAL Agent: counter proposals • Model the way humans make counter-proposals

  40. SIGAL Agent • Maximizes expected benefit given any state in the game • Round • Player revelation • Behavior in round 1
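
One way to read "maximizes expected benefit" is as an argmax over candidate proposals, weighting each by its predicted acceptance probability. The sketch below assumes a fixed fallback benefit when a proposal is rejected; the candidate list, the `p_accept` values, and the fallback are hypothetical, and a full agent would condition the probabilities on the round, the revelations, and round-1 behavior.

```python
def best_proposal(candidates, accept_prob, benefit_if_rejected):
    """Pick the candidate with the highest expected benefit."""

    def expected_benefit(proposal):
        prob = accept_prob(proposal)
        return prob * proposal["my_benefit"] + (1.0 - prob) * benefit_if_rejected

    return max(candidates, key=expected_benefit)


# Hypothetical candidates with acceptance probabilities from a learned model.
candidates = [
    {"name": "greedy", "my_benefit": 100, "p_accept": 0.20},
    {"name": "balanced", "my_benefit": 60, "p_accept": 0.70},
    {"name": "generous", "my_benefit": 30, "p_accept": 0.95},
]

choice = best_proposal(
    candidates,
    accept_prob=lambda p: p["p_accept"],
    benefit_if_rejected=10,
)
print(choice["name"])
```

Here the greedy proposal is worth about 28 in expectation, the balanced one 45, and the generous one 29, so the balanced proposal is chosen.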

  41. Agent strategies comparison

  42. SIGAL agent: performance

  43. Agent performance comparison • Equilibrium agent vs. SIGAL agent • General opponent* modeling improves agent negotiations

  44. General opponent* modeling in maximization problems

  45. AAT agent • Agent based on general* opponent modeling: • Decision Tree / Naïve Bayes • AAT

  46. Aspiration Adaptation Theory (AAT) • Economic theory of people’s behavior (Selten) • No utility function exists for decisions (!) • Relative decisions used instead • Retreat and urgency used for goal variables • Reference: Avi Rosenfeld and Sarit Kraus. Modeling Agents through Bounded Rationality Theories. Proc. of IJCAI 2009; JAAMAS, 2010.

  47. Commodity search • Prices shown: 1000

  48. Commodity search • Prices shown: 900, 1000

  49. Commodity search • Prices shown: 900, 1000, 950 • Example strategy: if price < 800, buy; otherwise visit 5 stores and buy in the cheapest.
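
The search rule quoted on this slide can be written as a short function. This sketch takes one possible reading of the rule, in which the threshold is checked at every store visited; the function name, the visit order, and the prices beyond the three shown on the slide are assumptions.

```python
def search_and_buy(prices, threshold=800, max_stores=5):
    """Return (price_paid, stores_visited) for prices listed in visit order."""
    seen = []
    for price in prices:
        seen.append(price)
        if price < threshold:
            # Good enough: buy immediately.
            return price, len(seen)
        if len(seen) == max_stores:
            break
    if not seen:
        raise ValueError("no stores to visit")
    # Otherwise buy at the cheapest store visited.
    return min(seen), len(seen)


# The first three prices appear on the slides; the last two are made up.
print(search_and_buy([1000, 900, 950, 870, 920]))
print(search_and_buy([1000, 900, 750, 600]))
```

In the first call no price falls below 800, so the buyer visits five stores and pays the cheapest price seen; in the second, the buyer stops at the third store.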

  50. Results • Behavioral models used in general opponent* modeling are beneficial
