Multiagent Coordination and Cooperation: challenges and techniques

Sarit Kraus, Bar-Ilan, Israel and UMD, USA

No Agent is an Island • Monitoring electricity networks (Jennings) • Distributed design and engineering (Petrie et al.) • Distributed meeting scheduling (Sen & Durfee) • Teams of robotic systems acting in hostile environments (Balch & Arkin, Tambe) • Collaborative Internet-agents (Etzioni & Weld, Weiss) • Collaborative interfaces (Grosz & Ortiz, Andre) • Information agent on the Internet (Klusch) • Cooperative transportation scheduling (Fischer) • Supporting hospital patient scheduling (Decker & Jin)

Design of automated agents to interact effectively • Coordinate: to act upon one another in harmony (necessary) • Cooperate: to work together (beneficial) • Example: driving in Tel-Aviv vs. driving in a convoy.

Teams and Individuals • Teams of agents that need to coordinate joint activities; problems: distributed information, distributed decision solving, local conflicts. • Self-motivated agents acting in the same environment; problems: need for motivation to cooperate, conflict resolution, trust, distributed and hidden information.

Cooperation and Coordination by Others • Other entities coordinate their actions and cooperate in multi-entity environments: humans, animals, computers, particles. • Formal theories: game theory, decision theory, physics, logic. • Non-formal theories: organizational theories, political science theories, “advisory” negotiation.

Using other disciplines’ results • No need to start from scratch! • Modification and adjustment are required; AI gives insights and complementary methods. • Is it worth using formal methods for multiagent systems?

Negotiations in the Pollution Sharing Problem Collaborator: Esti Freitsis (forthcoming book “Strategic Negotiation in Multiagent Environments”, MIT Press)

Environment Description • There are some closely grouped plants in an industrial region. • Each plant can produce several types of products and has a utility function (profit). • There are several types of pollutants. • Each plant has norms restricting the maximal emission of each pollutant that it emits. We refer to the situation in which only these norms have to be met as usual circumstances.

Special circumstances • Sometimes there is a need to reduce pollution for some period because of external factors such as weather (high humidity, wind towards residential area). In this case plants receive new norms. We refer to this situation as special circumstances.

Current solution • Current solution: each plant reduces pollution according to the new norms. • Disadvantage: for one plant it is less costly to reduce one substance, while for another it is less costly to reduce a different substance.

Negotiations • Our solution: plants negotiate to reach beneficial agreements about which substances' emissions must be reduced and by what percentage each plant must reduce them. • The conflict solution: following the new norms. • First, we consider complete information situations.

Strategic Negotiation Model • Model of alternating offers (Rubinstein) that takes negotiation time into consideration, which reduces negotiation time. • During the strategic negotiation, agents communicate their respective desires in order to reach a mutually beneficial agreement. • The model provides a unified approach to many problems.

Structure of the Negotiation • There are N self-motivated agents, randomly designated 1, 2, ... • All the agents negotiate to reach an agreement. • The negotiation process may include several equidistant iterations 0, 1, 2, … ∈ Time and can continue forever. In each time period t, agent j(t) = t mod N makes an offer.

Structure of the Negotiation - cont. • The other agents respond simultaneously: YES, NO, or OPT (opt out). • If the offer was accepted by all the agents: the last offer is implemented. • If at least one agent opts out: a conflict occurs. • Otherwise (the offer was rejected by at least one agent), the negotiation proceeds to period t+1.
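
The protocol on the last two slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: agents are indexed 0..N-1 so that j(t) = t mod N is a valid index, the functions `propose` and `respond` are hypothetical callbacks standing in for the agents' strategies, and `max_periods` is an artificial cap added so the loop terminates (the model itself allows negotiation to continue forever).

```python
YES, NO, OPT = "yes", "no", "opt"

def negotiate(n_agents, propose, respond, max_periods):
    """Run the offer/response loop; return (outcome, period).

    propose(j, t)      -> offer made by agent j in period t   (hypothetical callback)
    respond(i, o, t)   -> YES, NO or OPT from agent i to offer o (hypothetical callback)
    """
    for t in range(max_periods):
        proposer = t % n_agents                      # j(t) = t mod N
        offer = propose(proposer, t)
        # All agents other than the proposer answer the same offer.
        responses = [respond(i, offer, t)
                     for i in range(n_agents) if i != proposer]
        if OPT in responses:                         # someone opted out
            return ("conflict", t)
        if all(r == YES for r in responses):         # unanimous acceptance
            return (offer, t)
        # otherwise at least one rejection: continue with period t + 1
    return ("no agreement", max_periods)

# Toy run: the offer in period t is the number t; agents accept offers >= 2.
result = negotiate(3,
                   propose=lambda j, t: t,
                   respond=lambda i, offer, t: YES if offer >= 2 else NO,
                   max_periods=10)
```

In the toy run the first two offers are rejected and the third is accepted, so `result` is `(2, 2)`.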

Negotiation Protocols • Simultaneous responses: an agent responding to an offer is not informed of the other responses. • Sequential responses: an agent responding to an offer is informed of the responses of the preceding agents (assuming that the agents are ordered).

Equilibrium • Nash equilibrium: a strategy profile p is a Nash equilibrium if no player i has a different strategy yielding an outcome that it prefers to the one generated when it chooses p_i, given the strategies of the other players. • Subgame perfect equilibrium: a strategy profile is a subgame perfect equilibrium if the strategy profile it induces in every subgame is a Nash equilibrium of that subgame.
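
The Nash condition can be checked mechanically for a small finite game. The sketch below is an illustration only: the payoff table is the textbook prisoner's dilemma, which is not part of these slides, and `is_nash` is a hypothetical helper name.

```python
from itertools import product

def is_nash(profile, actions, payoff):
    """True if no single player gains by unilaterally changing its strategy.

    profile : tuple with one action per player
    actions : list of action lists, one per player
    payoff  : function mapping a profile to a tuple of utilities
    """
    for i, own_actions in enumerate(actions):
        current = payoff(profile)[i]
        for alt in own_actions:
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            if payoff(deviated)[i] > current:   # profitable deviation found
                return False
    return True

# Illustrative game (assumption): prisoner's dilemma, C = cooperate, D = defect.
table = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
         ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
actions = [["C", "D"], ["C", "D"]]
equilibria = [p for p in product(*actions)
              if is_nash(p, actions, table.__getitem__)]
```

Here `equilibria` contains only `("D", "D")`: from any other profile, at least one player can improve its own payoff by switching to D.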

Negotiation strategies for simultaneous responses • For each possible agreement x that is better for all the plants than the conflict solution, there is a subgame-perfect equilibrium of the bargaining game in which x is offered and unanimously accepted in period 0.

Choosing the Allocation • The owners of the plants can agree in advance on a joint technique for choosing x: • giving each plant at least its conflict utility; • maximizing a social welfare criterion: • the sum of the plants’ utilities, • or the generalized Nash product of the plants’ utilities: Π_s (U_s(x) − U_s(conflict)).
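
As a small illustration of the two welfare criteria, the sketch below picks an agreement from a finite candidate set. The candidate names, the utility numbers, and the function names are all hypothetical; the slides do not specify how candidates are enumerated.

```python
from math import prod

def individually_rational(utils, conflict):
    """Every agent gets at least its conflict utility."""
    return all(u >= c for u, c in zip(utils, conflict))

def nash_product(utils, conflict):
    """Product over agents of (U_s(x) - U_s(conflict))."""
    return prod(u - c for u, c in zip(utils, conflict))

def choose_allocation(candidates, conflict, criterion="nash"):
    """Return the name of the best individually rational candidate, or None."""
    feasible = {name: u for name, u in candidates.items()
                if individually_rational(u, conflict)}
    if not feasible:
        return None
    if criterion == "sum":
        return max(feasible, key=lambda n: sum(feasible[n]))
    return max(feasible, key=lambda n: nash_product(feasible[n], conflict))

# Hypothetical two-plant example.
conflict = (2, 2)
candidates = {"a": (10, 3), "b": (6, 6), "c": (12, 1)}
best_by_sum = choose_allocation(candidates, conflict, "sum")
best_by_nash = choose_allocation(candidates, conflict, "nash")
```

Candidate "c" is excluded because the second plant would fall below its conflict utility. The sum criterion then selects "a" (13 versus 12), while the Nash product selects the more balanced "b" (16 versus 8).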

Negotiation strategies for sequential responses • Assumption: there is a time period T at which negotiation cannot continue any longer. At T the conflict allocation is implemented. • Perfect equilibrium by backward induction: • At T-1, if negotiations have not ended, agent A_{T-1} suggests the agreement that is best for itself among those that are better for all agents than the conflict solution (denoted by O_{T-1}); the other agents accept. • At T-2, agent A_{T-2} suggests the agreement that is best for itself among those that are better for all agents than the conflict solution and O_{T-1} (denoted by O_{T-2}); the other agents accept. • By induction, in the first time period A_0 suggests O_0 and the others accept.
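
The backward-induction argument can be sketched over a finite set of candidate agreements. This is a simplified illustration, not the procedure used in the study: it assumes agents are indexed from 0, that utilities do not depend on time, and that an agent accepts an offer it weakly prefers to what it would get by waiting (the slide says "better"; weak preference is used here so that the same agreement can be proposed earlier). All names and numbers are hypothetical.

```python
def backward_induction(agreements, conflict, n_agents, deadline):
    """Return the agreement offered in period 0, or None if none beats conflict.

    agreements : dict name -> tuple of utilities, one per agent
    conflict   : tuple of conflict utilities
    deadline   : period T at which the conflict allocation is implemented
    """
    threshold = conflict      # what each agent gets if the current offer fails
    chosen = None
    for t in range(deadline - 1, -1, -1):        # T-1, T-2, ..., 0
        proposer = t % n_agents
        acceptable = [name for name, u in agreements.items()
                      if all(ui >= ti for ui, ti in zip(u, threshold))]
        if acceptable:
            # The proposer picks the acceptable agreement it likes best.
            chosen = max(acceptable, key=lambda n: agreements[n][proposer])
            threshold = agreements[chosen]
    return chosen

# Hypothetical two-agent example.
agreements = {"x": (5, 3), "y": (3, 5), "z": (4, 4)}
conflict = (1, 1)
offer_T2 = backward_induction(agreements, conflict, n_agents=2, deadline=2)
offer_T3 = backward_induction(agreements, conflict, n_agents=2, deadline=3)
```

With T = 2 the last proposer is agent 1, so the period-0 offer is "y"; with T = 3 the last proposer is agent 0 and the offer is "x". In this simplified setting, whoever proposes at T-1 determines the outcome.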

Assumptions about the environment • Profit is a linear function of the number of items of each product produced by the plant • Pollution is a linear function of the number of items of each product produced.

Techniques that were checked • Sequential response: backtracking • Simultaneous response: • Maximization of the sum with guarantees of default profit (MaxSum) • Maximization of the sum and Nash product with side payments (MaxSumNash) • Simplex method for linear optimization • Maximization of the Nash product: • Praxis method for multi-variable nonlinear function minimization • Hill climbing

Simulation Parameters • Number of plants is varied from 5 to 20. • Number of pollution types is varied from 5 to 20. For each product pollution of some type is produced with probability 1/2. • Each plant produces Max_prod different types of products. Max_prod is varied from 5 to 20. Pollution and profit per item of product and pollution constraints are set randomly. • Results: Average of 25 simulation runs.

Plants’ utility as a function of the number of plants

Plants’ utility as a function of the number of products

Plants’ utility as a function of the number of pollutants

Conclusions (Complete Information) • Simultaneous response: • If side payments are permitted the MaxSumNash method is the best. • If side payments are not permitted either BackTracking or MaxSum should be used. • Sequential response: BackTracking should be used. • Techniques: game theory, heuristic search, optimization methods

Incomplete Information • In real-world situations the plants do not have complete information about each other’s utility functions. • Solution: using economic theories for distributed mechanisms for reallocation of resources in “markets” with many agents and many divisible resources (Wellman 93).

General Equilibrium theory • The general-equilibrium theory studies how the market prices are determined by the actions of the individuals. • General equilibrium is obtained when a set of prices is found such that supply meets demand for each good and where the agents optimize their use of the goods at the current price levels.

General Equilibrium theory (Cont) • Assumption: each agent behaves competitively: it takes prices as given, independently of its actions. • Used for distributed mechanisms for resource allocation in environments with many agents and many divisible resources (Wellman).

Tatonnement • It is a price-adjustment process (Walras 1954). • The tatonnement process starts with some arbitrary price vector p_0. • The agents determine their demand at those prices and report the quantities demanded to an “auctioneer”. • The auctioneer repeatedly adjusts the prices: p_{t+1} = p_t + (quantity_demanded − quantity_available)
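
A minimal sketch of the price-adjustment loop follows. It adds two things that are not on the slide and should be read as assumptions: a `step` factor that scales the adjustment (step = 1.0 reproduces the update rule as written) and a clamp that keeps prices non-negative. The linear demand functions in the example are hypothetical.

```python
def tatonnement(demand, supply, p0, step=1.0, tol=1e-9, max_iter=1000):
    """Iterate p_{t+1} = p_t + step * (demand - supply) for each good.

    demand : function mapping a price list to a list of quantities demanded
    supply : list of available quantities
    Returns (prices, converged).
    """
    prices = list(p0)
    for _ in range(max_iter):
        excess = [d - s for d, s in zip(demand(prices), supply)]
        if max(abs(e) for e in excess) < tol:     # supply meets demand
            return prices, True
        prices = [max(0.0, p + step * e)          # assumption: prices stay >= 0
                  for p, e in zip(prices, excess)]
    return prices, False

# Hypothetical market with two goods and independent linear demands.
demand = lambda p: [10 - p[0], 8 - 2 * p[1]]
supply = [4, 2]
prices, converged = tatonnement(demand, supply, p0=[1.0, 1.0], step=0.5)
```

In this toy market the process converges to prices close to 6 and 3. It behaves well only because the example demands are continuous and decreasing; the sketch makes no claim about convergence in general.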

Tatonnement (Cont) • If the sequence p_0, p_1, ... converges, then the result is a competitive equilibrium. • However, the tatonnement process does not converge to equilibrium in general. • Gross substitutability: if the price of one good rises, the demand for other goods does not decrease. • In the pollution allocation environment this condition does not hold.

Tatonnement (Cont) • Moreover, in our case the utility functions are the result of constrained optimization, and therefore the aggregate demand function is not continuous. • Thus, a general equilibrium does not always exist!

Market Mechanisms • We propose three algorithms for finding suboptimal solutions of the pollution allocation problem. • Tatonnement-based mechanism: Competitive Equilibrium Market (CEM): the allocation of the pollutants is performed only after the process has terminated; very similar to the WALRAS algorithm [Wellman].

Greedy market mechanisms • Market-Clearing with Intermediate Transactions (MCIT) • Market-Clearing Intermediate Exchange (MCIE) • A redistribution of the pollutants is done in each cycle of the mechanism. In MCIT a monetary transaction is performed after each cycle, and in MCIE an exchange of two pollutants is done after each cycle.

The Three Market Mechanisms • In all the mechanisms, at the beginning of the process the plants are allowed to emit their default allocation. • In each cycle of the three mechanisms the auctioneer chooses one (or two in MCIE) of the pollutants randomly, and tries to determine its clearing price - the price at which demand is equal to supply, while keeping the prices of the other pollutants fixed. It uses binary search to find the clearing price.
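
The binary search for a single pollutant's clearing price might look like the sketch below. It assumes that excess demand (demand minus supply) for the chosen pollutant does not increase with its own price and changes sign somewhere in [lo, hi]; the function name, the bounds, and the example excess-demand function are hypothetical.

```python
def clearing_price(excess_demand, lo, hi, tol=1e-6):
    """Bisect on price until the bracket [lo, hi] is narrower than tol.

    excess_demand(p) is demand minus supply at price p, with the prices
    of all other pollutants held fixed (assumed non-increasing in p).
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess_demand(mid) > 0:
            lo = mid      # demand exceeds supply: the price must rise
        else:
            hi = mid      # supply covers demand: the price can fall
    return (lo + hi) / 2

# Hypothetical excess demand that crosses zero at a price of 4.
price = clearing_price(lambda p: 12 - 3 * p, lo=0.0, hi=10.0)
```

If the aggregate demand jumps rather than crossing zero continuously, the same search still narrows in on the price where excess demand changes sign, which is then only an approximate clearing price.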

Market Mechanisms (Cont) • The process is terminated when the prices do not change for a predefined number of iterations, or when it reaches the predefined maximal number of iterations. • The differences from the Tatonnement: • the procedure used to find the clearing prices • the division of the pollutants given the clearing prices • the maximization problem is solved by the plants when computing their demands.

The Influence of the Number of Plants on Plants’ Utilities

The Influence of the Number of Products per Plant on the Plants’ Utilities

The Influence of the Number of Pollutants on the Plants’ Utilities

Conclusions (Incomplete Information) • If side payments are permitted and the number of pollutants is small, then the MCIT method is the best. • If side payments are not permitted or the number of pollutants is large, then the MCIE method is the best. • Techniques: economics, heuristic search, optimization methods, binary search. • Problem: will each plant behave competitively?

Motivating Example • b: upgrade software on a network of workstations as part of a sys-admin group, tomorrow from 6-8 p.m. • g: go to the theatre with friends, tomorrow from 7-9 p.m. • The two activities overlap, so the agent must reconcile intentions: • its intention to do the group task b • a potential intention to do g

Problem Description • Self-interested agents • committed to a collaborative activity • receive outside offers • They need to reconcile intentions, deciding between: • defaulting on their group-related commitment • rejecting the outside offer • Agents assess outcomes using utility functions. • How can agents be encouraged to consider the group’s good? • What utility functions should agents use?

SPIRE Simulation System (SharedPlans Intention Reconciliation Experiments) • Study the impact of: • group norms and policies • agent utility functions • environmental factors • Goal: provide insights that agent developers can use to develop collaboration-capable agents (Grosz, Sullivan, Das, Kraus)

Decision-theory Based Frameworks • Multi-attribute decision making (MADM); application: • Intention reconciliation in SharedPlans • Benefits: using results of MADM, e.g., the choice of specific method is not so important; standardization techniques. • Problems: choosing attributes, assigning values, choosing weights.
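
One common multi-attribute method is a weighted sum over standardized attribute values. The sketch below shows that method only as an illustration; the slides do not say which method SPIRE uses, and the attribute names, values, and weights are hypothetical.

```python
def normalize(column):
    """Min-max standardization of one attribute to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]

def weighted_scores(options, weights):
    """Score each option as the weighted sum of its standardized attributes.

    options : dict option name -> dict attribute -> raw value
    weights : dict attribute -> weight
    """
    names = list(options)
    norm = {a: normalize([options[n][a] for n in names]) for a in weights}
    return {n: sum(weights[a] * norm[a][i] for a in weights)
            for i, n in enumerate(names)}

# Hypothetical reconciliation decision: keep the group task or default on it.
options = {
    "keep_commitment": {"income": 10, "reputation": 5},
    "default":         {"income": 30, "reputation": 1},
}
weights = {"income": 0.4, "reputation": 0.6}
scores = weighted_scores(options, weights)
best = max(scores, key=scores.get)
```

With these made-up numbers the agent keeps its commitment, because reputation carries more weight than income. The three problems listed on the slide map directly onto the inputs: which attributes to include, what raw values to assign, and how to set the weights.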

Game-theory Based Frameworks (Non-cooperative Models) • Strategic-negotiation model based on: the alternating offers model of Rubinstein. Applications (forthcoming book, Kraus, 2001, MIT Press): • Pollution allocation • Data allocation (Schwartz & Kraus AAAI97) • Resource allocation, task distribution • Hostage crisis (Kraus & Wilkenfeld).

Advantages and Difficulties: Negotiation on Data Allocation • Beneficial results; proved to be better than current methods; simple strategies. • Problems: • Need to develop utility functions; • Finding possible actions: identifying optimal allocations is NP-complete; • Incomplete information: game theory provides limited solutions.

Game-theory Based Frameworks (Non-cooperative Models) • Auctions; applications: • Data allocation (Schwartz & Kraus ATAL97, ICMAS00), • Electronic commerce. • Subcontracting based on: principal-agent models. Applications: • Task allocation (Kraus, AIJ96).

Advantages and Difficulties: Auctions for Data Allocation • Beneficial results; proved to be better than current methods. • Problems: • Utility functions, • Difficult to find bidding strategies when there is incomplete information and the evaluations are dependent on each other: no procedures; need to combine with learning.

Game-theory Based Frameworks (Cooperative Models) • Coalition theories; applications: • Group and team formation (Shehory & Kraus CI99). • Benefits: well-defined concepts of stability; mechanisms to divide benefits. • Difficulties: utility functions; no procedures for coalition formation; exponential problems. • DPS model: combinatorial theories & operations research (Shehory & Kraus AIJ98).

Logical Models • Building agents on top of any software packages. • Logic is a basis for an agent programming language (Subrahmanian et al., Heterogeneous Agent Systems: Theory and Implementation, MIT Press, 2000). • [Figure: agent architecture showing the service, message, decision, and authorization layers and the wrapped code]
