Çetin Meriçli Artificial Intelligence Laboratory Department of Computer Engineering

Developing a Novel Robust Multi-agent Task Allocation Algorithm for Four-legged Robotic Soccer Domain Çetin Meriçli Artificial Intelligence Laboratory Department of Computer Engineering Bogazici University 22.04.2005

Outline • Multi Robot Systems • Robot Soccer Domain • Problem definition • What was the aim • What we have done so far • Three-layered approach

Multi Robot Systems • Advantages • Robustness (due to redundancy) • Efficiency (robots working in parallel speed up the operation) • Synergy: the whole is greater than the sum of its parts • No single point of failure (if the system is decentralized) • Disadvantages • High communication requirements in general • If greedy assignment methods are used, the system may converge to a suboptimal solution

Robot Soccer Domain • Robot soccer is a well-suited test bed for multi-robot systems because • The environment is highly dynamic • The environment is highly complex • Real-time response is required • Challenges in robot soccer • The robots' sensors are limited • Efficient legged locomotion is hard to implement • Information must be processed in real time (>15 fps) • The environment is partially observable

What was the aim • Our aim is to develop a task allocation algorithm for the legged robot soccer domain • Since the environment is partially observable and observations are noisy, the algorithm should be robust to noisy inputs and to agent failures (either a software crash or a penalization) • To assign tasks, we first have to define them, so we need to decompose the overall goal, which is to win the game • Roles are defined to specify what a player does in certain circumstances • The aim is to propose an algorithm that assigns roles to the appropriate players and then assigns tasks to the appropriate roles in the appropriate order • We need a method for evaluating the game and the players in order to assign tasks and roles • All we can measure about the game is • The positions of the players and the ball at a time step • The time left • The current score • We should build objective functions that quantify how well each player fits each role and each role fits each task

What we have done • We defined a set of metrics • A total of 200 games were played against four opponents • Each game is divided into episodes, each starting with a kickoff and ending with either a goal or the end of a half or of the game • 81 negative and 1016 positive episodes were recorded • An episode that ends with a goal for our team is a positive example, one that ends with a goal against our team is a negative example, and one that ends without a goal is ignored

What we have done (cont’d) • The following metrics are then calculated for each episode: • Convex hull metrics • Area of the convex hull of own/opp players • Center of the convex hull of own/opp players • Density of the convex hull of own/opp players • Level of separation metrics • Separation of ball • Separation of own goal • Separation of opp goal • Team-level entropy metrics • Entropy in the vicinity of ball • Entropy in the vicinity of own goal • Entropy in the vicinity of opp goal

Metrics involving Convex Hulls of teams • The convex hull of a team is the smallest convex polygon that contains all the players of the team • We propose three metrics based on the convex hulls of both the own and the opponent team • Density of the convex hull • Area of the convex hull • Coordinates and motion vector of the center of the convex hull • If the ball is in our half of the field, the goalkeeper is included in the calculation; otherwise it is excluded

Metrics involving Convex Hulls of teams (cont’d)

Density of Convex Hull • Expressed as the sum of the distances to the ball of the team members on the edges of the convex hull, provided the ball falls within the convex hull • The smaller the value, the greater the density. The greater the density, the higher the chance of gaining control of the ball and hence the opportunity to score a goal

Area of Convex Hull • Measures the chance of gaining control of the ball • The greater the area, the higher the chance that the ball falls within the convex hull, and hence the higher the chance of gaining control of the ball • In contrast to density: density decreases as the area increases
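The three convex hull metrics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: player and ball positions are taken to be 2D (x, y) tuples, the "team members on the edges" are taken to be the hull vertices, and the center is taken as the mean of the hull vertices. The function names are hypothetical and not from the original implementation.

```python
from math import hypot

def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hull_area(hull):
    """Polygon area by the shoelace formula (0 for fewer than 3 vertices)."""
    n = len(hull)
    if n < 3:
        return 0.0
    s = sum(hull[i][0] * hull[(i + 1) % n][1] - hull[(i + 1) % n][0] * hull[i][1]
            for i in range(n))
    return abs(s) / 2.0

def hull_center(hull):
    """Mean of the hull vertices (an assumed definition of the center)."""
    n = len(hull)
    return (sum(p[0] for p in hull) / n, sum(p[1] for p in hull) / n)

def hull_contains(hull, p):
    """True if p lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    if n < 3:
        return False
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

def hull_density(hull, ball):
    """Sum of vertex-to-ball distances, or None if the ball is outside the hull."""
    if not hull_contains(hull, ball):
        return None
    return sum(hypot(p[0] - ball[0], p[1] - ball[1]) for p in hull)
```

Note that a smaller return value of `hull_density` corresponds to a denser hull, as stated on the density slide.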

Pairwise Separation • Aims to measure the separability of the ball and the own goal from the opponent team • Defined as • Object can be Ball or Goal

Pairwise Separation • Figure: the yellow line separates the players marked with green arrows from the ball

Entropies • Unlike information entropy, we define the team-level entropy in the vicinity of an object of interest as E = (N_own - N_opp) / (N_own + N_opp) • In words, the difference between the number of own players and the number of opponent players, divided by the total number of players in the vicinity • The result is a number in the interval [-1, 1], where • -1 denotes that all players in the examined area belong to the opponent team • 1 denotes the reverse • 0 denotes that the numbers of own and opponent players are equal (entropy is maximum) • The definition of the vicinity is important
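A minimal sketch of this team-level entropy follows. The slide does not say how the vicinity is shaped, so a circle of a given radius around the object is assumed here, and returning 0 for an empty vicinity is also an assumption. The function name and parameters are hypothetical.

```python
from math import hypot

def team_entropy(own_players, opp_players, target, radius):
    """(own - opp) / (own + opp), counting only players within `radius` of `target`.

    Assumptions: the vicinity is a circle, and an empty vicinity yields 0.0.
    """
    def count_near(players):
        return sum(1 for p in players
                   if hypot(p[0] - target[0], p[1] - target[1]) <= radius)

    own_near = count_near(own_players)
    opp_near = count_near(opp_players)
    total = own_near + opp_near
    if total == 0:
        return 0.0
    return (own_near - opp_near) / total
```

Calling it with the ball, the own goal, or the opponent goal as `target` gives the three vicinity variants.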

Entropies (cont’d) • Entropy in the vicinity of the ball: • Measures a team's chance of gaining control of the ball • A positive value means that our team has the higher chance • Zero means that both teams have an equal chance

Entropies (cont’d) • Entropy in the vicinity of the own goal: • Measures the chance of defending the goal against attacking opponent players • A positive value means that our team has the higher chance of defending our goal • Zero means that both teams have an equal chance

Entropies (cont’d) • Entropy in the vicinity of the opponent goal: • Measures the chance of scoring against defending opponent players • A positive value means that our team has the higher chance of scoring • Zero means that both teams have an equal chance

Problems in interpretation of metrics • Since soccer is a very dynamic domain, the environment changes rapidly and causes irrelevant ripples in the calculated metrics • To suppress this effect, each metric is smoothed with the 4253H, twice smoother

4253H, twice smoothing • The 4253H, twice smoother consists of running medians of span 4, then 2, then 5, then 3, followed by Hanning • Hanning is a running weighted average with weights ¼, ½, ¼ • The word twice means that the residuals (original minus smoothed values) are smoothed with the same procedure and added back to the first smooth
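The smoother can be sketched as follows. This is a simplified illustration, not the implementation used for the slides: the end points are handled by repeating the first and last values, whereas published versions of 4253H use special end-point rules. "Twice" is implemented here in the conventional re-roughing sense (smooth the residuals and add them back); running the whole pass two times in a row instead would be a one-line change. All function names are hypothetical.

```python
from statistics import median

def _pad(x, k):
    """Repeat the first and last value k times (simplified end-point rule)."""
    return [x[0]] * k + list(x) + [x[-1]] * k

def _median_odd(x, span):
    """Centered running median for an odd span."""
    xp = _pad(x, span // 2)
    return [median(xp[i:i + span]) for i in range(len(x))]

def _median_4_then_2(x):
    """Running median of 4 (lands between samples), re-centered by a median of 2."""
    xp = _pad(x, 2)
    m4 = [median(xp[j:j + 4]) for j in range(len(x) + 1)]
    # The median of two values is their mean; this re-aligns with the samples.
    return [(m4[i] + m4[i + 1]) / 2 for i in range(len(x))]

def _hanning(x):
    """Running weighted average with weights 1/4, 1/2, 1/4."""
    xp = _pad(x, 1)
    return [0.25 * xp[i] + 0.5 * xp[i + 1] + 0.25 * xp[i + 2] for i in range(len(x))]

def smooth_4253h(x):
    """One pass: medians of span 4, 2, 5, 3, then Hanning."""
    y = _median_4_then_2(x)
    y = _median_odd(y, 5)
    y = _median_odd(y, 3)
    return _hanning(y)

def smooth_4253h_twice(x):
    """Smooth, then smooth the residuals and add them back to the first smooth."""
    first = smooth_4253h(x)
    rough = [a - b for a, b in zip(x, first)]
    correction = smooth_4253h(rough)
    return [a + b for a, b in zip(first, correction)]
```

The input must be a non-empty sequence of numbers; an isolated one-sample spike is removed entirely by the median stages.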

Example smoothing • Figure: original and smoothed metric values

Problems • The original idea was to test the consistency of a metric under similar conditions and to label a metric as informative if it shows the same trends in the same game situations • We can say a metric is consistent and informative if its trends in the same situations are positively correlated • Main problem: how do we detect the portion of interest in an episode, so that correlation tests can be run on metric values in similar situations? • There is no clear answer, because high-level abstract strategy is spread over time and cannot be measured by probing a single time-step • Sometimes the metrics are fooled by improper actions (such as scoring an own goal)

Three layered architecture • Three layered decomposition of team goals • Game level decomposition • Play level decomposition • Low level decomposition

Game Level Decomposition • Deals with the strategy for the whole game, such as aggressive attack, ruthless defense or keep the score • At this level, long-term metrics such as ball possession percentage, game score and time left are measured, and an appropriate team-level game strategy is selected

Play Level Decomposition • A play is a set of behavioral patterns consisting of the primitive actions goto, dribble, pass and shoot (the difference between pass and shoot is the speed of the ball) • This level deals with performing the proper behaviors or plays in order to achieve the overall game strategy

Low Level Decomposition • Also called the Reflexive Level • Deals with one-timestep measurements such as reachability, costs for roles, etc. • Calculates one-timestep metrics and hence forms the objective functions for role assignment • The metrics are the distances and orientations of the players relative to the ball and the goals, and the distance of the ball to the goals

Defined Roles • Goalie • Passive Defender • Supportive Defender • Active Defender • Primary Attacker • Secondary Attacker • Supportive Attacker

Defined Roles (cont’d) • Passive Defender: this is the former primary defender. The player that is both nearest to the line between the ball and the own goal and near the own goal is assigned to this role • Active Defender: the aim of this role is to challenge the opponent player possessing the ball and try to gain control of the ball. The player that is both nearest to the line between the ball and the own goal and nearest to the ball is assigned to this role

Defined Roles (cont’d) • Supportive Defender: the aim of this role is to estimate which opponent player the ball-owning opponent is likely to pass to, and to try to intercept that pass • If the remaining player is in the own half of the field, it is directly assigned to this role. If the remaining player is in the opponent half and has a high chance of scoring, it is left in a potential attacking position for a possible counter-attack (to evaluate the chance, the player is assumed to have the ball and to face the opponent goal; then the clearance and reachability for the ball and the opponent goal are evaluated)

Defined Roles (cont’d) • Primary Attacker: this is the player with the ball. It tries to dribble the ball to a position suitable for shooting, shoots the ball, or passes the ball to a teammate at a better location • Secondary Attacker: tries to take a position to which the ball would move after an unsuccessful shot or pass attempt • Supportive Attacker: tries to take a position that increases the chance of keeping control of the ball if the primary attacker loses it, either in an encounter with the opponent team or through an unsuccessful action (dribble/pass/shoot)

Phases of the Game • Offensive phase • Gaining ball possession • How can we measure whether we have possession of the ball? • If none of the teammates broadcasts an "I have the ball" message for a certain number of time-steps, the team is considered to have lost the ball • Building up the play • This is the phase in which we try to achieve the desired metric values • Some sort of deliberative approach may be suitable, since building up the play depends on a pattern of behaviors spread over time • We probably need to act against greedy metrics, such as the distance of the ball to the opp. goal, for some time in order to achieve the desired values in higher-level metrics • Final touch / shoot
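The possession rule based on the "I have the ball" broadcast can be sketched as a small tracker. The class name, the method names and the timeout value are hypothetical; the slide only says "a certain number of time-steps".

```python
class PossessionTracker:
    """Tracks team ball possession from teammates' "I have the ball" broadcasts."""

    def __init__(self, timeout_steps):
        # Number of silent time-steps tolerated before possession is considered lost
        # (the actual value is not given in the slides).
        self.timeout_steps = timeout_steps
        self.last_claim_step = None

    def on_have_ball_message(self, step):
        """Record that some teammate claimed the ball at this time-step."""
        self.last_claim_step = step

    def team_has_ball(self, step):
        """True while the most recent claim is no older than the timeout."""
        if self.last_claim_step is None:
            return False
        return step - self.last_claim_step <= self.timeout_steps
```

For example, with `timeout_steps=10` and a claim received at step 5, the team is considered to hold the ball up to and including step 15.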

Role Assignment • Contrary to both our past implementations and others in the literature, roles will not be assigned in an assigned/not-assigned fashion as in classical set theory; rather, a degree of assignment will be used, as in fuzzy sets • For example, a player located in the middle of its own half of the field may be assigned to the supportive defender role with degree 0.4 and to the midfielder role with degree 0.6 • The final action of the player in that time-step will be determined by some sort of aggregation of the outputs of the assigned roles • If one or more of the resulting actions are special actions (kicks) rather than movements, the role with the greatest membership value is performed, provided that value is greater than 0.5
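One possible reading of this aggregation rule is sketched below. The slides leave the aggregation open ("some sort of aggregation"), so the membership-weighted average of movement vectors is an assumption, as are the action encoding and the fallback used when a kick is proposed but no role has a membership above 0.5.

```python
def aggregate_roles(role_outputs):
    """Combine the outputs of fuzzily assigned roles into one action.

    role_outputs: list of (membership, action) pairs, where action is either
    ("move", (dx, dy)) or ("kick", kick_name). This encoding is hypothetical.
    """
    any_kick = any(action[0] == "kick" for _, action in role_outputs)
    if any_kick:
        best_membership, best_action = max(role_outputs, key=lambda r: r[0])
        if best_membership > 0.5:
            # A special action is involved: perform the dominant role as-is.
            return best_action
        # Not specified in the slides: fall through and blend the movements.

    moves = [(m, a[1]) for m, a in role_outputs if a[0] == "move"]
    total = sum(m for m, _ in moves)
    if total == 0:
        return ("move", (0.0, 0.0))
    dx = sum(m * v[0] for m, v in moves) / total
    dy = sum(m * v[1] for m, v in moves) / total
    return ("move", (dx, dy))
```

With memberships 0.4 and 0.6 and two pure movement outputs, the result is the 0.4/0.6 blend of the two movement vectors; if the 0.6 role proposed a kick instead, that kick would be performed.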

Behavioral Level • Deals with behaviors consisting of primitive actions and player roles • The metrics used are • isReachable( Position , Position ) • hasBall( playerRole ) • These metrics are used as predicates when formalizing the behavioral pattern • Caution! The term behavior here is more abstract than the usual behaviors such as search ball or performKick. Rather, this meaning of behavior covers a set of low-level behaviors (for example, dribbling the ball to a certain region employs the search ball, track ball and moveWithBall low-level behaviors)

Tactical Regions on the Field

Game Level Decomposition • At this level, the overall game strategy is considered, such as 3-5-2, aggressive attack, ruthless defense, keep the score, etc. • Metrics for this level are computed from the values of lower-level metrics over time

Proposed Metrics for Game Level Decomposition • If the game strategy is aggressive attack, then the number of consecutive time-steps that the ball is in the opponent half of the field should be maximized • If the game strategy is ruthless defense (Çanakkale Geçilmez, "Çanakkale is impassable"), then the number of consecutive time-steps that the ball is in the own half of the field should be minimized • If the game strategy is keep the score, then the number of consecutive time-steps that the ball is in the own half should be less than the number of consecutive time-steps that the ball is in the opponent half
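The quantity behind all three rules is the length of a run of consecutive time-steps with the ball in one half. A minimal sketch follows, assuming the ball's x coordinate is logged per time-step and that the opponent half is where x is greater than the midline value; both the coordinate convention and the function names are assumptions.

```python
def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best

def half_runs(ball_xs, midline_x=0.0):
    """Longest consecutive stays of the ball in the opponent half and in the own half.

    Assumes x > midline_x is the opponent half and x < midline_x is the own half.
    """
    in_opp = [x > midline_x for x in ball_xs]
    in_own = [x < midline_x for x in ball_xs]
    return longest_run(in_opp), longest_run(in_own)
```

A "keep the score" check would then compare the two returned values, while the attack and defense strategies would each track one of them.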

A metric for measuring the field possession over time • The field is divided into a grid of cells • At each time-step in which a cell is occupied by a player, the value of the cell is modified (+2 for an own player, -2 for an opponent player) • If a cell is empty, its value is moved toward 0 by 1 • A cell may have a value in the interval [-k, k] • The average of the values of all cells is again in the interval [-k, k] and gives us a measure of field possession • Partial possession for special regions (goal areas, for example) can also be calculated
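The grid metric can be sketched as below. Several details are assumptions not given in the slides: field coordinates start at (0, 0) in one corner, cells are square, players outside the grid are ignored, and a cell holding both an own and an opponent player receives the sum of their contributions. The class and method names are hypothetical.

```python
class PossessionGrid:
    """Grid of cell values in [-k, k]; positive values mean own-team presence."""

    def __init__(self, cols, rows, cell_size, k):
        self.cols, self.rows = cols, rows
        self.cell_size = cell_size
        self.k = k
        self.values = [[0] * cols for _ in range(rows)]

    def _cell(self, pos):
        """Map an (x, y) position to (row, col), or None if outside the grid."""
        col = int(pos[0] // self.cell_size)
        row = int(pos[1] // self.cell_size)
        if 0 <= col < self.cols and 0 <= row < self.rows:
            return row, col
        return None

    def update(self, own_players, opp_players):
        """Apply one time-step: +2 / -2 for occupied cells, decay by 1 for empty ones."""
        delta = {}
        for players, step in ((own_players, 2), (opp_players, -2)):
            for pos in players:
                cell = self._cell(pos)
                if cell is not None:
                    delta[cell] = delta.get(cell, 0) + step
        for r in range(self.rows):
            for c in range(self.cols):
                v = self.values[r][c]
                if (r, c) in delta:
                    v += delta[(r, c)]
                elif v > 0:
                    v -= 1
                elif v < 0:
                    v += 1
                # Keep every cell inside [-k, k].
                self.values[r][c] = max(-self.k, min(self.k, v))

    def possession(self):
        """Average of all cell values, a number in [-k, k]."""
        total = sum(sum(row) for row in self.values)
        return total / (self.rows * self.cols)
```

Partial possession for a special region could be obtained the same way by averaging only the cells that belong to that region.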

Remaining Work • Generating metric data for the new approach • Writing a team that uses the new algorithm • Verification of the new metrics • Testing on TeamBots against other teams • Testing on Webots (if possible) with AIBOs • Limited tests on real AIBOs

Questions?
