Explore methods to quantitatively measure diversity in robot groups for effective analysis and comparison, focusing on social entropy and behavioral differences. Discuss limitations and introduce hierarchic social entropy. Conclude with applications and limitations.
Quantizing Behavioral Heterogeneity Jon Beckham 11/21/02
Papers to Cover • “Measuring Robot Group Diversity”, Balch • “Design & Evaluation of Robust Behavior-Based Controllers”, Goldberg & Mataric • “Symmetry in Markov Decision Processes and its Implications for Single Agent and Multiagent Learning”, Zinkevich & Balch
Quantizing • “Measuring Robot Group Diversity”, Tucker Balch
Purpose • To suggest a standard way of quantitatively measuring diversity. • Allows for more accurate, effective analysis. • By establishing a standard metric, we can establish a baseline for comparison.
Sources • Simple Social Entropy • Adapted from Shannon’s Information Entropy • Behavioral Difference • Quantitative measure between different robots. • Hierarchic Social Entropy • Combination of the above.
Diversity • To quote Tucker, who quotes Webster… di·verse adj. 1: differing from one another: unlike. 2: composed of distinct or unlike elements or qualities.
The Discrete Approach • Assume robots are either alike or different; thus assume subsets of identical robots.
Simple Social Entropy • First, some notation: • $R$ is a society of $N$ agents, thus $R = \{r_1, r_2, \ldots, r_N\}$ • $C$ is a classification of $R$ into $M$ subsets • $c_i$ is an individual subset of $C$ • Thus $C = \{c_1, c_2, \ldots, c_M\}$ • $p_i$ is the proportion of agents in the $i$th subset. • Thus, the sum of all $p_i$ is 1.
Social Entropy’s Requirements • Continuous (H must be continuous in pi) • Monotonic (H must be monotonically increasing function of M) • Recursive (H must be weighted sum of H of subsets) • H = 0 when system is homogeneous • H is maximized when all pi are equal for given M • Any change to pi to approach greater equality increases H.
Thus… • $H(X) = -K \sum_{i=1}^{M} p_i \log_2(p_i)$ • REMEMBER THIS! • Also know that it’s the only equation to satisfy the first three properties (as proven by Shannon in his information entropy work).
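A minimal sketch of the formula above, assuming the classification C is supplied as one label per robot and K = 1 by default. The function name `simple_social_entropy` and the caste labels are illustrative, not taken from the paper.

```python
import math
from collections import Counter


def simple_social_entropy(labels, k=1.0):
    """H = -K * sum_i p_i * log2(p_i), with p_i the share of robots in subset i."""
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one robot")
    h = 0.0
    for count in Counter(labels).values():
        p = count / n
        h -= p * math.log2(p)
    return k * h


# Hypothetical four-robot societies.
homogeneous = simple_social_entropy(["a", "a", "a", "a"])  # one subset
two_castes = simple_social_entropy(["a", "a", "b", "b"])   # equal split, M = 2
all_unique = simple_social_entropy(["a", "b", "c", "d"])   # equal split, M = 4
uneven = simple_social_entropy(["a", "a", "a", "b"])       # 3:1 split, M = 2
```

The four cases line up with the requirements slide: the homogeneous society scores 0, the equal splits score more as M grows, and the 3:1 split scores below the 2:2 split.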
Limitations of Simple Social Entropy • Loses data by munging pi and M into single value. • Only works for discrete systems.
What About C? • The classification into subsets… • Taxonomy • Clustering
More on Taxonomy • Classification at varying levels through a “dendrogram”.
Which Brings Us To Hierarchic Social Entropy • Simple Social Entropy is only a “snapshot” at a particular level of clustering. • To achieve a continuous metric, we use a plot of entropy at all taxonomic levels. • Good because it gives data at all clustering resolutions, putting to rest the clustering issue.
Another Formula • This time for hierarchic social entropy. • $S(R) = \int_0^\infty H(R,h)\,dh$
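A rough sketch of how the integral could be evaluated, under assumptions the slides do not state: pairwise behavioral differences are given as a symmetric matrix, and the clustering at level h is single-linkage (robots are chained together whenever a pairwise difference is at most h). This is a stand-in for the paper's clustering method, not a reproduction of it. With that choice H(R,h) is a step function, so the integral reduces to a finite sum. All function names are hypothetical.

```python
import math


def entropy_of_sizes(sizes):
    """Simple social entropy (K = 1) from a list of cluster sizes."""
    n = sum(sizes)
    return -sum((s / n) * math.log2(s / n) for s in sizes)


def cluster_sizes(dist, h):
    """Single-linkage cluster sizes: link robots whose difference is <= h."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a in range(n):
        for b in range(a + 1, n):
            if dist[a][b] <= h:
                parent[find(a)] = find(b)

    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return list(sizes.values())


def hierarchic_social_entropy(dist):
    """Integrate H(R, h) over h; H only changes at observed pairwise differences."""
    n = len(dist)
    levels = sorted({dist[a][b] for a in range(n) for b in range(a + 1, n)} | {0.0})
    total = 0.0
    for lo, hi in zip(levels, levels[1:]):
        total += entropy_of_sizes(cluster_sizes(dist, lo)) * (hi - lo)
    # Beyond the largest difference every robot is in one cluster, so H = 0.
    return total


# Hypothetical differences: r0 and r1 are similar, r2 is far from both.
dist = [
    [0.0, 0.2, 1.0],
    [0.2, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
s = hierarchic_social_entropy(dist)
```

In this example the society contributes log2(3) per unit of h while all three robots are distinct (h below 0.2), then the entropy of a 2:1 split until everything merges at h = 1.0.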
Branching the Taxonomy? • How to get that pretty 2D mapping… • Evaluation Chamber? • In real world, this requires: • Fixed policies • Mechanically Homogeneous • Policy is reflected directly in overt behavior
Placing Numerical Value on Behavioral Differences • More notation • $i$ is a robot’s perceptual state • $a$ is the action (behavioral assemblage) selected by a robot’s control system based on the input $i$. • $\pi_j$ is $r_j$’s policy; $a = \pi_j(i)$ • $p_i^j$ is the number of times $r_j$ has encountered perceptual state $i$ divided by the total number of times all states have been encountered
Simple Behavioral Difference Metric • Continuous • $D'(r_a, r_b) = \frac{1}{n} \int |\pi_a(i) - \pi_b(i)| \, di$ • Discrete • $D'(r_a, r_b) = \frac{1}{n} \sum_i |\pi_a(i) - \pi_b(i)|$ ($\frac{1}{n}$ is a normalization factor)
Behavioral Difference • Continuous • $D(r_a, r_b) = \int \frac{p_i^a + p_i^b}{2} |\pi_a(i) - \pi_b(i)| \, di$ • Discrete • $D(r_a, r_b) = \sum_i \frac{p_i^a + p_i^b}{2} |\pi_a(i) - \pi_b(i)|$
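A small sketch of the two discrete metrics, under assumptions that go beyond the slides: each policy is a dictionary from perceptual state to a named behavior, the term |π_a(i) − π_b(i)| is taken as 1 when the two robots pick different behaviors and 0 otherwise, and each robot's visit proportions are computed from its own visit counts. The state and behavior names are borrowed from the multiforaging experiment purely for illustration; the visit counts are made up.

```python
def simple_difference(policy_a, policy_b):
    """Unweighted metric: fraction of perceptual states where the actions differ."""
    states = list(policy_a)
    return sum(policy_a[i] != policy_b[i] for i in states) / len(states)


def state_proportions(visits):
    """Share of a robot's experience spent in each perceptual state."""
    total = sum(visits.values())
    return {i: count / total for i, count in visits.items()}


def behavioral_difference(policy_a, policy_b, visits_a, visits_b):
    """Weighted metric: disagreements count in proportion to how often they occur."""
    p_a = state_proportions(visits_a)
    p_b = state_proportions(visits_b)
    return sum(
        (p_a.get(i, 0.0) + p_b.get(i, 0.0)) / 2 * (policy_a[i] != policy_b[i])
        for i in policy_a
    )


# Hypothetical policies that disagree only when a blue object is visible.
policy_a = {"red_visible": "acquire_red", "blue_visible": "acquire_blue",
            "red_in_gripper": "deliver_red"}
policy_b = {"red_visible": "acquire_red", "blue_visible": "wander",
            "red_in_gripper": "deliver_red"}
visits_a = {"red_visible": 6, "blue_visible": 2, "red_in_gripper": 2}
visits_b = {"red_visible": 5, "blue_visible": 4, "red_in_gripper": 1}

d_simple = simple_difference(policy_a, policy_b)
d_weighted = behavioral_difference(policy_a, policy_b, visits_a, visits_b)
```

The unweighted value is 1/3 because one of three states differs, while the weighted value is 0.3, the average share of experience the two robots spend in the state where they disagree.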
Definitions • Absolutely behaviorally equivalent • Iff two robots select the same behavior in every perceptual state. • ε-equivalent if $D(r_a, r_b) < \varepsilon$. • $\equiv_\varepsilon$ indicates ε-equivalence • A group of robots, $R$, is ε-homogeneous if for all $r_a, r_b$ in $R$, $r_a \equiv_\varepsilon r_b$.
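The ε-homogeneity definition is a check over every pair of robots, which could be sketched as follows. The `difference` argument stands for any behavioral difference function D; the toy version here (fraction of states with differing actions, robots given as tuples of actions) is an assumption made only to keep the example self-contained.

```python
from itertools import combinations


def is_epsilon_homogeneous(robots, difference, epsilon):
    """True if every pair of robots satisfies difference(a, b) < epsilon."""
    return all(difference(a, b) < epsilon for a, b in combinations(robots, 2))


def toy_difference(a, b):
    """Hypothetical stand-in for D: share of states with different actions."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Each robot is a tuple of actions, one per perceptual state (illustrative only).
group = [
    ("wander", "acquire_red", "deliver_red", "stay_near_home"),
    ("wander", "acquire_red", "deliver_red", "wander"),
    ("wander", "acquire_red", "deliver_red", "stay_near_home"),
]
loose = is_epsilon_homogeneous(group, toy_difference, 0.5)
strict = is_epsilon_homogeneous(group, toy_difference, 0.1)
```

With these made-up policies the largest pairwise difference is 0.25, so the group is ε-homogeneous for ε = 0.5 but not for ε = 0.1.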
Experiments (briefly) • Multiforaging • Behaviors • wander • stay_near_home • acquire_red • acquire_blue • deliver_red • deliver_blue • Perceptual Features • red_visible • blue_visible • red_visible_outside_homezone • blue_visible_outside_homezone • red_in_gripper • blue_in_gripper • close_to_homezone • close_to_red_bin • close_to_blue_bin
Methods • Local performance-based reinforcement • Global performance-based reinforcement • Local shaped reinforcement
Summary • Diversity is good in soccer, bad in simple foraging. • Diversity • Globally Rewarded, most diverse • Locally Rewarded • Shaped, least diverse
Conclusions • Diversity as an independent variable • Simple social entropy • Hierarchic social entropy
Problems? • Only deterministic policies • Analysis limited to behavioral diversity
Applying • “Design and Evaluation of Robust Behavior-Based Controllers”, Dani Goldberg and Maja J. Mataric
The Goal • To design multirobot controllers that: • Exhibit group-level robustness to robot failures and noise. • Are easily modified.
Focus • Simple Foraging
Controllers • One Homogeneous • Two Heterogeneous • Pack • Caste
Homogeneous Controller • Act concurrently and independently. • Behaviors • Avoiding • Wandering • Puck Detecting • Puck Grabbing • Homing • Boundary • Buffer • Creeping • Home Detector • Exiting • Reverse Homing • Heading
Heterogeneous Pack Controller • Uses temporal arbitration • SPST → SPDT • Dominance hierarchy based on capabilities or arbitrary assignment • Only one robot can deliver a puck at a time • Same controller as homogeneous, but uses ‘message passing’ to figure out which robot should deliver first. • Uses communication to determine whether other robots have failed or are active.
Heterogeneous Caste Controller • Uses spatial arbitration • SPST → DPST • Robots are differentiated into sub-groups or castes • Act concurrently and independently, but in different regions of the task space • May have heterogeneous behavior in addition to spatial heterogeneity • No reliance on communication • (Not implemented, but communication could be used to balance caste ratios in case of failure.)
Interference Graphs • Homogeneous • Heterogeneous Pack • Heterogeneous Caste
Analysis Metrics • Inter-robot collisions • Distance traveled by each robot • Time-to-completion
Statistics… Goldberg & Mataric: “We have performed hypothesis tests using Student’s t, 1-factor analysis of variance (ANOVA), and 2-factor ANOVA, in order to verify that the differences between the results of the implementations were in fact statistically significant.” Tucker:
Conclusions • Attempted to apply Balch’s SSE and HSE, but because of vague definitions no clear conclusion could be reached. • Attempted several calculations, but found no conclusive relation to performance. • Partly because no controller was clearly best.
Flaws • Use of communication in Pack controller, but nowhere else. • Allowed pack controller to keep track of state of other robots (working or non-working).