Published by Richard Golden. Modified over 9 years ago.
1
Quantizing Behavioral Heterogeneity Jon Beckham 11/21/02
2
Papers to Cover
“Measuring Robot Group Diversity”, Balch
“Design & Evaluation of Robust Behavior-Based Controllers”, Goldberg & Mataric
“Symmetry in Markov Decision Processes and its Implications for Single Agent and Multiagent Learning”, Zinkevich & Balch
3
Quantizing “Measuring Robot Group Diversity”, Tucker Balch
4
Purpose To suggest a standard way of quantitatively measuring diversity. Allows for more accurate, effective analysis. By establishing a standard metric, we can establish a baseline for comparison.
5
Sources
Simple Social Entropy: adapted from Shannon’s information entropy.
Behavioral Difference: a quantitative measure of the difference between robots.
Hierarchic Social Entropy: a combination of the above.
6
Diversity
To quote Tucker, who quotes Webster…
diverse (adj.) 1: differing from one another: unlike. 2: composed of distinct or unlike elements or qualities.
7
The Discrete Approach Assume robots are either alike or different; thus assume subsets of identical robots.
8
Simple Social Entropy
First, some notation:
$R$ is a society of $N$ agents, thus $R = \{r_1, r_2, \ldots, r_N\}$.
$C$ is a classification of $R$ into $M$ subsets.
$c_i$ is an individual subset of $C$, thus $C = \{c_1, c_2, \ldots, c_M\}$.
$p_i$ is the proportion of agents in the $i$th subset, thus $\sum_{i=1}^{M} p_i = 1$.
9
Social Entropy’s Requirements
Continuous ($H$ must be continuous in the $p_i$)
Monotonic ($H$ must be a monotonically increasing function of $M$ when the $p_i$ are equal)
Recursive ($H$ must be a weighted sum of the $H$ of the subsets)
$H = 0$ when the system is homogeneous
$H$ is maximized when all $p_i$ are equal for a given $M$
Any change to the $p_i$ that moves them toward greater equality increases $H$.
10
Thus…
$H(X) = -K \sum_{i=1}^{M} p_i \log_2(p_i)$
REMEMBER THIS! Also know that it’s the only equation to satisfy the first three properties (as proven by Shannon in his information entropy work).
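The formula above can be sketched in a few lines of Python. This is only an illustration: the function name, the default K = 1, and the subset labels are assumptions made for the example, not something the slides specify.

```python
import math
from collections import Counter

def simple_social_entropy(labels, k=1.0):
    """Return H = -K * sum(p_i * log2(p_i)) for a classification.

    `labels` holds one subset label per agent (hypothetical input format);
    p_i is the fraction of agents carrying the i-th label.
    """
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one agent")
    h = 0.0
    for count in Counter(labels).values():
        p = count / n
        h -= p * math.log2(p)
    return k * h

# Four agents under three different classifications.
homogeneous = simple_social_entropy(["a", "a", "a", "a"])  # one subset
even_split = simple_social_entropy(["a", "a", "b", "b"])   # two equal subsets
uneven_split = simple_social_entropy(["a", "a", "a", "b"])  # two unequal subsets
all_unique = simple_social_entropy(["a", "b", "c", "d"])   # four subsets
```

With these inputs the homogeneous group scores 0, the even split scores 1 bit, the uneven split scores less than the even split, and four singleton subsets score 2 bits, which matches the requirements listed on the previous slide.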
11
Limitations of Simple Social Entropy Loses data by munging p i and M into single value. Only works for discrete systems.
12
What About C? The classification into subsets… Taxonomy Clustering
13
More on Taxonomy Classification at varying levels through a “dendrogram”.
14
Which Brings Us To Hierarchic Social Entropy Simple Social Entropy is only a “snapshot” at a particular level of clustering. To achieve a continuous metric, we use a plot of entropy at all taxonomic levels. Good because it gives data at all clustering resolutions, putting to rest the clustering issue.
15
Another Formula
This time for hierarchic social entropy.
$S(R) = \int_0^\infty H(R,h)\,dh$
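A rough sketch of how this integral could be evaluated for a finite group follows. It assumes that, at taxonomic level h, two robots fall in the same cluster when a chain of pairwise differences no larger than h connects them (single-linkage style clustering); the slides do not say which clustering algorithm is used, so that choice and all function names are assumptions. Under that assumption H(R, h) is a step function of h that only changes at the pairwise difference values, so the integral reduces to a finite sum.

```python
import math
from collections import Counter

def entropy(labels):
    """Simple social entropy (K = 1) of a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def clusters_at(dist, h):
    """Label each robot by its connected component when pairs with
    dist <= h are linked (assumed clustering rule, via union-find)."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a in range(n):
        for b in range(a + 1, n):
            if dist[a][b] <= h:
                parent[find(a)] = find(b)
    return [find(x) for x in range(n)]

def hierarchic_social_entropy(dist):
    """Integrate H(R, h) over h >= 0 for a symmetric difference matrix."""
    n = len(dist)
    # H(R, h) can only change at h = 0 or at a pairwise difference value.
    cuts = sorted({0.0} | {dist[a][b] for a in range(n) for b in range(a + 1, n)})
    total = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        # The clustering is constant on [lo, hi), so H is too.
        total += entropy(clusters_at(dist, lo)) * (hi - lo)
    # Beyond the largest difference all robots share one cluster, so H = 0.
    return total

# Hypothetical difference matrix: robots 0 and 1 are similar, robot 2 is not.
DIST = [
    [0.0, 0.2, 1.0],
    [0.2, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
hse = hierarchic_social_entropy(DIST)
```

For the example matrix, the three robots are separate up to h = 0.2, form two clusters up to h = 1.0, and merge into one cluster after that, so only the first two intervals contribute.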
16
Branching the Taxonomy? How to get that pretty 2D mapping… Evaluation Chamber? In real world, this requires: Fixed policies Mechanically Homogeneous Policy is reflected directly in overt behavior
17
Placing Numerical Value on Behavioral Differences
More notation:
$i$ is a robot’s perceptual state.
$a$ is the action (behavioral assemblage) selected by a robot’s control system based on the input $i$.
$\pi_j$ is $r_j$’s policy; $a = \pi_j(i)$.
$p_i^j$ is the number of times $r_j$ has encountered perceptual state $i$ divided by the total number of times all states have been encountered.
18
Simple Behavioral Difference Metric
Continuous: $D'(r_a, r_b) = \frac{1}{n} \int |\pi_a(i) - \pi_b(i)| \, di$
Discrete: $D'(r_a, r_b) = \frac{1}{n} \sum_i |\pi_a(i) - \pi_b(i)|$
($1/n$ is a normalization factor)
19
Behavioral Difference
Continuous: $D(r_a, r_b) = \int \frac{p_i^a + p_i^b}{2} |\pi_a(i) - \pi_b(i)| \, di$
Discrete: $D(r_a, r_b) = \sum_i \frac{p_i^a + p_i^b}{2} |\pi_a(i) - \pi_b(i)|$
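The two discrete forms can be sketched as follows. The sketch assumes policies are lookup tables over a shared, finite set of perceptual states, and that the difference between two actions is 1 when they differ and 0 when they match; the slides leave the action difference unspecified, so that convention, the state names, and the function names are all assumptions.

```python
def simple_difference(policy_a, policy_b):
    """Unweighted metric: fraction of perceptual states where actions differ."""
    states = list(policy_a)
    return sum(policy_a[i] != policy_b[i] for i in states) / len(states)

def behavioral_difference(policy_a, policy_b, freq_a, freq_b):
    """Frequency-weighted metric: each state counts in proportion to how
    often the two robots encountered it, (p_i^a + p_i^b) / 2."""
    return sum(
        (freq_a[i] + freq_b[i]) / 2 * (policy_a[i] != policy_b[i])
        for i in policy_a
    )

# Hypothetical policies over four perceptual states; they differ only in s1.
POLICY_A = {"s0": "wander", "s1": "acquire_red", "s2": "deliver_red", "s3": "wander"}
POLICY_B = {"s0": "wander", "s1": "acquire_blue", "s2": "deliver_red", "s3": "wander"}

# Hypothetical visit frequencies; each robot's values sum to 1.
FREQ_A = {"s0": 0.5, "s1": 0.25, "s2": 0.125, "s3": 0.125}
FREQ_B = {"s0": 0.5, "s1": 0.125, "s2": 0.25, "s3": 0.125}

unweighted = simple_difference(POLICY_A, POLICY_B)
weighted = behavioral_difference(POLICY_A, POLICY_B, FREQ_A, FREQ_B)
```

Here the unweighted metric is 1/4 because one of four states differs, while the weighted metric is smaller because the robots rarely visit the state in which they disagree.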
20
Definitions
Two robots are absolutely behaviorally equivalent iff they select the same behavior in every perceptual state.
$r_a$ and $r_b$ are ε-equivalent if $D(r_a, r_b) < \varepsilon$; $\equiv_\varepsilon$ indicates ε-equivalence.
A group of robots $R$ is ε-homogeneous if, for all $r_a, r_b$ in $R$, $r_a \equiv_\varepsilon r_b$.
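As a small illustration of these definitions, the check below tests ε-homogeneity given a precomputed matrix of pairwise behavioral differences. The matrix values and function name are made up for the example.

```python
from itertools import combinations

def is_epsilon_homogeneous(dist, epsilon):
    """True if every pair of robots has behavioral difference < epsilon.

    `dist` is a symmetric matrix of pairwise differences (hypothetical input).
    """
    robots = range(len(dist))
    return all(dist[a][b] < epsilon for a, b in combinations(robots, 2))

# Hypothetical pairwise differences for three robots.
PAIRWISE = [
    [0.0, 0.1, 0.4],
    [0.1, 0.0, 0.35],
    [0.4, 0.35, 0.0],
]

loose = is_epsilon_homogeneous(PAIRWISE, 0.5)  # every pair is below 0.5
tight = is_epsilon_homogeneous(PAIRWISE, 0.4)  # 0.4 is not strictly below 0.4
```

Because the inequality is strict, a pair whose difference equals ε exactly is not ε-equivalent, which is why the second check fails.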
21
Experiments (briefly)
Multiforaging
Behaviors: wander, stay_near_home, acquire_red, acquire_blue, deliver_red, deliver_blue
Perceptual features: red_visible, blue_visible, red_visible_outside_homezone, blue_visible_outside_homezone, red_in_gripper, blue_in_gripper, close_to_homezone, close_to_red_bin, close_to_blue_bin
22
Methods Local performance-based reinforcement Global performance-based reinforcement Local shaped reinforcement
23
Results
24
Summary
Diversity is good in soccer, bad in simple foraging.
Diversity by reinforcement method: globally rewarded, most diverse; locally rewarded; shaped, least diverse.
25
Conclusions Diversity as an independent variable Simple social entropy Hierarchic social entropy
26
Problems? Only deterministic policies Analysis limited to behavioral diversity
27
Applying “Design and Evaluation of Robust Behavior-Based Controllers”, Dani Goldberg and Maja J. Mataric
28
The Goal To design multirobot controllers that: Exhibit group-level robustness to robot failures and noise. Are easily modified.
29
Focus Simple Foraging
30
Controllers One Homogeneous Two Heterogeneous Pack Caste
31
Homogeneous Controller Act concurrently and independently. Behaviors Avoiding Wandering Puck Detecting Puck Grabbing Homing Boundary Buffer Creeping Home Detector Exiting Reverse Homing Heading
32
Heterogeneous Pack Controller
Uses temporal arbitration
SPST → SPDT
Dominance hierarchy based on capabilities or arbitrary assignment
Only one robot can deliver a puck at a time
Same controller as homogeneous, but uses ‘message passing’ to figure out which robot should deliver first.
Uses communication to determine whether each robot has failed or is active.
33
Heterogeneous Caste Controller
Uses spatial arbitration
SPST → DPST
Robots are differentiated into sub-groups or castes
Act concurrently and independently, but in different regions of the task space
May have heterogeneous behavior in addition to spatial heterogeneity
No reliance on communication (Not implemented, but communication could be used to balance caste ratios in case of failure.)
34
Interference Graphs Homogeneous Heterogeneous Pack Heterogeneous Caste
35
Analysis Metrics Inter-robot collisions Distance traveled by each robot Time-to-completion
36
Statistics… Goldberg & Mataric: “We have performed hypothesis tests using Student’s t, 1-factor analysis of variance (ANOVA), and 2-factor ANOVA, in order to verify that the differences between the results of the implementations were in fact statistically significant.” Tucker:
37
Results
38
Conclusions
Attempted to apply Balch’s simple social entropy (SSE) and hierarchic social entropy (HSE), but because of vague definitions no clear conclusion could be reached.
Attempted several calculations, but found no conclusive relation to performance, partly because no controller was clearly best.
39
Flaws
Communication is used in the Pack controller, but nowhere else. This allowed the Pack controller to keep track of the state of other robots (working or non-working).