Published by Jocelyn Saffer. Modified over 10 years ago.
1
Learning to Coordinate Behaviors
Pattie Maes & Rodney A. Brooks
Presented by: Javier Martinez
2
Introduction
Behavior-based system
Learning using positive and negative feedback
Behaviors decide when it is time to activate
Distributed algorithm
Test the concept in a robot
3
Motivation
Behavior control is a weak point of initial behavior-based systems
Behavior control has to be prewired
This approach doesn’t scale well
4
New Ideas
Behavior control is learned through experience
Learning algorithm completely distributed
Each behavior learns when to become active
The solution maximizes positive feedback and minimizes negative feedback
5
The Learning Task
What is needed:
Vector of binary perceptual conditions
Set of behaviors
Positive feedback generator
Negative feedback generator
6
The Learning Task
The task:
Change the precondition list of each behavior to maximize relevance and reliability
7
The Learning Task
Constraints:
Relevance: behavior correlated with positive feedback, not correlated with negative feedback
Reliability: behavior receives consistent feedback
8
The Learning Task
More constraints:
Algorithm should deal with noise
Perform in real time
Support readaptation
9
The Learning Task
Assumptions:
At least one combination of preconditions is bounded
Feedback is immediate
Only combinations of conditions can be learned
10
Algorithm
Measure:
Number of times a positive/negative feedback did/didn’t happen when a behavior was/wasn’t active
Calculate the correlation between positive/negative feedback and the status of the behavior
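The counting step on this slide can be illustrated with a small example. This is a minimal sketch, assuming the correlation is the standard phi coefficient of a 2x2 contingency table (the slides do not give the formula); the class name FeedbackCounts and its field names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class FeedbackCounts:
    """Counts for one behavior and one feedback signal (positive or negative)."""
    active_fb: int = 0        # behavior active, feedback received
    active_no_fb: int = 0     # behavior active, no feedback
    inactive_fb: int = 0      # behavior inactive, feedback received
    inactive_no_fb: int = 0   # behavior inactive, no feedback

    def update(self, active: bool, feedback: bool) -> None:
        """Record one time step."""
        if active and feedback:
            self.active_fb += 1
        elif active:
            self.active_no_fb += 1
        elif feedback:
            self.inactive_fb += 1
        else:
            self.inactive_no_fb += 1

    def correlation(self) -> float:
        """Phi coefficient between 'behavior active' and 'feedback received'.

        Assumption: the slides only say "correlation"; phi is one standard
        choice for two binary variables. Returns 0.0 when undefined.
        """
        j, l = self.active_fb, self.active_no_fb
        k, m = self.inactive_fb, self.inactive_no_fb
        denom = math.sqrt((j + l) * (k + m) * (j + k) * (l + m))
        if denom == 0:
            return 0.0
        return (j * m - l * k) / denom
```

One such table would be kept per behavior for positive feedback and another for negative feedback, so each behavior can update its own counts without any central coordinator.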
11
Algorithm
Measure:
Express relevance and reliability in terms of this correlation
Relevance controls whether a behavior should be active or not
Reliability decides whether the behavior should try to improve itself
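The two measures on this slide could be computed along the following lines. This is a sketch under stated assumptions, not a formula given in the slides: relevance is taken as the correlation with positive feedback minus the correlation with negative feedback, and reliability as how consistent the feedback is while the behavior is active. The function names and the 0.95 threshold are hypothetical.

```python
def relevance(corr_positive: float, corr_negative: float) -> float:
    """High when the behavior correlates with positive feedback and not with negative.

    Assumed formulation: difference of the two correlations.
    """
    return corr_positive - corr_negative


def consistency(fb_count: int, no_fb_count: int) -> float:
    """Share of the majority outcome (feedback or no feedback) while active."""
    total = fb_count + no_fb_count
    if total == 0:
        return 0.0
    return max(fb_count, no_fb_count) / total


def reliability(pos_fb: int, pos_no_fb: int, neg_fb: int, neg_no_fb: int) -> float:
    """The behavior is only as reliable as its least consistent feedback signal."""
    return min(consistency(pos_fb, pos_no_fb), consistency(neg_fb, neg_no_fb))


def should_try_to_improve(reliability_value: float, threshold: float = 0.95) -> bool:
    """A behavior keeps refining its preconditions while it is unreliable.

    The threshold value is an illustrative assumption.
    """
    return reliability_value < threshold
```

For example, a behavior that gets positive feedback in 9 of 10 active steps and never gets negative feedback has reliability 0.9 under this sketch, so it would keep looking for a better precondition list.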
12
Algorithm
Measure:
Improvement is done by monitoring a perceptual condition
If reliability increases, the condition is added to the list of preconditions
Keep cycling through the conditions until reliability reaches the threshold
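The monitoring loop on this slide might look like the sketch below. It is illustrative only and makes several simplifying assumptions: it works offline on recorded trials instead of online on a robot, reliability is reduced to the fraction of matching trials that received positive feedback, and a condition (or its negation) is adopted as soon as it raises that fraction. All names and the 0.95 threshold are hypothetical.

```python
def success_rate(outcomes):
    """Fraction of trials with positive feedback (0.0 for an empty list)."""
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


def matching(trials, preconditions):
    """Keep the trials whose condition vector satisfies every precondition."""
    return [
        (conds, fb)
        for conds, fb in trials
        if all(conds[i] == value for i, value in preconditions.items())
    ]


def learn_preconditions(trials, n_conditions, threshold=0.95):
    """Cycle through the conditions, adopting those that improve the success rate.

    trials: list of (conditions, positive_feedback) pairs recorded while the
    behavior was active; conditions is a tuple of booleans.
    Returns a dict mapping condition index to its required value.
    """
    preconditions = {}
    changed = True
    while changed:                      # stop after a full cycle with no gain
        changed = False
        for index in range(n_conditions):
            relevant = matching(trials, preconditions)
            current = success_rate([fb for _, fb in relevant])
            if current >= threshold:    # reliable enough: stop monitoring
                return preconditions
            if index in preconditions:
                continue
            for value in (True, False): # try the condition, then its negation
                subset = [fb for conds, fb in relevant if conds[index] == value]
                if subset and success_rate(subset) > current:
                    preconditions[index] = value
                    changed = True
                    break
    return preconditions
```

With trials in which feedback is positive only when condition 0 is on and condition 1 is off, the loop first adopts condition 0, then the negation of condition 1, and stops once the remaining trials all succeed.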
13
Genghis
Six-legged robot that walks forward
12 behaviors, 6 conditions, 8742 nodes
4 eight-bit microprocessors, 32 KB memory
The challenge is to learn how to coordinate the legs to produce a forward movement
14
Results
Convergence time:
Non-intelligent search during the monitoring stage: 10 min
Intelligent search: 1 min 45 s
A “tripod” gait emerged, which is common among six-legged insects
15
Conclusions
A learning algorithm was developed that allows a behavior-based robot to learn when its behaviors should become active, using positive and negative feedback
16
Comments
+ Impressive results
+ Global behavior (walking) emerges from coordinated behaviors
+ Simple idea, powerful consequences. The robot learned how to walk; it wasn’t taught
17
Comments
- Dead behaviors don’t revive, even though they might be useful in other situations
? How to deal with concurrent actions (e.g. walking and following a target)?