Learning to Coordinate Behaviors Pattie Maes & Rodney A. Brooks Presented by: Javier Martinez.

Introduction
- Behavior-based system
- Learning using positive and negative feedback
- Behaviors decide when it is time to activate
- Distributed algorithm
- Test the concept on a robot

Motivation
- Behavior control is a weak point of early behavior-based systems
- Behavior control has to be prewired
- This approach doesn’t scale well

New Ideas
- Behavior control is learned through experience
- The learning algorithm is completely distributed
- Each behavior learns when to become active
- The solution maximizes positive feedback and minimizes negative feedback

The Learning Task
What is needed:
- Vector of binary perceptual conditions
- Set of behaviors
- Positive feedback generator
- Negative feedback generator

The Learning Task
The task:
- Change the precondition list of each behavior to maximize relevance and reliability

The Learning Task
Constraints:
- Relevance: the behavior is correlated with positive feedback and not correlated with negative feedback
- Reliability: the behavior receives consistent feedback

The Learning Task
More constraints: the algorithm should
- Deal with noise
- Perform in real time
- Support readaptation

The Learning Task
Assumptions:
- At least one combination of preconditions is bounded
- Feedback is immediate
- Only combinations of conditions can be learned

Algorithm
Measure:
- For each behavior, count how often positive/negative feedback did or didn’t happen while the behavior was or wasn’t active
- Calculate the correlation between positive/negative feedback and the status of the behavior
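The counting step above can be sketched as a 2x2 contingency table per behavior and per feedback signal. This is a minimal illustration, not the presenters' code: the class name `FeedbackCounts`, its field names, and the choice of the phi coefficient (the Pearson correlation of two binary variables) as the correlation measure are assumptions made here.

```python
import math
from dataclasses import dataclass


@dataclass
class FeedbackCounts:
    """Hypothetical 2x2 counts for one behavior and one feedback signal."""
    active_fb: int = 0    # behavior active, feedback received
    active_nofb: int = 0  # behavior active, no feedback
    idle_fb: int = 0      # behavior not active, feedback received
    idle_nofb: int = 0    # behavior not active, no feedback

    def update(self, active: bool, feedback: bool) -> None:
        """Record one time step."""
        if active and feedback:
            self.active_fb += 1
        elif active:
            self.active_nofb += 1
        elif feedback:
            self.idle_fb += 1
        else:
            self.idle_nofb += 1

    def correlation(self) -> float:
        """Phi coefficient in [-1, 1]; 0.0 while any row or column is empty."""
        a, b = self.active_fb, self.active_nofb
        c, d = self.idle_fb, self.idle_nofb
        denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
        if denom == 0:
            return 0.0  # assumed convention: no evidence means no correlation
        return (a * d - b * c) / denom
```

One such table would be kept for positive feedback and another for negative feedback, so each behavior can compute both correlations locally, in keeping with the distributed design.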

Algorithm
Measure:
- Express relevance and reliability in terms of this correlation
- Relevance controls whether a behavior should be active or not
- Reliability decides whether the behavior should try to improve itself
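One way to read the two quantities is sketched below. The slides do not give the formulas, so everything here is an assumption for illustration: relevance is taken as the positive-feedback correlation minus the negative-feedback correlation, reliability as how consistent the feedback is while the behavior is active, and the 0.9 threshold is an arbitrary placeholder.

```python
def relevance(corr_pos: float, corr_neg: float) -> float:
    """Assumed form: high when activity tracks positive but not negative feedback."""
    return corr_pos - corr_neg


def consistency(active_fb: int, active_nofb: int) -> float:
    """Share of the majority outcome while the behavior was active."""
    total = active_fb + active_nofb
    if total == 0:
        return 0.0  # no observations yet, so nothing is consistent
    return max(active_fb, active_nofb) / total


def reliability(pos_counts: tuple, neg_counts: tuple) -> float:
    """Assumed form: the weaker of the two feedback consistencies.

    Each argument is (times feedback arrived, times it did not) while active.
    """
    return min(consistency(*pos_counts), consistency(*neg_counts))


def should_try_to_improve(rel: float, threshold: float = 0.9) -> bool:
    """A behavior keeps refining its preconditions while reliability is low."""
    return rel < threshold
```

For example, a behavior that received positive feedback in 9 of 10 active steps and never received negative feedback would get `reliability((9, 1), (0, 10))`, which is bounded by the less consistent of the two signals.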

Algorithm
Measure:
- Improvement is done by monitoring a perceptual condition
- If reliability increases, the condition is added to the behavior’s list of preconditions
- Keep cycling through the conditions until reliability reaches the threshold
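The monitoring cycle can be sketched as follows. This is a simplified stand-in, not the published algorithm: the class `Behavior`, the parameters `min_samples` and `margin`, and the use of a gap in positive-feedback rate (rather than a reliability measure) to decide whether a condition is worth adopting are all hypothetical.

```python
from dataclasses import dataclass, field


def _empty_stats() -> dict:
    # value of monitored condition -> [positive feedback count, total count]
    return {True: [0, 0], False: [0, 0]}


@dataclass
class Behavior:
    n_conditions: int
    preconditions: dict = field(default_factory=dict)  # index -> required value
    monitored: int = 0                                  # condition under watch
    stats: dict = field(default_factory=_empty_stats)

    def is_applicable(self, conditions) -> bool:
        """True when every learned precondition holds."""
        return all(bool(conditions[i]) == v for i, v in self.preconditions.items())

    def observe(self, conditions, positive_feedback: bool) -> None:
        """Record one step in which this behavior was active."""
        entry = self.stats[bool(conditions[self.monitored])]
        entry[0] += int(positive_feedback)
        entry[1] += 1

    def review(self, min_samples: int = 10, margin: float = 0.5) -> None:
        """Adopt the monitored condition if it separates outcomes, then move on."""
        on, off = self.stats[True], self.stats[False]
        if on[1] < min_samples or off[1] < min_samples:
            return  # not enough evidence for this condition yet
        gap = on[0] / on[1] - off[0] / off[1]
        if abs(gap) >= margin:
            # Require the condition (gap > 0) or its negation (gap < 0).
            self.preconditions[self.monitored] = gap > 0
        self._advance()

    def _advance(self) -> None:
        """Move to the next condition that is not yet a precondition."""
        for step in range(1, self.n_conditions + 1):
            candidate = (self.monitored + step) % self.n_conditions
            if candidate not in self.preconditions:
                self.monitored = candidate
                break
        self.stats = _empty_stats()
```

Each behavior runs this loop on its own data only, which is what keeps the learning distributed. Conditions are visited in a fixed order here, which corresponds to the non-intelligent search mentioned in the results.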

Genghis
- Six-legged robot that walks forward
- 12 behaviors, 6 conditions, 8742 nodes
- 4 eight-bit microprocessors, 32 KB memory
- The challenge is to learn how to coordinate the legs to produce forward movement

Results
Convergence time:
- Non-intelligent search during the monitoring stage: 10 min
- Intelligent search: 1 min 45 s
A “tripod” gait emerged, which is common among six-legged insects

Conclusions
- A learning algorithm was developed that allows a behavior-based robot to learn, from positive and negative feedback, when its behaviors should become active

Comments
+ Impressive results
+ Global behavior (walking) emerges from coordinated behaviors
+ Simple idea, powerful consequences: the robot learned how to walk; it wasn’t taught

Comments
– Dead behaviors don’t revive, although they might be useful in other situations
? How to deal with concurrent actions (e.g. walking and following a target)?
