The RoboCup Standard Platform League : Soccer for robots Requires fast, stable, intelligent robots Robots wear out and are time consuming to work with.



2 The RoboCup Standard Platform League: soccer for robots. Requires fast, stable, intelligent robots. Robots wear out and are time-consuming to work with. Therefore, use machine learning.

3 Explore robot: modified walk. Only walk forward and turn. Record joint commands and joint angles at each frame. Learn a mapping from (Joint Angles, Joint Commands) to Next Joint Angles. RAE: Relative Absolute Error. RRSE: Root Relative Squared Error.
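The two error measures named on this slide can be sketched as follows. The slide gives no formulas, so this assumes the usual definitions, in which the model's error is divided by the error of always predicting the mean of the actual values. The function names and the sample joint-angle values are hypothetical.

```python
import math

def relative_absolute_error(actual, predicted):
    """RAE: total absolute error relative to a predict-the-mean baseline."""
    mean_actual = sum(actual) / len(actual)
    numerator = sum(abs(p - a) for a, p in zip(actual, predicted))
    denominator = sum(abs(a - mean_actual) for a in actual)
    return numerator / denominator

def root_relative_squared_error(actual, predicted):
    """RRSE: square root of squared error relative to the same baseline."""
    mean_actual = sum(actual) / len(actual)
    numerator = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    denominator = sum((a - mean_actual) ** 2 for a in actual)
    return math.sqrt(numerator / denominator)

# Hypothetical next-joint-angle values (radians), for illustration only.
actual = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.39]
rae = relative_absolute_error(actual, predicted)
rrse = root_relative_squared_error(actual, predicted)
```

Under these definitions a value of 0 means perfect prediction, and a value of 1 means the learned mapping does no better than predicting the mean joint angle.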

4 Many talking robots exist, but they are still very primitive. Actors for robot theatre; agents for advertisement, education and entertainment. Designing inexpensive natural-size humanoid caricature and realistic robot heads. Machine learning techniques are used to teach robots behaviors, natural language dialogs and facial gestures. [Image: Dog.com from Japan]

5 Robot activity as a mapping of the sensed environment and internal states to behaviors and new internal states (emotions, energy levels, etc.). Words communicate only about 35% of the information transmitted from a sender to a receiver in human-to-human communication. The remaining information is carried by paralanguage. Emotions, thoughts, decisions and intentions of a speaker can be recognized earlier than they are verbalized.

6 Neck and upper body movement generation

7 Robot Vision: 1. Where is a face? (Face detection.) 2. Who is this person? (Face recognition; learning with a supervisor, the person’s name is given in the process.) 3. Age and gender of the person. 4. Hand gestures. 5. Emotions expressed as facial gestures (smile, eye movements, etc.). 6. Objects held by the person. 7. Lip reading for speech recognition. 8. Body language.

8 Speech recognition: 1. Who is this person? (Voice-based speaker recognition; learning with a supervisor, the person’s name is given in the process.) 2. Isolated word recognition for word spotting. 3. Sentence recognition. Sensors: 1. Temperature. 2. Touch. 3. Movement.

9  Facial and upper body gestures: 1. Face/neck gesticulation for interactive dialog. 2. Face/neck gesticulation for theatre plays. 3. Face/neck gesticulation for singing/dancing.  Hand gestures and manipulation. 1. Hand gesticulation for interactive dialog. 2. Hand gesticulation for theatre plays. 3. Hand gesticulation for singing/dancing.

10 1. Tracking the human. 2. Full gesticulation as a response to human behavior in dialogs and dancing/singing. 3. Modification of semi-autonomous behaviors such as breathing, eye blinking, mechanical hand withdrawals, and speech acts in response to the person’s behavior. 4. Playing games with humans. 5. Body contact with humans, such as safe gesticulation close to a human and hand shaking.

11  Supervised learning techniques  Reinforcement Learning and Adaptive Control  Multi-agent Learning  Autonomous Science

12 [Diagram: probabilistic state machine with states Happy, Ironic, Unhappy.] Transitions (input / response, probability): “you are beautiful” / “Thanks for a compliment”, P=1. “you are blonde!” / “I am not an idiot”, P=0.3. “you are blonde!” / “Do you suggest I am an idiot?”, P=0.7.
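The probabilistic choice of a reply on this slide can be sketched as weighted random sampling. The utterances, replies and probabilities are taken from the slide; the table layout and the function name `choose_reply` are assumptions made for illustration. The transitions between the Happy, Ironic and Unhappy states are not modeled, because the diagram's arrows are not recoverable from the transcript.

```python
import random

# Utterance -> list of (reply, probability), as given on the slide.
RESPONSES = {
    "you are beautiful": [("Thanks for a compliment", 1.0)],
    "you are blonde!": [
        ("I am not an idiot", 0.3),
        ("Do you suggest I am an idiot?", 0.7),
    ],
}

def choose_reply(utterance, rng=random):
    """Pick one reply for the utterance according to its probabilities."""
    options = RESPONSES[utterance]
    replies = [reply for reply, _ in options]
    weights = [probability for _, probability in options]
    return rng.choices(replies, weights=weights, k=1)[0]

# A seeded generator makes the behavior repeatable during testing.
rng = random.Random(0)
reply = choose_reply("you are blonde!", rng)
```

Sampling instead of always taking the most likely reply keeps the robot's dialog from being identical every time the same sentence is heard.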

13 [Diagram: probabilistic state machine with states Who?, What?, Where?] Outputs, in the order extracted: Speak “Professor Perky”, blinks eyes twice. Speak “In the classroom”, shakes head (P=0.1). Speak “Was drinking wine” (P=0.1, P=0.3, P=0.5). Speak “Professor Perky”. Speak “Doctor Lee”. Speak “in some location”, smiles broadly. Speak “Was singing and dancing” (P=0.5, P=0.1, …, P=0.1).

14 The dialog/behavior has the following components: (1) Eliza-like natural language dialogs based on pattern matching and limited parsing. This is the “conversational” part of the robot brain, based on pattern matching, parsing and blackboard principles. It is also a kind of “operating system” of the robot, which supervises other subroutines.

15 (2) Subroutines with a logical database and natural language parsing (CHAT). This is the logical part of the brain, used to find connections between places and timings and for all kinds of logical and relational reasoning. (3) Use of generalization and analogy in dialog on many levels. Use of the Constructive Induction approach to help generalization, analogical reasoning and probabilistic generation in verbal and non-verbal dialog, such as learning when to smile or turn the head away from the partner.

16  (4) Model of the robot, model of the user, scenario of the situation, history of the dialog, all used in the conversation.  (5) Use of word spotting in speech recognition rather than single word or continuous speech recognition.  (6) Avoidance of “I do not know”, “I do not understand” answers from the robot.

17 Example “Age Recognition”: examples of data for learning, four people, given to the system

Name (examples)   Age (output) d   Smile   Height   Hair Color
Ahmed             Kid (0)          a(3)    b(0)     c(0)
Aya               Teenager (1)     a(2)    b(1)     c(1)
Rana              Mid-age (2)      a(1)    b(2)     c(2)
AbdElRahman       Old (3)          a(0)    b(3)     c(3)

18 Example “Age Recognition”: encoding of features, values of multiple-valued variables

Value        3            2        1            0
Smile - a    Very often   Often    Moderately   Rarely
Height - b   Very Tall    Tall     Middle       Short
Color - c    Grey         Black    Brown        Blonde

19 d = F(a, b, c). Groups show a simple induction from the data.

ab \ c   0  1  2  3
00       -  -  -  -
01       -  -  -  3
02       -  -  -  -
03       -  -  -  -
10       -  -  -  -
11       -  -  -  -
12       -  -  2  -
13       -  -  -  -
20       -  -  -  -
21       -  1  -  -
22       -  -  -  -
23       -  -  -  -
30       0  -  -  -
31       -  -  -  -
32       -  -  -  -
33       -  -  -  -

20 Groups show a simple induction from the data (same map d = F(a, b, c) as on slide 19). Middle-age people smile moderately. Teenagers smile often. Children smile very often. Hair-color labels on the map: grey hair, blonde hair.
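The "simple induction" from the four training examples can be sketched as a search for a single input variable that determines the output on its own. The data are the four people from the examples table (slide 17); the function name and the dictionary layout are hypothetical, and this is only one very simple way to induce rules, not necessarily the decomposition method used in the presentation.

```python
# Training examples: name -> ((a=smile, b=height, c=hair color), d=age).
EXAMPLES = {
    "Ahmed": ((3, 0, 0), 0),
    "Aya": ((2, 1, 1), 1),
    "Rana": ((1, 2, 2), 2),
    "AbdElRahman": ((0, 3, 3), 3),
}
AGE_LABELS = ["Kid", "Teenager", "Mid-age", "Old"]
SMILE_LABELS = {3: "very often", 2: "often", 1: "moderately", 0: "rarely"}

def rule_from_single_variable(examples, index):
    """Return {value: age} if variable `index` alone determines age, else None."""
    rule = {}
    for features, age in examples.values():
        value = features[index]
        # setdefault stores the first age seen for this value; a later
        # example with a different age means the variable is not sufficient.
        if rule.setdefault(value, age) != age:
            return None
    return rule

smile_rule = rule_from_single_variable(EXAMPLES, 0)
for value, age in sorted(smile_rule.items(), reverse=True):
    print(f"smile {SMILE_LABELS[value]} -> {AGE_LABELS[age]}")
```

With only four examples every variable passes this check, which is why more data, or a preference for simpler decompositions, is needed to decide which generalization to trust.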

21 Input variables Output variables

22 Tables of this kind are known from Rough Sets, Decision Trees and other Data Mining methods.

23 Decomposition is hierarchical. At every step many decompositions exist. Which decomposition is better? [Figure: original table, first variant of decomposition, second variant]

24 Man’s design versus robot’s design. The humanoid robot is versatile and adaptive; it takes its form from a human, a design well verified by Nature. Complete isomorphism of a humanoid robot with a human is very difficult to achieve (walking) and not even entirely desired. All we need is to adapt the robot maximally to the needs of humans: elderly, disabled, children, entertainment. Replicating human motor or sensor functionality is based on mechanistic methodologies, but adaptations and upgrades are possible, for instance brain wave control or wheels.

25 [Figure: Rule 1, Rule 2, Rule 3, Rule 4; legend: don’t care, old one, new and old one] All four rules can be illustrated like that.

26 Thinning algorithm is sensitive to corrupted image segments. [Figure: the correct background shows the desired shape of the letter T; noise leads to lack of connectivity (bad).]


28  Supervised (inductive) learning is the simplest and most studied type of learning  How can an agent learn behaviors when it doesn’t have a teacher to tell it how to perform?  The agent has a task to perform  It takes some actions in the world  At some later point, it gets feedback telling it how well it did on performing the task  The agent performs the same task over and over again  This problem is called reinforcement learning:  The agent gets positive reinforcement for tasks done well  The agent gets negative reinforcement for tasks done poorly

29  The goal is to get the agent to act in the world so as to maximize its rewards  The agent has to figure out what it did that made it get the reward/punishment  This is known as the credit assignment problem  Reinforcement learning approaches can be used to train computers to do many tasks  backgammon and chess playing  job shop scheduling  controlling robot limbs

30 Given: a state space S; a set of actions a_1, …, a_k; a reward value at the end of each trial (may be positive or negative). Output: a mapping from states to actions. Example: ALVINN (driving agent). State: configuration of the car. Learn a steering action for each state.

31 A policy π is a complete mapping from states to actions. [Figure: grid world with columns 1 to 4, rows 1 to 3, and a +1 terminal state]

32  The agent knows what state it is in  The agent has a number of actions it can perform in each state.  Initially, it doesn't know the value of any of the states  If the outcome of performing an action at a state is deterministic, then the agent can update the utility value U() of states:  U(oldstate) = reward + U(newstate)  The agent learns the utility values of states as it works its way through the state space
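The deterministic update U(oldstate) = reward + U(newstate) can be tried on a tiny example. The three-state chain, its rewards and the names below are hypothetical; the sketch only shows how repeated trials carry the goal's value back toward the start.

```python
# Hypothetical deterministic chain A -> B -> C -> GOAL.
NEXT_STATE = {"A": "B", "B": "C", "C": "GOAL"}
# Reward received for the move out of each state (illustrative values).
STEP_REWARD = {"A": -1.0, "B": -1.0, "C": 10.0}

def run_trial(utility, start="A"):
    """Walk from start to GOAL, applying U(old) = reward + U(new) per step."""
    state = start
    while state != "GOAL":
        new_state = NEXT_STATE[state]
        utility[state] = STEP_REWARD[state] + utility[new_state]
        state = new_state
    return utility

# Initially the agent does not know the value of any state.
utility = {state: 0.0 for state in ["A", "B", "C", "GOAL"]}
for _ in range(3):
    run_trial(utility)
```

After the first trial only C has the right value; each further trial moves the correct values one state closer to the start, which is the sense in which the agent learns utilities as it works its way through the state space.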

33  Q-learning augments value iteration by maintaining an estimated utility value Q(s,a) for every action at every state  The utility of a state U(s), or Q(s), is simply the maximum Q value over all the possible actions at that state  Learns utilities of actions (not states)  model-free learning

34 Q-learning algorithm:

foreach state s
    foreach action a
        Q(s,a) = 0
s = currentstate
do forever
    a = select an action
    do action a
    r = reward from doing a
    t = resulting state from doing a
    Q(s,a) = (1 − α) Q(s,a) + α (r + γ Q(t))
    s = t

The learning coefficient, α, determines how quickly our estimates are updated. Normally, α is set to a small positive constant less than 1. γ is the discount factor applied to the utility Q(t) of the resulting state.
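The update loop on this slide can be sketched in runnable form. Everything about the environment is an assumption made for illustration: a five-state corridor with the goal at the right end, a +1 reward on reaching it and -0.04 per other step. The epsilon-greedy rule stands in for "select an action", which the slide leaves open, and episodes that end at the goal replace "do forever".

```python
import random

ACTIONS = ["LEFT", "RIGHT"]
GOAL = 4  # hypothetical corridor with states 0..4

def step(state, action):
    """Deterministic toy environment; returns (reward, next_state)."""
    next_state = max(0, state - 1) if action == "LEFT" else state + 1
    reward = 1.0 if next_state == GOAL else -0.04
    return reward, next_state

def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # foreach state s, foreach action a: Q(s,a) = 0
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # Select an action: usually the best known, sometimes random.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            r, t = step(s, a)
            # Q(t): the maximum Q value over the actions in state t.
            q_t = max(q[(t, act)] for act in ACTIONS)
            q[(s, a)] = (1 - alpha) * q[(s, a)] + alpha * (r + gamma * q_t)
            s = t
    return q

q = q_learning()
policy = {s: max(ACTIONS, key=lambda act: q[(s, act)]) for s in range(GOAL)}
```

The learned `policy` maps each non-goal state to its highest-valued action. No transition model is consulted anywhere, which is what makes the method model-free.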

35 [Grid world figure: START state and +1 terminal state] Actions: UP, DOWN, LEFT, RIGHT. Example for UP: 80% move UP, 10% move LEFT, 10% move RIGHT. Reward +1 at [4,3], -1 at [4,2]. Reward -0.04 for each step. What’s the strategy to achieve max reward? What if the actions were deterministic?

36 [Grid world figure: START state and +1 terminal state] Actions: UP, DOWN, LEFT, RIGHT. Example for UP: 80% move UP, 10% move LEFT, 10% move RIGHT. Reward +1 at [4,3], -1 at [4,2]. Reward -0.04 for each step. States, actions, rewards: what is the solution?

37 [Grid world figure] Only if actions are deterministic; not in this case (actions are stochastic). Solution/policy: a mapping from each state to an action.
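One way to compute such a policy for the stochastic grid world is value iteration, sketched below. The rewards, terminal cells and the 80/10/10 action noise come from the slides. The absence of obstacles, the discount factor of 1, the rule that bumping into the boundary leaves the agent in place, and all names are assumptions, since the figure is not part of the transcript.

```python
COLS, ROWS = 4, 3
TERMINALS = {(4, 3): 1.0, (4, 2): -1.0}  # [column, row] -> final reward
STEP_REWARD = -0.04
WALLS = set()  # assumption: no obstacles; the lost figure may have shown one
MOVES = {"UP": (0, 1), "DOWN": (0, -1), "LEFT": (-1, 0), "RIGHT": (1, 0)}
SIDEWAYS = {"UP": ("LEFT", "RIGHT"), "DOWN": ("LEFT", "RIGHT"),
            "LEFT": ("UP", "DOWN"), "RIGHT": ("UP", "DOWN")}
STATES = [(x, y) for x in range(1, COLS + 1) for y in range(1, ROWS + 1)
          if (x, y) not in WALLS]

def move(state, direction):
    """Resulting cell; leaving the grid (or hitting a wall) means staying put."""
    dx, dy = MOVES[direction]
    target = (state[0] + dx, state[1] + dy)
    return target if target in STATES else state

def outcomes(state, action):
    """80% intended direction, 10% each of the two perpendicular directions."""
    side_a, side_b = SIDEWAYS[action]
    return [(0.8, move(state, action)),
            (0.1, move(state, side_a)),
            (0.1, move(state, side_b))]

def expected_utility(state, action, utility):
    return sum(p * utility[s2] for p, s2 in outcomes(state, action))

def value_iteration(gamma=1.0, tolerance=1e-6, max_sweeps=1000):
    utility = {s: TERMINALS.get(s, 0.0) for s in STATES}
    for _ in range(max_sweeps):
        delta = 0.0
        for s in STATES:
            if s in TERMINALS:
                continue  # terminal utilities stay fixed at +1 and -1
            best = max(expected_utility(s, a, utility) for a in MOVES)
            new_value = STEP_REWARD + gamma * best
            delta = max(delta, abs(new_value - utility[s]))
            utility[s] = new_value  # in-place update within the sweep
        if delta < tolerance:
            break
    return utility

utility = value_iteration()
policy = {s: max(MOVES, key=lambda a: expected_utility(s, a, utility))
          for s in STATES if s not in TERMINALS}
```

Unlike a fixed action sequence, `policy` gives an action for every cell, so the agent still knows what to do after an unintended sideways move.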


