
1 Robotica Lezione 14

2 Lecture Outline
- Neural networks
- Classical conditioning
- AHC with NNs
- Genetic Algorithms
- Classifier Systems
- Fuzzy learning
- Case-based learning
- Memory-based learning
- Explanation-based learning

3 Q Learning Algorithm
- Q(x,a) ← Q(x,a) + b·(r + γ·E(y) − Q(x,a))
- x is the state, a is the action
- b is the learning rate
- r is the reward
- γ is the discount factor, in (0,1)
- E(y) is the utility of the successor state y, computed as E(y) = max over all actions a' of Q(y,a')
- Guaranteed to converge to the optimal policy, given infinite trials
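A minimal sketch of this update rule in Python; the table sizes, learning rate, and discount factor are illustrative assumptions, not values from the slides.

```python
import numpy as np

# Hypothetical sizes; on a real robot these come from the task's state/action encoding.
NUM_STATES, NUM_ACTIONS = 100, 4
Q = np.zeros((NUM_STATES, NUM_ACTIONS))

def q_update(x, a, r, y, b=0.1, gamma=0.9):
    """One Q-learning step: Q(x,a) <- Q(x,a) + b*(r + gamma*E(y) - Q(x,a)),
    where E(y) = max over actions a' of Q(y, a')."""
    E_y = np.max(Q[y])                       # utility of the successor state y
    Q[x, a] += b * (r + gamma * E_y - Q[x, a])
```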

4 Supervised Learning
- Supervised learning requires the user to give the exact solution to the robot, in the form of the error direction and magnitude.
- The user must know the exact desired behavior for each situation.
- Supervised learning involves training, which can be very slow; the user must supervise the system with numerous examples.

5 Neural Networks
- NNs are the most widely used example of supervised learning
- There are numerous types of NNs (feed-forward, recurrent) and NN learning algorithms
- Hebbian learning: increase synaptic strength along pathways associated with a stimulus and the correct response
- Perceptron learning: delta rule or back-propagation

6 Perceptron Learning
- Algorithm: see the sketch below
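The slide's algorithm was shown as a figure; as a substitute, here is a minimal sketch of perceptron learning with the delta rule, trained on a hypothetical toy data set (the AND function).

```python
import numpy as np

def train_perceptron(X, t, lr=0.1, epochs=50):
    """Perceptron learning: nudge the weights in the direction that reduces
    the error between the target t and the thresholded output y."""
    w = np.zeros(X.shape[1] + 1)                   # weights plus a bias term
    for _ in range(epochs):
        for x, target in zip(X, t):
            y = 1 if w[0] + w[1:] @ x > 0 else 0   # unit-step activation
            w[0]  += lr * (target - y)             # delta rule: w <- w + lr*(t - y)*x
            w[1:] += lr * (target - y) * x
    return w

# Hypothetical example: learning the AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 0, 0, 1])
w = train_perceptron(X, t)
```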

7 NNs are RL
- In all NNs, the goal is to minimize the error between the network output and the desired output
- This is achieved by adjusting the weights on the network connections
- Note: NNs are a form of reinforcement learning
- NNs perform supervised RL with immediate error feedback

8 Classical Conditioning
- Classical conditioning comes from psychology (Pavlov 1927)
- Assumes that unconditioned stimuli (e.g., food) cause unconditioned responses (e.g., salivation); US => UR
- A conditioned stimulus is, over time, associated with an unconditioned response (CS => UR)
- E.g., CS (bell ringing) => UR (salivation)

9 Classical Conditioning
- Instead of encoding SR rules by hand, conditioning can be used to form the associations automatically
- Can be encoded in NNs
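A minimal sketch of how such an association could be formed with a Hebbian-style weight update; the stimulus encoding, threshold, and learning rate are assumptions for illustration, not from the slides.

```python
import numpy as np

# Inputs: [US (food), CS (bell)]; output: response (salivation).
w = np.array([1.0, 0.0])            # initially only the US triggers the response
lr = 0.2

def trial(stimuli):
    """Present stimuli; a Hebbian rule strengthens weights on pathways where
    the input and the response are active at the same time."""
    global w
    response = float(w @ stimuli > 0.5)
    w += lr * response * stimuli     # strengthen co-active connections
    return response

for _ in range(10):
    trial(np.array([1.0, 1.0]))      # pair US and CS repeatedly
print(trial(np.array([0.0, 1.0])))   # after conditioning, the CS alone elicits the response
```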

10 Connectionist AHC (Adaptive Heuristic Critic)
- Robuter learned a set of gain multipliers for exploration (Gachet et al.)

11 Associative Learning
- Learning new behaviors by associating sensors and actions into rules
- E.g., 6-legged walking (Edinburgh U.)
- Whisker sensors first, IR and light later
- 3 actions: left, right, ahead
- User provided feedback (shaping)
- Learned avoidance, pushing, wall following, light seeking
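A minimal sketch of this kind of sensor-to-action association table shaped by user feedback; the sensor discretization, exploration rate, and reward signs are assumptions for illustration.

```python
import random
from collections import defaultdict

ACTIONS = ["left", "right", "ahead"]
# strength[sensor_pattern][action]: how strongly this situation is associated with each action
strength = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

def choose_action(sensor_pattern):
    """Pick the most strongly associated action, exploring occasionally."""
    if random.random() < 0.1:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: strength[sensor_pattern][a])

def user_feedback(sensor_pattern, action, reward):
    """Shaping: the user rewards (+1) or punishes (-1) the chosen action."""
    strength[sensor_pattern][action] += reward
```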

12 Two-Layer Perceptron Architecture

13 More NN Examples
- Some domains and tasks lend themselves very well to supervised NN learning
- The best example is robot motion planning for articulation/manipulation
- The correct answer for any given situation is well known, so it can be used as a training target
- E.g., NNs are widely used for learning inverse kinematics
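A minimal sketch of learning inverse kinematics with a small feed-forward network; the 2-link planar arm, joint ranges, network size, and use of scikit-learn's MLPRegressor are assumptions for illustration, not from the slides.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

L1, L2 = 1.0, 0.8                       # hypothetical link lengths of a 2-link planar arm

def forward_kinematics(q):
    """End-effector (x, y) from joint angles (q1, q2)."""
    x = L1 * np.cos(q[:, 0]) + L2 * np.cos(q[:, 0] + q[:, 1])
    y = L1 * np.sin(q[:, 0]) + L2 * np.sin(q[:, 0] + q[:, 1])
    return np.stack([x, y], axis=1)

# Generate training pairs: the "exact answer" (joint angles) is known for each position.
q = np.random.uniform([0, 0], [np.pi / 2, np.pi / 2], size=(5000, 2))
xy = forward_kinematics(q)

# Supervised training: map a desired end-effector position to joint angles.
ik_net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000).fit(xy, q)
q_pred = ik_net.predict([[1.2, 0.9]])   # query joint angles for a target position
```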

14 Evolutionary Methods
- Genetic/evolutionary approaches are based on the evolutionary search metaphor
- The states/situations and actions/behaviors are represented as "genes"
- Different combinations are tried by various "individuals" in "populations"
- Individuals with the highest "fitness" perform the best and are kept as survivors

15 Evolutionary Methods
- The others are discarded; this is the selection process.
- The survivors' "genes" are mutated and crossed over, and new individuals are thus formed, which are then tested and scored.
- In effect, the evolutionary process is searching through the space of solutions to find the one with the highest fitness.

16 Summary of Evolution
- Evolutionary methods solve search and optimization problems using
  - a fitness function
  - operators
- They represent the solution as a genetic encoding (string)
- They select the 'best' individuals for reproduction and apply
  - crossover and mutation
- They operate on populations

17 Genetic Algorithms
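The slide's figure is not in the transcript; as a substitute, here is a minimal genetic algorithm sketch showing the pieces listed above (a genetic encoding, a fitness function, selection, crossover, and mutation). The bit-string encoding and toy fitness function are illustrative assumptions.

```python
import random

GENES, POP_SIZE, GENERATIONS = 16, 20, 50

def fitness(individual):
    """Toy fitness: count of 1-bits; in robotics this would score a controller's behavior."""
    return sum(individual)

def crossover(a, b):
    point = random.randrange(1, GENES)
    return a[:point] + b[point:]

def mutate(individual, rate=0.05):
    return [1 - g if random.random() < rate else g for g in individual]

population = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    # Selection: keep the fittest half as survivors, discard the rest.
    survivors = sorted(population, key=fitness, reverse=True)[:POP_SIZE // 2]
    # Reproduction: refill the population with mutated offspring of the survivors.
    offspring = [mutate(crossover(*random.sample(survivors, 2)))
                 for _ in range(POP_SIZE - len(survivors))]
    population = survivors + offspring

best = max(population, key=fitness)
```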

18 Levels of Application
- Genetic methods can be applied at different levels:
  1) for tuning parameters (such as gains in a control system)
  2) for developing controllers (policies) for individual robots
  3) for developing group strategies for multi-robot systems (by testing groups as populations)

19 GAs v. GPs
- When applied to strings of genes, the approaches are classified as genetic algorithms (GA)
- When applied to pieces of executable programs, the approaches are classified as genetic programming (GP)
- GP operates at a higher level of abstraction than GA

20 Classifier Systems
- Use GAs to learn rule sets
- ALECSYS - AutonoMouse
- Learn behaviors and coordination

21 Evolving Structure & Behavior
- Evolving morphology & behavior (Sims 1994) using a physics-based model
- Fitness evaluated in simulation through competition (e.g., walking and swimming)

22 Co-evolving Brains and Bodies
- Extension to the physical world (Pollack)

23 Fuzzy Logic
- Fuzzy control produces actions using a set of fuzzy rules based on fuzzy logic
- This involves:
  - fuzzifying: mapping sensor readings into a set of fuzzy inputs
  - fuzzy rule base: a set of IF-THEN rules
  - fuzzy inference: maps fuzzy sets onto other fuzzy sets using membership functions
  - defuzzifying: mapping a set of fuzzy outputs onto a set of crisp output commands
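A minimal sketch of these four steps for a hypothetical one-input, one-output obstacle-avoidance rule base; the membership function shapes, rule choices, and units are assumptions for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_controller(distance):
    # Fuzzify: map the crisp sensor reading to degrees of membership.
    near = tri(distance, 0.0, 0.0, 1.0)
    far  = tri(distance, 0.5, 2.0, 2.0)
    # Rule base + inference: IF near THEN turn hard; IF far THEN go straight.
    # Each rule's output strength is the membership degree of its antecedent.
    rules = [(near, 1.0),   # turn rate 1.0 rad/s when near
             (far,  0.0)]   # turn rate 0.0 rad/s when far
    # Defuzzify: weighted average of the rule outputs -> crisp command.
    num = sum(w * out for w, out in rules)
    den = sum(w for w, out in rules) or 1.0
    return num / den

turn_rate = fuzzy_controller(0.4)   # crisp turn command for a 0.4 m obstacle reading
```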

24 Fuzzy Control
- Fuzzy logic allows for specifying behaviors as fuzzy rules
- Such behaviors can be smoothly blended together (e.g., Flakey, DAMN)
- Fuzzy rules can be learned

25 Memory-Based Learning
- Memory-based learning (MBL) involves storing the exact parameters of each situation/action.
- This type of learning is important for systems with a great number of parameters, where blind search through parameter space would take prohibitively long.
- MBL takes a lot of space for data storage.

26 Memory-Based Learning
- This is a trade-off against computation time; MBL assumes it is better to use space to remember than time to compute.
- Content-addressable memory is used to approximate the output for a new input
- This is a form of interpolation
- The approach uses regression (the basic intuition is weighted averaging)
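A minimal sketch of this remember-then-interpolate idea as distance-weighted averaging over stored examples; the Gaussian kernel, its bandwidth, and the data layout are assumptions for illustration.

```python
import numpy as np

class MemoryBasedLearner:
    """Store every (input, output) pair; answer queries by averaging stored
    outputs, weighted by how close each stored input is to the query."""
    def __init__(self, bandwidth=0.5):
        self.X, self.Y = [], []
        self.bandwidth = bandwidth

    def remember(self, x, y):
        self.X.append(np.asarray(x, dtype=float))
        self.Y.append(np.asarray(y, dtype=float))

    def predict(self, x):
        X, Y = np.array(self.X), np.array(self.Y)
        d2 = np.sum((X - np.asarray(x, dtype=float)) ** 2, axis=1)
        w = np.exp(-d2 / (2 * self.bandwidth ** 2))    # nearby memories count more
        return (w[:, None] * Y).sum(axis=0) / w.sum()  # interpolated output

mbl = MemoryBasedLearner()
mbl.remember([0.0, 0.0], [0.0])
mbl.remember([1.0, 1.0], [1.0])
print(mbl.predict([0.5, 0.5]))   # interpolates between the two stored cases
```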

27 Symbolic Learning
- Learning new rules given a set of predefined rules
  - by inference
  - by other logic techniques
- Example: induction on decision trees
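A minimal sketch of decision-tree induction on a hypothetical sensor/action data set, using scikit-learn's DecisionTreeClassifier; the features and labels are illustrative assumptions.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical training examples: [obstacle_left, obstacle_right] -> action
X = [[1, 0], [0, 1], [0, 0], [1, 1]]
y = ["turn_right", "turn_left", "go_ahead", "back_up"]

tree = DecisionTreeClassifier().fit(X, y)
print(export_text(tree, feature_names=["obstacle_left", "obstacle_right"]))
print(tree.predict([[1, 0]]))   # induced rule: obstacle on the left -> turn right
```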

28 Example: Robo-SOAR
- Logical reasoning system
- Production system based on implications or rules
- Chunking
- Conflict resolution mechanism when different rules apply
- Continuous conversion of deliberative information into efficient representations

29 Multistrategy Learning
- Learning methods can be combined
- Combining NNs and RL is common
  - use NNs to generalize states
  - then apply RL in the reduced state space
- Combining NNs and MBL
  - use NNs to interpolate between instances
  - examples: ping pong, juggling, pole balancing

30 Multistrategy Learning
- Combining NNs and genetic algorithms is also useful
  - example: learning to walk (next)
- Neuro-fuzzy is a popular approach
  - example: USC AFV

31 Example: 6-legged walking
- Each leg modeled as an oscillator
- Phasing of the legs controlled by a NN
- Weights of the NN learned using a GA
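A minimal sketch of the leg-as-oscillator idea; the sinusoidal oscillator form and the fixed alternating-tripod phase pattern are assumptions for illustration (in the setup described above, the phase relationships would come from a NN whose weights are found by a GA, as in the GA sketch earlier).

```python
import numpy as np

NUM_LEGS = 6

def leg_positions(t, phase_offsets, freq=1.0):
    """Each leg is an oscillator; the relative phases determine the gait."""
    return np.sin(2 * np.pi * freq * t + phase_offsets)

# Hypothetical alternating-tripod gait: alternate legs half a cycle apart.
tripod_phases = np.array([0, np.pi, 0, np.pi, 0, np.pi])
print(leg_positions(0.25, tripod_phases))
```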

32 Example: AVATAR
- USC Robotics Labs

33 State of the Art
- There are many learning techniques
- Which to use depends on the application
- Robotics is a particularly challenging domain for learning
  - very few successful examples
  - active area of research

34 Implemented Systems
- Supervised learning has been used to learn controllers for complex, many-DOF arms, as well as for navigating specific paths and performing articulated skills such as juggling and pole-balancing.
- Memory-based learning has been used to learn controllers for manipulator tasks very similar to those above.

35 Implemented Systems
- Reinforcement learning has been used to learn controllers for individual navigating robots, groups of foraging robots, and map-learning and maze-learning robots.
- Evolutionary learning has been used to learn controllers for individual navigating robots, maze-solving robots, herding robots, and foraging robots.

