Using OpenRDK to learn walk parameters for the Humanoid Robot NAO. A. Cherubini, L. Iocchi, F. Giannone, M. Lombardo, G. Oriolo.

Presentation transcript:

Using OpenRDK to learn walk parameters for the Humanoid Robot NAO. A. Cherubini, L. Iocchi, F. Giannone, M. Lombardo, G. Oriolo

Overview: environment. Robotic agent: NAO, a humanoid robot produced by Aldebaran (with its SDK and simulator). Application: robotic soccer.

Overview: (sub)tasks. Process raw data from the environment (Vision Module); elaborate the raw data to obtain more reliable information (Modelling Module); decide the best behaviour to accomplish the agent's goal (Behaviour Control Module); actuate the robot motors accordingly (Motion Control Module).

Make Nao walk…how? Nao is equipped with a set of motion utilities, including a walk implementation that can be called through an interface (the NaoQi Motion Proxy) and partially customized by tuning some parameters. Main advantage: it is ready to use (though it needs to be tuned). Drawback: it is based on an unknown walk model, so there is no flexibility at all. For these reasons we decided to develop our own walk model and to tune it using machine learning techniques.

SPQR Walking Library development workflow: develop the walk model using Matlab (SPQR Walk Model); test the walk model on the Webots simulator; design and implement a C++ library for our RDK Soccer Agent (SPQR Walking Library); test our walking RDK agent on the Webots simulator and on the real NAO robot; finally, tune the walk parameters (on the Webots simulator and on NAO).

A simple Walking RAgent for Nao. Two scenarios must be possible: 1. an RAgent running on Webots; 2. an RAgent running on Nao. Show how much they have in common. Explain the advantages of using RDK (the possibility of developing and testing the walking library orthogonally to the development of the rest of the code, into which it can nevertheless be easily integrated).

A simple walking RAgent for Nao. The Motion Control Module uses the SPQR Walking Library and talks either to the real NAO (NaoQi) through the NaoQi Adaptor, or to Webots through a Webots client over a TCP channel. A Simple Behaviour Module switches between two states (walk and stand) via the shared memory (Smemy).

SPQR Walking Engine Model. Input: velocity commands (v, ω), where v is the linear velocity and ω is the angular velocity. NAO model characteristics: 21 degrees of freedom, no actuated trunk, no dynamic model available. We follow the "static walking pattern" approach, with an a-priori definition of the desired trajectories, defined by: choosing a set of output variables (3D coordinates of selected points of the robot), then choosing and parametrizing the desired trajectories for these variables at each phase of the gait.

SPQR velocity commands. The Behavior Control Module sends velocity commands (v, ω) to the Motion Control Module, which outputs the joints matrix. The command determines the step type: (0, 0) stand position; (v, 0) rectilinear walk swing, entered through an initial half step and left through a final half step; (v, ω) curvilinear walk swing; (0, ω) turn step.
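The mapping from a velocity command to a step type can be sketched as follows (a minimal illustration of the dispatch above; the function and state names are hypothetical, not the SPQR API):

```python
def step_type(v, omega, eps=1e-6):
    """Select the walk step type from a velocity command (v, omega)."""
    if abs(v) < eps and abs(omega) < eps:
        return "stand"        # (0, 0): stand position
    if abs(omega) < eps:
        return "rectilinear"  # (v, 0): rectilinear walk swing
    if abs(v) < eps:
        return "turn"         # (0, omega): turn step
    return "curvilinear"      # (v, omega): curvilinear walk swing
```

In the actual engine, transitions into and out of the rectilinear swing also pass through the initial and final half steps described above.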

SPQR walking subtasks and parameters. Biped walking alternates a double support phase and a swing phase; the parameter SS% sets their relative duration. The SPQR walk subtasks and their parameters are: foot trajectories in the xz plane (X_tot, X_sw0, X_ds, Z_st, Z_sw); center-of-mass trajectory in the lateral direction (Y_ft, Y_ss, Y_ds, K_r); hip yaw/pitch control for turning (H_yp); arm control (K_s).
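To illustrate what "parametrized trajectories" means here, one plausible foot-height profile over the swing phase is a half-sine that peaks at the swing height Z_sw (a hypothetical shape for illustration only; the actual SPQR trajectories are defined in the paper):

```python
import math

def swing_foot_height(t, z_sw):
    """Foot height during the swing phase, with normalized time t in [0, 1]:
    zero at lift-off and touch-down, peaking at z_sw at mid-swing."""
    return z_sw * math.sin(math.pi * t)
```

Each subtask parameter above plays a similar role: it is one knob shaping one such trajectory, which is exactly what makes the gait amenable to parameter learning.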

SPQR Walking Library Class Diagram

Walk tuning: main issues. Possible choices: by hand, or by using machine learning techniques. Machine learning seems the best solution: it requires less human interaction and explores the search space in a more systematic way. But take care of some aspects: you need to define an effective fitness function; you need to choose the right algorithm to explore the parameter space; and only a limited number of experiments can be done on a real robot.

SPQR Learning System Architecture. The Learner (built on the learning library) generates iteration experiments for the RAgent (which uses the walking library), running either on the real Nao or on Webots; the data needed to evaluate the fitness is sent back (GPS), and the resulting fitness drives the next iteration.

SPQR Learner. First iteration? If yes, return the initial iteration and its iteration information; if no, apply the chosen algorithm (strategy) and return the next iteration and its iteration information. Available strategies: policy gradient (e.g., PGPR), the Nelder-Mead simplex method, or a genetic algorithm.

Policy Gradient (PG) iteration. Given a point p in the parameter space ⊆ IR^K: generate n (n = mK) policies from p (for each component p_k of p, use p_k, p_k + ε, or p_k − ε); evaluate the policies; for each k ∈ {1, …, K}, compute the average fitnesses F_k+, F_k0, F_k−; for each k, if F_k0 > F_k+ and F_k0 > F_k− then Δ_k = 0, else Δ_k = F_k+ − F_k−; then Δ* = η · normalized(Δ) and p' = p + Δ*.
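One PG iteration in this Kohl-Stone style can be sketched as below. This is a minimal illustration under stated assumptions: the fitness callable, the parameter defaults, and the handling of under-sampled components are mine, not the SPQR implementation.

```python
import random

def pg_iteration(p, fitness, eps=0.05, eta=0.1, m=4):
    """One policy-gradient iteration: perturb, evaluate, step along the estimate.
    p: current parameter vector; fitness: callable scoring a vector;
    eps: perturbation size; eta: step size; n = m * len(p) policies."""
    K = len(p)
    avg = lambda v: sum(v) / len(v)
    # Generate n policies: each component becomes p_k, p_k + eps, or p_k - eps.
    policies = [[pk + random.choice((-eps, 0.0, eps)) for pk in p]
                for _ in range(m * K)]
    scores = [fitness(q) for q in policies]
    delta = []
    for k in range(K):
        plus = [s for q, s in zip(policies, scores) if q[k] > p[k]]
        zero = [s for q, s in zip(policies, scores) if q[k] == p[k]]
        minus = [s for q, s in zip(policies, scores) if q[k] < p[k]]
        if not plus or not minus:
            delta.append(0.0)  # too few samples to estimate this component
        elif zero and avg(zero) > avg(plus) and avg(zero) > avg(minus):
            delta.append(0.0)  # F_k0 beats both perturbations: leave p_k alone
        else:
            delta.append(avg(plus) - avg(minus))
    norm = sum(d * d for d in delta) ** 0.5
    if norm > 0:
        delta = [eta * d / norm for d in delta]  # normalized step of size eta
    return [pk + dk for pk, dk in zip(p, delta)]
```

Each call costs n fitness evaluations, which is why the number of policies per iteration matters so much when experiments run on a real robot.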

Enhancing PG: PGPR At each iteration i, the gradient estimate  (i) can be used to obtain a metric for measuring the relevance of the parameters. Given the relevance and a threshold T, PGPR prunes less relevant parameters in next iterations. forgetting factor

Curvilinear biped walking experiment. The robot moves along a curve with radius R for a time t. The fitness function rewards the path length covered and penalizes the radial error, i.e. the deviation of the robot's actual path from the desired circle of radius R.
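One plausible form of such a fitness, combining the two terms named above (a sketch only; the exact formula and weights used by SPQR are in the paper, and the path representation here is an assumption):

```python
import math

def curvilinear_fitness(path, R, w=1.0):
    """Fitness of a walked path (list of (x, y) points, circle centred at the
    origin): path length covered minus a weighted mean radial error."""
    length = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    radial_err = sum(abs(math.hypot(x, y) - R) for x, y in path) / len(path)
    return length - w * radial_err
```

A perfect run along the circle scores its full arc length; drifting off the circle trades length for penalty.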

Simulators in learning tasks. Advantages: you can test the gait model and the learning algorithm without being biased by noise. Limits: the results of the experiments on the simulator can be ported to the real robot, but solutions specialized for the simulated model may not be as effective on the real robot (e.g., the simulator does not take asymmetries into account, and its models are not very accurate).

Results (1). Five sessions of PG, 20 iterations each, all starting from the same initial configuration; SS%, K_s and Y_ft were set to hand-tuned values; 16 policies per iteration. The fitness increases in a regular way, with low variance among the five simulations.

Results (2). Five runs of PGPR; final parameter sets (Z_sw, X_s, K_r, X_sw0) for the five PG runs.

Bibliography. A. Cherubini, F. Giannone, L. Iocchi, M. Lombardo, G. Oriolo, "Policy Gradient Learning for a Humanoid Soccer Robot", accepted for the Journal of Robotics and Autonomous Systems. A. Cherubini, F. Giannone, L. Iocchi, and P. F. Palamara, "An extended policy gradient algorithm for robot task learning", Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems. A. Cherubini, F. Giannone, and L. Iocchi, "Layered learning for a soccer legged robot helped with a 3D simulator", Proc. of the 11th International RoboCup Symposium.

Any Questions?