Learning from Human
Boyuan Chen

Outline
- Definition and Motivation
- Learning from Demonstration Pipeline
- Paper Discussion
- Recent and Future Work

Definition
Learning from Demonstration (LfD) is a paradigm for enabling robots to autonomously perform new tasks by learning from a human teacher's demonstrations.
Also known as: Robot Programming by Demonstration (PbD), Imitation Learning, and Apprenticeship Learning.
Source: Scholarpedia, "Robot learning by demonstration"

Motivation
- Learn complicated tasks easily and quickly
- No programming experience needed

Current Research Areas
- Low-level skills: trajectories (e.g., pick and place); force control (e.g., tactile sensing)
- High-level skills: action combinations; speech-directed teaching
- Combined with other techniques: reinforcement learning; deep reinforcement learning; inverse optimal control
- User studies to assess: interfaces; effectiveness of algorithms
Source: Tutorial on Learning from Demonstration, ICRA 2016

Learning from Demonstration Pipeline
Pipeline stages:
1. Demonstration and Recording
2. Representation (Model) Generation
3. Policy Derivation and Learning
4. Planning Execution
Argall, Brenna D., et al. "A survey of robot learning from demonstration." Robotics and Autonomous Systems 57.5 (2009).

Demonstration and Recording
Human Gesture Recognition
Chao, Fei, et al. "A robot calligraphy system: From simple to complex writing by human gestures." Engineering Applications of Artificial Intelligence 59 (2017): 1-14.

Trajectory Awareness
Lee, Alex X., et al. "Learning from multiple demonstrations using trajectory-aware non-rigid registration with applications to deformable object manipulation." Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. IEEE, 2015.

Key Point Vision
Example sensor: Kinect
Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." IEEE-RAS International Conference on Humanoid Robots (Humanoids). IEEE, 2014.
Nair, Ashvin, et al. "Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation." 2017.

Keyframe Recording
Akgun, Baris, et al. "Keyframe-based learning from demonstration." International Journal of Social Robotics 4.4 (2012).
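Keyframe recording stores a sparse set of poses instead of a continuous trajectory, so some step has to turn those poses back into motion. The sketch below is one minimal way to do that, assuming each keyframe is a tuple of joint values and that plain linear interpolation is acceptable. The names `Pose`, `interpolate_keyframes`, and `steps_per_segment` are hypothetical, and this is not the method of the cited paper.

```python
from typing import List, Sequence

Pose = Sequence[float]  # hypothetical: one joint-space configuration


def interpolate_keyframes(keyframes: List[Pose], steps_per_segment: int) -> List[List[float]]:
    """Linearly interpolate between consecutive keyframes.

    Returns a dense trajectory that passes through every keyframe.
    """
    if len(keyframes) < 2:
        raise ValueError("need at least two keyframes")
    if steps_per_segment < 1:
        raise ValueError("steps_per_segment must be >= 1")

    trajectory = [list(keyframes[0])]
    for start, end in zip(keyframes, keyframes[1:]):
        for k in range(1, steps_per_segment + 1):
            t = k / steps_per_segment  # goes from just above 0 up to exactly 1
            trajectory.append([(1 - t) * a + t * b for a, b in zip(start, end)])
    return trajectory


# Three demonstrated keyframes for a made-up two-joint arm.
demo = [(0.0, 0.0), (1.0, 0.5), (1.0, 2.0)]
dense = interpolate_keyframes(demo, steps_per_segment=4)
```

With three keyframes and four steps per segment, `dense` holds nine poses. A real system would more likely fit splines and respect velocity limits; linear interpolation is used here only to keep the idea visible.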

And many other choices:
- Speech interaction
- Teleoperation control
- Tactile sensors
- ...

Representation (Model) Generation
Need to consider:
- Dimension
- System type
- Assumptions
- ...

Dynamical Systems
Basic concept:
- A mathematical formalization of a "fixed rule" that describes the time dependence of a point's position in its ambient space
- Usually written as ordinary differential equations
Types:
- Non-autonomous DS: time dependent
- Autonomous DS: time independent
Source: Georgia Institute of Technology, CSE6740 Machine Learning
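The difference between the two types can be made concrete with a small numerical rollout. This is a sketch under stated assumptions: a one-dimensional state, a linear attractor with gain 2.0, and explicit Euler integration. `GOAL`, `euler_rollout`, `autonomous`, and `non_autonomous` are illustrative names, not taken from the slides.

```python
from typing import Callable, List

GOAL = 1.0  # hypothetical attractor position


def euler_rollout(f: Callable[[float, float], float], x0: float, dt: float, n_steps: int) -> List[float]:
    """Integrate dx/dt = f(x, t) with the explicit Euler method."""
    xs = [x0]
    x, t = x0, 0.0
    for _ in range(n_steps):
        x = x + dt * f(x, t)
        t += dt
        xs.append(x)
    return xs


def autonomous(x: float, t: float) -> float:
    # Time independent: the velocity depends on the state only, t is ignored.
    return -2.0 * (x - GOAL)


def non_autonomous(x: float, t: float) -> float:
    # Time dependent: the target ramps from 0 to GOAL during the first second.
    target = min(t, 1.0) * GOAL
    return -2.0 * (x - target)


auto = euler_rollout(autonomous, x0=0.0, dt=0.01, n_steps=500)
moving = euler_rollout(non_autonomous, x0=0.0, dt=0.01, n_steps=500)
```

Both rollouts end near `GOAL`, but only the autonomous system produces the same motion regardless of when it is started. That property is one reason time-independent formulations are attractive when a robot may be perturbed in the middle of a motion.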

Machine Learning Approach (or: Probabilistic Approach)
Algorithms for modelling:
- Kalman filters
- Hidden Markov models
- Input-output models
- Non-parametric models
- Generative perceptual models
- Gaussian mixture models
Note: a non-parametric model still has parameters; their number is not fixed in advance and grows with the data.
Source: Georgia Institute of Technology, CSE6740 Machine Learning

What about dimension? Low dimension:
- Two points: (x_start, y_start), (x_goal, y_goal)
- Pose at a fixed point
Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." IEEE-RAS International Conference on Humanoid Robots (Humanoids). IEEE, 2014.

What about dimension? High dimension:
- Multi-model demonstration
Yin, Hang, et al. "Associate Latent Encodings in Learning from Demonstrations." Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17). No. EPFL-CONF

Policy Derivation and Learning
Three main methods:
- Mapping function: classification (discrete output); regression (continuous output)
- System model: reinforcement learning with a reward function
- Plans: state machine
Argall, Brenna D., et al. "A survey of robot learning from demonstration." Robotics and Autonomous Systems 57.5 (2009).
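The mapping-function approach learns a direct map from observed states to actions. A minimal sketch, assuming demonstrations are stored as (state, action) pairs and using nearest-neighbour lookup as the learner: a 1-NN classifier for discrete actions and a k-NN average for continuous ones. All function names and the demonstration data are made up for illustration.

```python
import math
from typing import Any, List, Sequence, Tuple

State = Sequence[float]


def _nearest(demos: List[Tuple[State, Any]], state: State, k: int) -> List[Tuple[State, Any]]:
    """Return the k demonstration pairs whose states are closest to `state`."""
    return sorted(demos, key=lambda pair: math.dist(pair[0], state))[:k]


def classify_action(demos: List[Tuple[State, str]], state: State) -> str:
    """Discrete action: copy the label of the closest demonstrated state (1-NN)."""
    return _nearest(demos, state, 1)[0][1]


def regress_action(demos: List[Tuple[State, float]], state: State, k: int = 2) -> float:
    """Continuous action: average the actions of the k closest demonstrated states."""
    neighbours = _nearest(demos, state, k)
    return sum(action for _, action in neighbours) / len(neighbours)


# Hypothetical demonstrations: 2-D states with symbolic actions ...
discrete_demos = [((0.0, 0.0), "reach"), ((1.0, 0.0), "grasp"), ((1.0, 1.0), "lift")]
# ... and 1-D states with a scalar velocity command.
continuous_demos = [((0.0,), 0.0), ((1.0,), 2.0), ((2.0,), 4.0)]

label = classify_action(discrete_demos, (0.9, 0.1))
command = regress_action(continuous_demos, (0.4,), k=2)
```

Here `label` is `"grasp"` and `command` is `1.0`. Nearest-neighbour lookup is only a stand-in; any classifier or regressor could fill the same role.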

Policy Derivation and Learning
- Always comes with a policy learning process
- Generalization
Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." IEEE-RAS International Conference on Humanoid Robots (Humanoids). IEEE, 2014.

Learning from Demonstration Pipeline
Demonstration and Recording
- Human gesture recognition
- Trajectory awareness
- Key point vision
- Keyframe recording
- Many other interaction methods
Representation (Model) Generation
- Dynamical system
- Machine learning approach
- Dimensions
Policy Derivation and Learning
- Mapping function
- System model
- Plans
- With policy learning process
- Generalization
Planning Execution

Learning from Demonstration Demo!
References:
Tutorial on: Learning from Demonstration, Aude Billard, Klas Kronander, Jose Ramon Medina, ICRA, 2016
P.I. Corke, Robotics, Vision & Control, Springer 2011, ISBN
Khansari Zadeh, S. M. and Billard, A., Learning Stable Non-Linear Dynamical Systems with Gaussian Mixture Models. IEEE Transactions on Robotics, vol. 27, num 5, p
Kronander, K., Khansari Zadeh, S. M. and Billard, A. (2015) Incremental Motion Learning with Locally Modulated Dynamical Systems. Robotics and Autonomous Systems, 2015, vol. 70, iss. 1, pp
Carl Edward Rasmussen, Gaussian Processes for Machine Learning, The MIT Press, ISBN X
Medina, J. R., Lorenz, T., and Hirche, S. (2015). Synthesizing Anticipatory Haptic Assistance Considering Human Behavior Uncertainty. Robotics, IEEE Transactions on, 31(1),

Paper Discussion: Demonstration and Recording
Two cameras:
- One measures the amount in the receiver
- One measures the pose of the source
Pros: less error
Cons: simple model; needs a transparent receiver
Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." IEEE-RAS International Conference on Humanoid Robots (Humanoids). IEEE, 2014.

Paper Discussion: Representation (Model) Generation
Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." IEEE-RAS International Conference on Humanoid Robots (Humanoids). IEEE, 2014.

Paper Discussion: Policy Derivation and Learning
Score function: 1.0, 0.5, 0.5
Skill selection (discrete optimization):
- Pouring
- Shaking A
- Shaking B
Inside-skill adaptation (continuous optimization):
- Shaking B: ɸ = 0 ~ π/2
- CMA-ES
Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." IEEE-RAS International Conference on Humanoid Robots (Humanoids). IEEE, 2014.
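The two optimization levels on this slide can be sketched as a discrete choice followed by a bounded continuous search. Everything below is an assumption made for illustration: the three scores are read as per-skill scores in the listed order, the reward function is a toy with its peak at π/4, and a simple (1+1) evolution strategy stands in for CMA-ES, which is considerably more elaborate. None of the names come from the paper.

```python
import math
import random
from typing import Callable, Dict, Tuple


def select_skill(scores: Dict[str, float]) -> str:
    """Discrete step: choose the skill with the highest current score."""
    return max(scores, key=scores.get)


def adapt_parameter(reward: Callable[[float], float], lo: float, hi: float, x0: float,
                    sigma: float = 0.3, iters: int = 300, seed: int = 0) -> Tuple[float, float]:
    """Continuous step: (1+1) evolution strategy on one bounded parameter.

    A deliberately simple stand-in for CMA-ES, not an implementation of it.
    """
    rng = random.Random(seed)
    best_x, best_r = x0, reward(x0)
    for _ in range(iters):
        # Sample around the current best and clip to the allowed range.
        cand = min(hi, max(lo, best_x + rng.gauss(0.0, sigma)))
        r = reward(cand)
        if r > best_r:
            best_x, best_r = cand, r
            sigma *= 1.5  # success: widen the search
        else:
            sigma *= 0.9  # failure: narrow the search
    return best_x, best_r


def toy_reward(phi: float) -> float:
    # Hypothetical reward, highest at phi = pi/4. Not from the paper.
    return -(phi - math.pi / 4) ** 2


# Assumed reading of the slide: one score per skill, in the listed order.
scores = {"pouring": 1.0, "shaking_a": 0.5, "shaking_b": 0.5}
chosen = select_skill(scores)
phi, best = adapt_parameter(toy_reward, lo=0.0, hi=math.pi / 2, x0=0.0)
```

On a real robot the reward would come from executing the skill and measuring the outcome, so each evaluation is expensive. That cost is the usual motivation for a sample-efficient optimizer such as CMA-ES.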

Paper Discussion: Planning Execution
Generalization

Paper Discussion: Paper Summary
Pros:
- Complete strategies for the entire pouring problem
- Complete optimization architecture
- Clean and straightforward system that can inspire other researchers
- Few inputs, fast "learning"
Cons:
- Low-dimensional measurement system
- The camera is the only sensor
- Not intelligent enough: too many details of the strategy are modelled manually
- Not very good at action generalization

Paper Discussion: Paper Summary
Future work:
- Multiple sensors: tactile sensor feedback
- Integrate into a more complete framework: pick and place of the container; cooking
- Add more skills to the skill set
Rozo, Leonel, Pablo Jiménez, and Carme Torras. "Force-based robot learning of pouring skills using parametric hidden Markov models." Workshop on Robot Motion and Control (RoMoCo). IEEE, 2013.
Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." IEEE-RAS International Conference on Humanoid Robots (Humanoids). IEEE, 2014.

Recent and Future Work
- Multi-sensor and multi-model
- High dimensions
- Unsupervised learning
- Data-driven approaches: deep learning; deep reinforcement learning
Yin, Hang, et al. "Associate Latent Encodings in Learning from Demonstrations." Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17). No. EPFL-CONF
Yamaguchi, Akihiko, and Christopher G. Atkeson. "Neural networks and differential dynamic programming for reinforcement learning problems." Robotics and Automation (ICRA), 2016 IEEE International Conference on. IEEE, 2016.
Agrawal, Pulkit, et al. "Learning to poke by poking: Experiential learning of intuitive physics." arXiv preprint (2016).
Nair, Ashvin, et al. "Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation." 2017.

Thank you for listening!
Questions?
