Learning from Human
Boyuan Chen
Outline: Definition and Motivation; Learning from Demonstration Pipeline; Paper Discussion; Recent and Future Work
Definition: Learning from Demonstration (LfD) is a paradigm for enabling robots to autonomously perform new tasks by learning from a human teacher’s demonstrations. Also known as: Robot Programming by Demonstration (PbD), Imitation Learning, and Apprenticeship Learning. Scholarpedia: Robot learning by demonstration http://www.scholarpedia.org/article/Robot_learning_by_demonstration
Motivation: Learn complicated tasks easily and quickly. Programming experience? No need.
Current Research Area. Low-Level Skills: Trajectories (e.g. Pick and Place); Force Control (e.g. Tactile Sensing). High-Level Skills: Action Combinations; Speech-directed teaching. Combined with other techniques: Reinforcement Learning; Deep Reinforcement Learning; Inverse Optimal Control. User studies to assess: Interfaces; Effectiveness of algorithms. Tutorial on: Learning from Demonstration, ICRA 2016
Learning from Demonstration Pipeline. Stages, in order: Demonstration and Recording; Representation (Model) Generation; Policy Derivation and Learning; Planning Execution. Argall, Brenna D., et al. "A survey of robot learning from demonstration." Robotics and autonomous systems 57.5 (2009): 469-483.
Demonstration and Recording: Human Gesture Recognition. Chao, Fei, et al. "A robot calligraphy system: From simple to complex writing by human gestures." Engineering Applications of Artificial Intelligence 59 (2017): 1-14.
Demonstration and Recording: Trajectory Awareness. Lee, Alex X., et al. "Learning from multiple demonstrations using trajectory-aware non-rigid registration with applications to deformable object manipulation." Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. IEEE, 2015.
Demonstration and Recording: Key Point Vision; Kinect. Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on. IEEE, 2014. Nair, Ashvin, et al. "Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation." 2017.
Demonstration and Recording: Keyframe Recording. Akgun, Baris, et al. "Keyframe-based learning from demonstration." International Journal of Social Robotics 4.4 (2012): 343-355.
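A minimal sketch of how a handful of recorded keyframes might be turned into a dense trajectory for playback. It assumes plain linear interpolation between consecutive keyframes, which may differ from the scheme used by Akgun et al.; the function name `interpolate_keyframes`, the 2-D poses, and `steps_per_segment` are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence

Pose = Sequence[float]


def interpolate_keyframes(keyframes: List[Pose], steps_per_segment: int) -> List[List[float]]:
    """Linearly interpolate between consecutive keyframes (hypothetical helper)."""
    if len(keyframes) < 2:
        raise ValueError("need at least two keyframes")
    if steps_per_segment < 1:
        raise ValueError("steps_per_segment must be positive")
    path: List[List[float]] = []
    for start, end in zip(keyframes, keyframes[1:]):
        for i in range(steps_per_segment):
            t = i / steps_per_segment  # fraction of the way along this segment
            path.append([s + t * (e - s) for s, e in zip(start, end)])
    path.append(list(keyframes[-1]))  # finish exactly on the last keyframe
    return path


# Three demonstrated keyframes (made-up values) expanded into a dense path.
keyframes = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
trajectory = interpolate_keyframes(keyframes, steps_per_segment=4)
```

The teacher only has to provide the sparse poses; everything in between is filled in by the reproduction step, which is where a smoother interpolant could be swapped in.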
Demonstration and Recording: and many other choices … Speech Interaction; Teleoperation Control; Tactile Sensors; ...
Representation (Model) Generation Need to consider: Dimension System Type Assumption ...
Representation (Model) Generation: Dynamical Systems. Basic Concept: a mathematical formalization in which a “fixed rule” describes the time dependence of a point’s position in its ambient space, typically written as Ordinary Differential Equations. Non-autonomous DS: time dependent (the rule depends explicitly on time). Autonomous DS: time independent (the rule depends only on the current state). Dynamical Systems: https://en.wikipedia.org/wiki/Space Georgia Institute of Technology: CSE6740 Machine Learning: http://www.cc.gatech.edu/~lsong/teaching/CSE6740fall14/BBoots.pdf
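To make the autonomous versus non-autonomous distinction concrete, here is a small sketch that integrates two one-dimensional systems with forward Euler steps. The linear attractor, the ramped reference, and all parameter values (`goal`, `gain`, `duration`, `dt`) are assumptions chosen for illustration and do not come from the cited material.

```python
def autonomous_velocity(x: float, t: float, goal: float = 1.0, gain: float = 2.0) -> float:
    """Autonomous DS: dx/dt = f(x). The time argument is accepted but ignored."""
    return -gain * (x - goal)


def non_autonomous_velocity(x: float, t: float, goal: float = 1.0,
                            gain: float = 2.0, duration: float = 2.0) -> float:
    """Non-autonomous DS: dx/dt = f(x, t). Tracks a reference that moves with time."""
    reference = goal * min(t / duration, 1.0)  # ramps from 0 to goal over `duration`
    return -gain * (x - reference)


def simulate(velocity, x0: float = 0.0, dt: float = 0.01, steps: int = 500) -> float:
    """Forward-Euler integration of dx/dt = velocity(x, t); returns the final state."""
    x, t = x0, 0.0
    for _ in range(steps):
        x += dt * velocity(x, t)
        t += dt
    return x


x_auto = simulate(autonomous_velocity)      # driven by the state alone
x_time = simulate(non_autonomous_velocity)  # driven by state and clock
```

Both systems end near the goal here, but only the autonomous one returns the same velocity for a given state regardless of when that state is visited, which is why it needs no clock to recover from a perturbation during execution.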
Representation (Model) Generation: Machine Learning Approach, or Probabilistic Approach. Algorithms for Modelling: Kalman Filters; Hidden Markov Model; Input-Output Model; Non-parametric Model; Generative Perceptual Model; Gaussian Mixture Model. Note: a non-parametric model still has parameters; their number is not fixed in advance and can grow with the data. Georgia Institute of Technology: CSE6740 Machine Learning: http://www.cc.gatech.edu/~lsong/teaching/CSE6740fall14/BBoots.pdf
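As one concrete instance of the probabilistic models listed above, the following sketch runs a scalar Kalman filter over a noisy recording of a constant quantity. It assumes a random-walk state model and hand-picked noise variances; the function name, the default values, and the synthetic measurements are all illustrative assumptions.

```python
from typing import List, Sequence


def kalman_filter_1d(measurements: Sequence[float], process_var: float = 1e-3,
                     meas_var: float = 0.05, x0: float = 0.0, p0: float = 1.0) -> List[float]:
    """Scalar Kalman filter with a random-walk state model (illustrative sketch)."""
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        p = p + process_var        # predict: state unchanged, uncertainty grows
        k = p / (p + meas_var)     # Kalman gain
        x = x + k * (z - x)        # correct the estimate with the measurement residual
        p = (1.0 - k) * p          # updated uncertainty
        estimates.append(x)
    return estimates


# Synthetic recording: true value 1.0 with an alternating +/-0.2 disturbance.
measurements = [1.0 + 0.2 * (-1) ** i for i in range(50)]
estimates = kalman_filter_1d(measurements)
```

The gain `k` starts high, so the filter trusts the first measurements, then settles to a small value that averages out the disturbance.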
Representation (Model) Generation: What about dimension? Low Dimension: two points, (x_start, y_start) and (x_goal, y_goal); pose at a fixed point. Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on. IEEE, 2014.
Representation (Model) Generation: What about dimension? High Dimension: multi-modal demonstrations. Yin, Hang, et al. "Associate Latent Encodings in Learning from Demonstrations." Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17). No. EPFL-CONF-224072. 2017.
Policy Derivation and Learning. Three main methods: (1) Mapping Function: Classification (discrete output); Regression (continuous output). (2) System Model: Reinforcement Learning with a Reward Function. (3) Plans: State Machine. Argall, Brenna D., et al. "A survey of robot learning from demonstration." Robotics and autonomous systems 57.5 (2009): 469-483.
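The mapping-function approach can be sketched as a direct regression from state to action over the demonstrated pairs. The example below uses k-nearest-neighbour averaging purely for brevity; it is one of many possible regressors, and the function name `knn_policy` and the 1-D demonstration data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Demo = Tuple[Sequence[float], Sequence[float]]  # (state, action) pair


def knn_policy(demos: List[Demo], state: Sequence[float], k: int = 3) -> List[float]:
    """Average the actions of the k demonstrated states closest to `state`."""
    nearest = sorted(demos, key=lambda pair: math.dist(pair[0], state))[:k]
    dim = len(nearest[0][1])
    return [sum(action[i] for _, action in nearest) / len(nearest) for i in range(dim)]


# Hypothetical demonstrations: at position x the teacher commands velocity 1 - x.
demos = [((i / 10,), (1.0 - i / 10,)) for i in range(11)]
action = knn_policy(demos, (0.52,), k=2)  # query a state between two demonstrated ones
```

Replacing the averaging step with a majority vote over discrete action labels would give the classification variant of the same idea.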
Policy Derivation and Learning: always with a Policy Learning process; Generalization. Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on. IEEE, 2014.
Learning from Demonstration Pipeline
Demonstration and Recording: Human Gesture Recognition; Trajectory Awareness; Key Point Vision; Keyframe Recording; Many other interaction methods
Representation (Model) Generation: Dynamical System; Machine Learning Approach; Dimensions
Policy Derivation and Learning: Mapping Function; System Model; Plans; With Policy Learning Process
Planning Execution: Generalization
Learning from Demonstration: Demo!
Tutorial on: Learning from Demonstration, Aude Billard, Klas Kronander, Jose Ramon Medina, ICRA, 2016.
P.I. Corke, Robotics, Vision & Control, Springer 2011, ISBN 978-3-642-20143-1.
Khansari Zadeh, S. M. and Billard, A., Learning Stable Non-Linear Dynamical Systems with Gaussian Mixture Models. IEEE Transactions on Robotics, vol. 27, num 5, p. 943-957.
Kronander, K., Khansari Zadeh, S. M. and Billard, A. (2015) Incremental Motion Learning with Locally Modulated Dynamical Systems. Robotics and Autonomous Systems, 2015, vol. 70, iss. 1, pp. 52-62.
Carl Edward Rasmussen, Gaussian Processes for Machine Learning, The MIT Press, 2006. ISBN 0-262-18253-X.
Medina, J. R., Lorenz, T., and Hirche, S. (2015). Synthesizing Anticipatory Haptic Assistance Considering Human Behavior Uncertainty. Robotics, IEEE Transactions on, 31(1), 180-190.
Paper Discussion: Demonstration and Recording. Two cameras: one measures the amount in the receiver; one measures the pose of the source. Pros: less error. Cons: simple model; transparent receiver. Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on. IEEE, 2014.
Paper Discussion: Representation (Model) Generation. Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on. IEEE, 2014.
Paper Discussion: Policy Derivation and Learning. Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on. IEEE, 2014.
Paper Discussion: Policy Derivation and Learning. Score Function: 1.0, 0.5, 0.5. Skill Selection (Discrete Optimization): Pouring; Shaking A; Shaking B. Inside Skill Adaptation (Continuous Optimization): Shaking B, ɸ = 0 ~ π/2, optimized with CMA-ES. Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on. IEEE, 2014.
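The two optimization levels on this slide can be sketched as a discrete choice over skills followed by a continuous search over one skill parameter. The code below is not the paper's implementation: it uses a simplified elitist evolution strategy with a fixed step-size decay as a stand-in for CMA-ES (there is no covariance matrix adaptation), the pairing of the scores 1.0, 0.5, 0.5 with the three skills is an assumption, and `hypothetical_score` with its peak at 1.2 rad is invented for illustration.

```python
import math
import random
from typing import Callable, Dict


def select_skill(scores: Dict[str, float]) -> str:
    """Discrete step: pick the skill with the highest current score."""
    return max(scores, key=scores.get)


def adapt_parameter(score: Callable[[float], float], low: float, high: float,
                    generations: int = 30, population: int = 20,
                    elite: int = 5, seed: int = 0) -> float:
    """Continuous step: simplified evolution strategy (a stand-in for CMA-ES)."""
    rng = random.Random(seed)
    mean = 0.5 * (low + high)
    sigma = 0.25 * (high - low)
    for _ in range(generations):
        # Sample candidates around the current mean, clipped to the allowed range.
        samples = [min(high, max(low, rng.gauss(mean, sigma))) for _ in range(population)]
        samples.sort(key=score, reverse=True)
        mean = sum(samples[:elite]) / elite  # recentre on the best candidates
        sigma *= 0.9                         # fixed decay instead of CMA-ES adaptation
    return mean


# Assumed pairing of the slide's scores with the three skills.
skill_scores = {"pouring": 1.0, "shaking_a": 0.5, "shaking_b": 0.5}
chosen = select_skill(skill_scores)


def hypothetical_score(phi: float) -> float:
    """Made-up objective for the shaking angle, peaking at 1.2 rad."""
    return 1.0 - (phi - 1.2) ** 2


best_phi = adapt_parameter(hypothetical_score, 0.0, math.pi / 2)
```

On a robot, each call to the score function would correspond to one physical trial, so the population size and number of generations directly set the amount of practice required.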
Paper Discussion: Planning Execution. Generalization. Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on. IEEE, 2014.
Paper Discussion: Paper Summary. Pros: complete strategies for the entire pouring problem; complete optimization architecture; clean and straightforward system that can inspire other researchers; few inputs, fast “learning”. Cons: low-dimensional measurement system; camera is the only sensor; not intelligent enough, since the strategy contains too many manually modelled details; not very good at action generalization.
Paper Discussion: Paper Summary. Future work: multiple sensors (tactile sensor feedback); integration into a more complete framework (pick and place of the container; cooking); more skills in the skill set. Rozo, Leonel, Pablo Jiménez, and Carme Torras. "Force-based robot learning of pouring skills using parametric hidden Markov models." Robot Motion and Control (RoMoCo), 2013 9th Workshop on. IEEE, 2013. Yamaguchi, Akihiko, et al. "Learning pouring skills from demonstration and practice." Humanoid Robots (Humanoids), 2014 14th IEEE-RAS International Conference on. IEEE, 2014.
Recent and Future Work: multi-sensor and multi-modal demonstrations; high dimensions; unsupervised learning; data-driven approaches (Deep Learning, Deep Reinforcement Learning). Yin, Hang, et al. "Associate Latent Encodings in Learning from Demonstrations." Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17). No. EPFL-CONF-224072. 2017. Yamaguchi, Akihiko, and Christopher G. Atkeson. "Neural networks and differential dynamic programming for reinforcement learning problems." Robotics and Automation (ICRA), 2016 IEEE International Conference on. IEEE, 2016. Agrawal, Pulkit, et al. "Learning to poke by poking: Experiential learning of intuitive physics." arXiv preprint arXiv:1606.07419 (2016). Nair, Ashvin, et al. "Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation." 2017.
Thank you for listening! Questions?