
1 Human-Robot Interaction and Learning From Human Demonstration
Maja J Matarić, Chad Jenkins, Monica Nicolescu, Evan Drumwright, and Chi-Wei Chu
University of Southern California, Interaction Lab / Robotics Research Lab
Center for Robotics and Embedded Systems
http://robotics.usc.edu/~agents/Mars/mars.html

2 Motivation & Approach
Goals:
Natural human-robot interaction in various domains.
Robot programming & learning by imitation (mobile & humanoid).
General Approach:
Use the intrinsic behavior repertoire to facilitate control, human-robot interaction, and learning.
Use interactive training methods with a human teacher.
Use data-driven training methods based on human motion data.

3 Philosophy: Modularity & Interaction
Complex control is represented as a combination of lower-dimensionality, composable building blocks: behaviors. These are the abstraction for interaction.
Representation is deictic & action-embedded.
Perception is classification; learning is refinement, enhancement & composition of building blocks.
Interaction is action-embedded:
humans know the robot’s behavior repertoire;
robots map human input onto the repertoire, for prediction & learning.
Human-robot communication is action-based. Intervention is treated as high-priority perceptual input.

4 Learning From People
Learn from humans in two “natural” modes:
A human teacher/trainer demonstrates a skill to the robot, which learns from one or a few (single-digit) trials by mapping observations to existing behaviors.
A corpus of human data is provided off-line, and statistical learning is used to derive new behaviors (humanoid).

5 Previous Developments
A Hierarchical Abstract Behavior Architecture:
Representation & execution of complex, sequential, hierarchically structured tasks.
An algorithm for on-line learning of task representations from experienced demonstrations.
Validated the architecture and learning algorithm:
Execution of tasks with hierarchical structure and long behavioral sequences.
Learning of complex tasks from both human and robot teachers.

6 Hierarchical Abstract Behavior Architecture
Extended behavior-based architecture.
Flexible activation conditions (dependency links between behaviors) allow for behavior reusability.
Representation of tasks as (hierarchical) behavior networks.
Sequential & opportunistic execution.
Support for automated generation (task learning).
[Figure: behavior network receiving sensory input from the environment.]
M. N. Nicolescu, M. J Matarić, "A Hierarchical Architecture for Behavior-Based Robots", International Conference on Autonomous Agents and Multiagent Systems, July 15-19, 2002.
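To make the dependency-link idea concrete, here is a minimal sketch of a behavior node with typed activation links; the class and link names are hypothetical illustrations, not the architecture's actual API.

```python
# Hypothetical sketch of a behavior node with typed dependency links.
# ORDERING: predecessor's goals must have been met at some point.
# ENABLING: predecessor's goals must hold when this behavior activates.
# PERMANENT: predecessor's goals must keep holding while this behavior runs.
from dataclasses import dataclass, field

ORDERING, ENABLING, PERMANENT = "ordering", "enabling", "permanent"

@dataclass
class Behavior:
    name: str
    goals_met: bool = False          # continuously monitored goal status
    goals_ever_met: bool = False     # latched version, for ORDERING links
    preconditions: list = field(default_factory=list)  # [(Behavior, link_type)]

    def can_activate(self) -> bool:
        for pred, link in self.preconditions:
            if link == ORDERING and not pred.goals_ever_met:
                return False
            if link in (ENABLING, PERMANENT) and not pred.goals_met:
                return False
        return True
```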

7 Learning from Experienced Demonstrations
Goal: learn a high-level task representation in terms of the robot’s own skills.
Approach: teacher-following strategy, with active participation in the demonstration.
The robot is equipped with a set of basic skills.
The teacher is aware of these skills and of what observations the robot can gather.
A mapping is built between what the robot sees and what it can perform.
The status (met/not met) of all behaviors’ goals is continuously monitored; the teacher may signal moments in time relevant to the task.
Goals met → behavior fired → observation-behavior mapping.
M. N. Nicolescu, M. J Matarić, "Experience-Based Representation Construction: Learning from Human and Robot Teachers", IEEE/RSJ International Conference on Intelligent Robots and Systems, October 29 - November 3, 2001.
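A minimal sketch of the monitoring loop this implies: during the demonstration, whenever a behavior's goals flip from unmet to met, that behavior is appended to the observed task sequence. The `check_goals` predicate is a placeholder, not the published implementation.

```python
# Sketch of demonstration monitoring: record a behavior whenever its goals
# flip from unmet to met, yielding the observed task sequence.
def follow_demonstration(behaviors, sensor_stream):
    observed_sequence = []
    previously_met = {b.name: False for b in behaviors}
    for observation in sensor_stream:          # one reading per time step
        for b in behaviors:
            b.goals_met = b.check_goals(observation)   # hypothetical predicate
            if b.goals_met and not previously_met[b.name]:
                observed_sequence.append(b.name)       # behavior "fired"
            previously_met[b.name] = b.goals_met
    return observed_sequence
```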

8 Recent Developments
Motivation:
The current approach leads to a correct but possibly overspecialized task representation.
This is a problem in changing environments.
Approach: refine the learned task representations through:
Generalizing over multiple (but few) demonstrations; new demonstrations are “incorporated” into the existing task representation.
Providing feedback during task execution, on unnecessary/missing parts of the task.

9 Generalization Problem
It is hard to learn a task from only one trial:
Limited sensing capabilities, quality of the teacher’s demonstration, particularities of the environment.
Similar to inferring a regular expression (FSA equivalent) from examples.
A small number of demonstrations is desired, so statistical techniques are not applicable.
Main learning inaccuracies:
Learning irrelevant steps (false positives).
Omission of steps that are relevant (false negatives).
[Figure: training examples A C B F A, A B F C, and A B F A, with the generalization left as a question.]

10 Generalization Approach
Demonstrate the same task in different/similar environments.
Construct a task representation that:
Encodes the specifics of each given example.
Captures the parts common to all demonstrations.
Compute a measure of similarity (common steps) between different examples: the longest common sequence (LCS) between the topological representations, computable in O(nm) time.
Merge the common nodes.
M. N. Nicolescu, M. J Matarić, "Natural Methods for Robot Task Learning: Instructive Demonstrations, Generalization and Practice", Second International Joint Conference on Autonomous Agents and Multi-Agent Systems, July 14-18, 2003.
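The O(nm) LCS computation is standard dynamic programming; a minimal sketch, assuming each demonstration is represented as a list of behavior labels such as those in the training examples above:

```python
# Dynamic-programming LCS between two demonstrated behavior sequences;
# table[i][j] holds the LCS length of x[:i] and y[:j].
def longest_common_sequence(x, y):
    n, m = len(x), len(y)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # backtrack to recover the common steps (the nodes to merge)
    i, j, common = n, m, []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            common.append(x[i - 1]); i -= 1; j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return common[::-1]

print(longest_common_sequence(list("ACBFA"), list("ABFC")))  # ['A', 'B', 'F']
```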

11 Illustration of Approach
[Figure: two demonstrations, A C B F A and A B F C, are compared via their longest common sequence (computed in a dynamic-programming table) and merged into a generalized network in which the common nodes A, B, F are shared and the differing steps become alternative branches.]

12 Merging Additional Demonstrations
For each subsequent example, compute the LCS between the new example and all possible paths in the graph.
The dynamic-programming method computes the LCS at each level (depth) in the graph.
The LCS is computed only for the differing parts of the paths.
The LCS table is kept as a linked list of arrays.
The longest of the paths is selected, and the common nodes are merged.
[Figure: merging a new example (A B F A) into the existing graph.]
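Putting the pieces together, a hedged sketch of the merge step; `paths()` and `merge_nodes()` are hypothetical stand-ins for the graph operations, and `longest_common_sequence` is the function sketched earlier.

```python
# Hedged sketch of merging a new demonstration into the task graph: find the
# path that shares the longest common sequence with the new example, then
# unify the shared nodes so the differing steps become alternative branches.
def merge_demonstration(graph, new_example):
    best_path, best_common = None, []
    for path in graph.paths():                        # all root-to-leaf paths
        common = longest_common_sequence(path, new_example)
        if len(common) > len(best_common):
            best_path, best_common = path, common
    graph.merge_nodes(best_path, new_example, best_common)  # hypothetical op
    return graph
```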

13 Behavior Network Execution
Computing the preconditions for each behavior is similar to computing the regular expression from an FSA representation.
Added capability for disjunctive & conjunctive activation conditions.
The types of dependencies between behaviors (ordering, enabling, permanent) are computed from the two merged behavior networks.
[Figure: the merged network yields activation expressions AC + A, (AC + A)B, and (AC + A)BF.]
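A toy illustration of evaluating such disjunctive/conjunctive conditions; the nested list/tuple encoding is an assumption for the sketch, not the architecture's actual representation.

```python
# Evaluate an activation expression such as AC + A: tuples are conjunctions,
# lists are disjunctions, and strings are single behaviors whose goals must
# already have been met (the `fired` set).
def condition_met(cond, fired):
    if isinstance(cond, str):
        return cond in fired
    if isinstance(cond, tuple):                              # conjunction
        return all(condition_met(c, fired) for c in cond)
    return any(condition_met(c, fired) for c in cond)        # disjunction

precond_B = [("A", "C"), "A"]                 # AC + A
print(condition_met(precond_B, {"A"}))        # True: the A-only branch suffices
print(condition_met(precond_B, {"C"}))        # False: neither branch holds
```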

14 Generalization
Task: go to either the Green or Light Green target, pick up the Orange box, go to the Yellow and Red targets, go to the Pink target, drop the box there, and come back to the Light Green target.
None of the demonstrations corresponds to the desired task; they contain incorrect steps and inconsistencies.

15 Generalization Experiments
[Figure: the environment, the first demonstration, the learned topology, and the resulting robot performance.]
All observations relevant.
No trajectory learning.
Not a reactive policy.

16 Generalization Experiments (II)
[Figure: the 1st, 2nd, and 3rd human demonstrations and the resulting robot performance.]

17 Refining Task Representation Through Feedback
Feedback is given through speech:
Unnecessary task steps (“bad”): remove steps from the network.
Missing task steps (“new” → “continue”): add new steps to the network.
[Figure: deleting an unnecessary step from the network, and splicing newly demonstrated steps M, N in between the “new” and “continue” cues.]
M. N. Nicolescu, M. J Matarić, "Natural Methods for Robot Task Learning: Instructive Demonstrations, Generalization and Practice", Second International Joint Conference on Autonomous Agents and Multi-Agent Systems, July 14-18, 2003.
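A hedged sketch of mapping the spoken cues onto network edits; the method names (`remove_step`, `begin_insertion`, `end_insertion`) are hypothetical placeholders for the architecture's own operations.

```python
# Map the three spoken feedback cues onto edits of the behavior network.
def apply_feedback(network, cue, step=None):
    if cue == "bad":               # the step just executed is unnecessary
        network.remove_step(step)
    elif cue == "new":             # teacher begins demonstrating missing steps
        network.begin_insertion(after=step)
    elif cue == "continue":        # missing-step demonstration is finished
        network.end_insertion()
    return network
```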

18 Practice and Feedback Experiments
[Figure: the 3rd demonstration, a practice run with feedback, the topology refinement, and the resulting robot performance.]

19 Practice and Feedback Experiments (II)
[Figure: the 1st demonstration, a practice run with feedback, the topology refinement, a further practice run, and the resulting robot performance.]

20 Summary
The generalization method incorporates multiple demonstrations into a single, unified behavior network representation.
It helps detect relevant/irrelevant observations.
Simple feedback cues can be used for:
Providing instructive demonstrations.
Refining the task representations learned from direct demonstration or generalization.

21 Learning from Motion Data
Goal: automatically derive both primitive and high-level behaviors from human motion data, and use behaviors as a substrate for generating robot motion and predicting/classifying human activity.
Method: a corpus of human motion data (motion capture), with dimensionality reduction to extract behaviors.

22 Motion Segmentation
Extract short motion sequences.
Previous methods:
Manual segmentation (slow, tedious).
Z-function (discrete motion only).
New method (modified by Peters for Robonaut data): the kinematic centroid.
Assume limbs are pendulums; a greedy method determines the “end” of each pendulum swing.
Appropriate for highly dynamic motion exhibiting large swings.
O. C. Jenkins, M. J Matarić, "Automated Modularization of Human Motion into Actions and Behaviors", USC Center for Robotics and Embedded Systems Technical Report No. CRES-02-002.
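A rough sketch of the pendulum idea: track the limb's centroid and greedily cut the stream where a swing ends (the centroid slows to near zero or reverses direction). The threshold is illustrative, not the published value.

```python
# Kinematic-centroid segmentation sketch: cut where a pendulum swing ends.
import numpy as np

def segment_motion(limb_positions, vel_eps=1e-3):
    """limb_positions: (T, J, 3) array of limb joint positions per frame."""
    centroid = limb_positions.mean(axis=1)        # (T, 3) pendulum "end"
    vel = np.diff(centroid, axis=0)               # frame-to-frame velocity
    cuts = [0]
    for t in range(1, len(vel)):
        slowed = np.linalg.norm(vel[t]) < vel_eps
        reversed_dir = np.dot(vel[t], vel[t - 1]) < 0
        if slowed or reversed_dir:                # end of a swing
            cuts.append(t)
    cuts.append(len(limb_positions) - 1)
    return [(cuts[k], cuts[k + 1]) for k in range(len(cuts) - 1)]
```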

23 Previous Work
Earlier efforts involved:
Applying PCA dimensionality reduction to arm motion data.
K-means clustering to uncover primitive behaviors.
Limitations:
Linear PCA was applied to nonlinear motion data.
PCA does not capture temporal dependencies.
Clustering decomposes the PCA space, but the resulting primitives have no intuitive meaning or theme, and are difficult to compose into higher-level behaviors.
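The earlier pipeline in outline, as a minimal scikit-learn sketch; the dimensions and cluster count are illustrative stand-ins, not the original experiment's settings.

```python
# Earlier approach: linear PCA on arm motion frames, then k-means clustering
# of the reduced space to propose primitives.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

motions = np.random.rand(500, 24)            # stand-in for arm motion frames
reduced = PCA(n_components=3).fit_transform(motions)
labels = KMeans(n_clusters=8, n_init=10).fit_predict(reduced)
# Each cluster is a candidate primitive -- but, as noted above, the clusters
# carry no intuitive theme, and linear PCA misses the nonlinear structure.
```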

24 Current Work
Use Isomap for nonlinear dimensionality reduction [Tenenbaum et al. 2000].
Extend Isomap to handle temporal dependencies.
Cluster separable motion groups with bounding-box clustering.
Interpolate within a cluster to represent new motion.
Use further dimensionality-reduction iterations to derive high-level behaviors.
O. C. Jenkins, M. J Matarić, "Deriving Action and Behavior Primitives from Human Motion Data", IEEE/RSJ International Conference on Intelligent Robots and Systems, September 30 - October 4, 2002.

25 An Example
[Figure: an example motion sequence of segments A-F containing two high-level behaviors, embedded three ways: by PCA and spatial Isomap, by spatio-temporal Isomap (which separates the spatially similar segments C and F), and by a second iteration of spatio-temporal Isomap that groups the primitives into the two behaviors.]

26 Spatio-temporal Dimension Reduction
Isomap extracts the underlying (nonlinear) structure of data, e.g. a 2D spherical manifold from 3D position data.
Extend Isomap for temporal data using common temporal neighbors (CTN): CTN observes that sequence B is preceded by sequence A and followed by sequence C, which resolves spatially similar sequences.
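A condensed sketch of the extension: Isomap's neighbor graph is augmented so that temporally adjacent frames are connected with reduced distances, which lets spatially similar but temporally distinct sequences separate. This simplifies the published algorithm (it omits the full CTN bookkeeping), and the `temporal_scale` parameter is an assumption.

```python
# Spatio-temporal Isomap sketch: spatial k-NN graph plus down-weighted
# temporal edges, geodesic distances, then classical embedding via MDS.
import numpy as np
from scipy.sparse.csgraph import shortest_path
from sklearn.manifold import MDS

def spatiotemporal_isomap(X, k=5, temporal_scale=0.1):
    T = len(X)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)   # pairwise distances
    graph = np.full((T, T), np.inf)                        # inf = no edge
    for i in range(T):
        for j in np.argsort(D[i])[1:k + 1]:                # k spatial neighbors
            graph[i, j] = D[i, j]
        if i + 1 < T:                                      # temporal neighbor:
            graph[i, i + 1] = temporal_scale * D[i, i + 1] # shrink its distance
    geodesic = shortest_path(graph, directed=False)
    return MDS(n_components=2, dissimilarity="precomputed").fit_transform(geodesic)

X = np.random.rand(60, 30)              # 60 frames of 30-D joint angles
embedding = spatiotemporal_isomap(X)    # (60, 2) low-dimensional coordinates
```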

27 High-level Behaviors
Extract high-level behaviors by applying spatio-temporal Isomap again, this time to the sequence of primitives.

28 Derived High-level Behaviors
[Figure: high-level behaviors, such as arm waving and punching, derived as sequences of numbered primitives.]

29 Primitive Motion Synthesis
Use interpolation between motion sequences to generate new variations; interpolation provides a form of parameterization for a primitive.
[Figure: trajectories of hand positions produced by interpolation; blue/red are motions grouped into a primitive, black/magenta are new motion variations.]
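The parameterization amounts to blending exemplars of the same primitive; a minimal sketch, assuming the exemplars have been time-warped to a common length:

```python
# Interpolate between two exemplar motions grouped into one primitive.
import numpy as np

def interpolate_motion(motion_a, motion_b, mix):
    """motion_a, motion_b: (T, D) joint-angle trajectories; mix in [0, 1]."""
    return (1.0 - mix) * motion_a + mix * motion_b

wave_low = np.random.rand(100, 7)     # stand-ins for two exemplar trajectories
wave_high = np.random.rand(100, 7)
new_variation = interpolate_motion(wave_low, wave_high, mix=0.3)
```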

30 High-level Motion Synthesis
A behavior can be used to synthesize a variation on the input motion; synthesis uses segment concatenation.
[Figure: synthesized punching, arm waving, and “Cabbage Patch” dancing motions.]

31 Primitives as Forward Models
Through eager evaluation, the span of motion variations can be realized for each primitive.
Consequently, a nonlinear forward model can be produced for each primitive.
Used for motion synthesis, given an initial posture.
Experimenting with motion classification via Kalman gains.
[Figure: PCA view of a primitive’s flow field in joint-angle space.]
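A schematic flow-field forward model for a primitive: given a posture, predict the next posture by averaging the local "flow" of the exemplar trajectories near it. Nearest-neighbor averaging here is a stand-in for whatever nonlinear regressor the actual system uses.

```python
# Predict the next posture by averaging the flow of nearby exemplar states.
import numpy as np

def forward_model(pose, exemplars, k=5):
    """exemplars: list of (T, D) joint-angle trajectories for one primitive."""
    states = np.vstack([traj[:-1] for traj in exemplars])
    deltas = np.vstack([np.diff(traj, axis=0) for traj in exemplars])
    nearest = np.argsort(np.linalg.norm(states - pose, axis=1))[:k]
    return pose + deltas[nearest].mean(axis=0)   # one step along the flow field
```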

32 Summary: Learning from Motion Data
Strengths of the current approach:
Derives suitable behaviors for nonlinear motion data with temporal dependencies.
Segmentation techniques allow for full automation.
New variations on derived behaviors can be synthesized.
Flow-field forward models can be produced for each primitive, allowing smooth motion synthesis and motion classification.
Future work:
Validation on better motion data (always).
Derivation of primitives from NASA Robonaut motion.
Integration with task-directed control mechanisms.
Posture-atomic primitive derivation.

33 Humanoid control via parameterized trajectories
● Free-space control of humanoid robots.
● A set of exemplar trajectories:
● the exemplars represent Cartesian extrema of a single behavior;
● the trajectories are in joint space.
● New movements produced via interpolation:
● representative of the original behavior;
● selected by a mixing parameter.

34 What's good about this?
● Very few exemplars of a behavior may be needed to model that behavior.
● For dextrous robotic control, this is easier than explicit programming or optimal control methods.
● Trajectories can be represented compactly: RBF approximation can represent complex (i.e., very nonlinear) trajectories with high fidelity using little storage.
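A sketch of the compactness claim: a handful of Gaussian radial basis functions fit by least squares can replace the full sample sequence of a joint trajectory. The center count and width below are illustrative assumptions.

```python
# Fit a joint trajectory with Gaussian RBFs: 10 weights instead of 200 samples.
import numpy as np

def fit_rbf(times, values, n_centers=10):
    centers = np.linspace(times.min(), times.max(), n_centers)
    width = centers[1] - centers[0]
    phi = np.exp(-((times[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))
    weights, *_ = np.linalg.lstsq(phi, values, rcond=None)
    return centers, width, weights

def eval_rbf(t, centers, width, weights):
    phi = np.exp(-((np.atleast_1d(t)[:, None] - centers[None, :]) ** 2)
                 / (2 * width ** 2))
    return phi @ weights

t = np.linspace(0, 1, 200)
joint_angle = np.sin(2 * np.pi * t) + 0.5 * np.sin(6 * np.pi * t)  # toy trajectory
params = fit_rbf(t, joint_angle)
reconstructed = eval_rbf(t, *params)
```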

35 Robotic control via parametric primitives
● Precondition(s) for the primitive must first be met.
● A time duration for the primitive is then selected.
● The primitive is then executed open-loop (closed-loop controllers to be investigated in future work).
● Control operates at the kinematic level only: position and/or velocity commands are sent to the low-level controller.
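The execution protocol above, written out as a sketch; `preconditions_met`, `pose_at`, and `robot.command` are hypothetical placeholders for the system's own interfaces.

```python
# Check preconditions, pick a duration, then play the primitive open-loop
# through the low-level controller (kinematic commands only).
import numpy as np

def execute_primitive(primitive, robot, duration, rate_hz=100):
    if not primitive.preconditions_met(robot.state()):
        return False
    for t in np.linspace(0.0, duration, int(duration * rate_hz)):
        robot.command(primitive.pose_at(t / duration))  # position/velocity target
    return True
```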

36 Activity recognition via primitives
● Primitives serve to model a behavior, and this model can be used to recognize the behavior.
● We built a Bayesian classifier to recognize a set of five primitives from mocap & simulator data:
● rate of false negatives: 3.39%;
● rate of false positives: 0.06%;
● more data needed for validation.
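The slide does not specify the classifier's form or features, so here is a hedged sketch of one plausible Bayesian classifier over primitives: a Gaussian class-conditional model per primitive, with maximum-likelihood selection.

```python
# One-Gaussian-per-primitive Bayesian classifier sketch.
import numpy as np
from scipy.stats import multivariate_normal

class PrimitiveClassifier:
    def fit(self, features_by_primitive):
        """features_by_primitive: dict name -> (N, D) feature array."""
        self.models = {
            name: multivariate_normal(f.mean(axis=0),
                                      np.cov(f, rowvar=False),
                                      allow_singular=True)
            for name, f in features_by_primitive.items()
        }
        return self

    def predict(self, x):
        # maximum likelihood over the primitive models (uniform priors assumed)
        return max(self.models, key=lambda name: self.models[name].logpdf(x))
```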

37 Markerless Kinematic Model and Motion Capture
In addition, a kinematic model is estimated for the subject.
We leverage recent voxel-carving techniques for constructing 3D point volumes of moving subjects from multiple calibrated cameras.
[Figure: voxel carving.]
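Voxel carving in miniature: a voxel survives only if its projection lies inside the subject's silhouette in every calibrated camera. The projection and silhouette handling below are schematic, not the specific technique the slides cite.

```python
# Keep voxels whose projections fall inside every camera's silhouette mask.
import numpy as np

def carve(voxels, cameras, silhouettes):
    """voxels: (V, 3) points; cameras: list of 3x4 projection matrices;
    silhouettes: list of boolean (H, W) masks, one per camera."""
    keep = np.ones(len(voxels), dtype=bool)
    homog = np.hstack([voxels, np.ones((len(voxels), 1))])
    for P, mask in zip(cameras, silhouettes):
        uvw = homog @ P.T                              # project into the image
        uv = (uvw[:, :2] / uvw[:, 2:]).astype(int)
        h, w = mask.shape
        inside = (0 <= uv[:, 0]) & (uv[:, 0] < w) & (0 <= uv[:, 1]) & (uv[:, 1] < h)
        keep &= inside & mask[uv[:, 1].clip(0, h - 1), uv[:, 0].clip(0, w - 1)]
    return voxels[keep]
```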

38 Nonlinear Spherical Shells
NSS is a simple means for volume skeletonization: it extracts a pose-independent principal curve (a skeleton curve) from the captured volume.
[Figure: pipeline from the original volume to a pose-independent “Da Vinci” zero posture: spherical shells, dimensionality reduction, partitioning, clustering and linking, and projection onto the original volume.]

39 Model and Pose Estimation
Using each volume and its skeleton curve, a kinematic model is estimated for each frame in a motion.
A single model is then estimated for the whole sequence by:
aligning the frame-specific models;
identifying common joints using the density of aligned joints.
[Figure: alignment of frame-specific models.]

40 Result for Human Waving
[Figure: results for a human waving sequence.]

41 Result for Synthetic Volumes
[Figure: original kinematics and motion, derived kinematics and motion, and a snapshot of the synthetic volume for a single frame.]

42 Conclusions
Goal: use the behavior substrate to facilitate action-embedded human-robot interaction, control, and learning.
Recent successes:
Generalization of multiple (but few) demonstrations into a unified behavior network representation.
Use of simple feedback cues for refining learned tasks and faster learning.
Automatic derivation of behaviors from human motion data.
Some work in progress:
Validation of the generalization and human-robot interaction methods in more elaborate experimental setups.
Validation of the derivation method on Robonaut data.
Info, videos, papers: http://robotics.usc.edu/projects/mars/

