Robust Belief-based Execution of Manipulation Programs

Robust Belief-based Execution of Manipulation Programs
Kaijen Hsiao Tomás Lozano-Pérez Leslie Pack Kaelbling MIT CSAIL

Achieving Goals under Uncertainty
Two kinds of uncertainty: current state: need to plan in information space results of future actions: search branches on outcomes as well as actions Choice of action must be dependent on current information state

Discrete POMDP Formulation
states actions observations transition model observation model reward

 POMDP Controller Controller belief SE sensing action Environment
State estimation is discrete Bayesian filter Policy maps belief states to actions

Action selection in POMDPs
Off-line optimal policy generation Intractable for large spaces On-line search: finite-depth expansion of belief-space tree from current belief state to select single action Tractable in broad subclass of problems

Challenges for action selection
Continuous state spaces Requirement to select action for any belief state Long horizon Action branching factor Outcome branching factor Computationally complex observation and transition models

Grasping in uncluttered environments
Points of leverage: Robot pose is approximately observable Robot dynamics are nearly deterministic Bounded uncertainty over unobserved object parameters Room to maneuver

Online belief-space search
Continuous state space: discretize object state space

Discretize object configuration space
workspace configuration space belief state

Continuous state space: discretize object state space Action for any belief: search forward from current belief state

Search forward from current belief
Low entropy belief states enable reliable grasp Use entropy as static evaluation function at leaves Actions can be useful for information gathering

Continuous state space: discretize object state space Action for any belief: search forward from current belief state Long horizon: use temporally extended actions

Use temporally extended actions
Primitive actions Entire trajectories Reduce horizon Observations at end

Continuous state space: discretize object state space Action for any belief: search forward from current belief state Long horizon: use temporally extended actions Large action branching factor: parameterize small set of action types by current belief

Parameterize actions with belief
Actions are entire world-relative trajectories In current belief state, execute with respect to most likely object configuration terminate on contact or end of trajectory

Continuous state space: discretize object state space Action for any belief: search forward from current belief state Long horizon: use temporally extended actions Large action branching factor: parameterize small set of action types by current belief Computationally complex observation and transition models: precompute models

Precompute models Execute WRT with respect to estimated state e
in world state w Expected observation, transition Based on geometric simulation

Continuous state space: discretize object state space Action for any belief: search forward from current belief state Long horizon: use temporally extended actions Large action branching factor: parameterize small set of action types by current belief Computationally complex observation and transition models: precompute models Large observation branching factor: canonicalize observations for each discrete state and action

Canonicalize observations
Any (e, w) pair with same relative transformation has same world-relative outcomes and observations Only sample for one e with w varying within initial range of uncertainty Cluster observations and represent each bin of object configurations by a single representative one Only branch on canonical observations

Algorithm Off-line: plan WRTs for grasping and info gathering
compute models On-line: while current belief state doesn’t satisfy goal compute expected info gain of each WRT execute best WRT until termination use observation to update current belief return to initial pose execute final grasp trajectory

Application to grasping with simulated robot arm
Initial conditions (ultimately from vision) Object shape is roughly known (contacted vertices should be within ~1 cm of actual positions) Object is on table and pose (x, y, rotation) is roughly known (center of mass std ~5 cm, 30 deg) Achieve specific grasp of object

Observations Fingertips: 6-axis force/torque sensors position normal
Additional contact sensors: just contact Swept non-colliding path rules out poses that would have generated contact

Grasping a Box Most likely robot-relative position
Where it actually is

Initial belief state

Summed over theta

Tried to move down; finger hit corner

Probability of contact observation at each location

Updated belief

Re-centered

Trying again, with new belief
Back up Try again

Final state and observation
Observation probabilities Grasp

Updated belief state: Success!
Goal: variance < 1 cm x, 15 cm y, 6 deg theta

What if Y coord of grasp matters?

Need explicit information gathering

Simulation Experiments
Methods tested: Single open-loop execution of goal-achieving WRT with respect to the most likely state Repeated execution of goal-achieving WRT with respect to the most likely state Online selection of information-gathering and goal-achieving grasps (1-step lookahead)

Box experiments Allowed variation in goal grasp: 1 cm, 1 cm, 5 deg
Initial uncertainty: 5 cm, 5 cm, 30 deg

Cup experiments

Cup experiments Goal 1 cm x, 1 cm y, rotation doesn’t matter (no info-grasps used) Start uncertainty 30 deg theta (x,y varies) Increasing uncertainty

Grasping a Brita Pitcher
Target grasp: Put one finger through the handle and grasp

Brita Pitcher experiments

Brita Pitcher results Increasing uncertainty

Other recent probabilistic approaches to manipulation
Off-line POMDP solution for grasping (Hsiao et al. 2007) Bayesian state estimation using tactile sensors to locate object before grasping (Petrovskaya et al. 2006) Finding a fixed trajectory that is most likely to succeed under uncertainty (Alterovitz et al. 2007, Burns and Brock 2007)

The End.

Timing For Brita Pitcher
(2.16 GHz processor, 3.24 GB RAM running Python, times in seconds) 1 cm 3 deg 3 cm 9 deg 5 cm 15 deg 30 deg Grid size 5733 16337 14415 24025 Computing observation matrix (1 traj) 12 33 29 51 1st belief-state update 4 10 19 Choosing 1st info-grasp 9 17 30

Number of Actions Used 1 cm 3 deg 3 cm 9 deg 5 cm 15 deg 30 deg
Robust execution of target 1.9 2.5 3.3 3.5 Robust execution with info-grasps not run 4.4 4.1 4.2

Creating Information-gain Trajectories
Trajectory generation Generate endpoints, use randomized planner (such as OpenRAVE) to find nominal collision-free path Sweep through entire workspace Choose a small set based on information gain from start uncertainty

Robust Belief-based Execution of Manipulation Programs

Similar presentations

Presentation on theme: "Robust Belief-based Execution of Manipulation Programs"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Robust Belief-based Execution of Manipulation Programs

Similar presentations

Presentation on theme: "Robust Belief-based Execution of Manipulation Programs"— Presentation transcript:

Similar presentations

About project

Feedback