CS b659: Intelligent Robotics

Slides:

Advertisements

Similar presentations

State Estimation and Kalman Filtering CS B659 Spring 2013 Kris Hauser.

Advertisements

Markov Decision Process

SA-1 Probabilistic Robotics Planning and Control: Partially Observable Markov Decision Processes.

CSE-573 Artificial Intelligence Partially-Observable MDPS (POMDPs)

MDP Presentation CS594 Automated Optimal Decision Making Sohail M Yousof Advanced Artificial Intelligence.

Probabilistic Robotics

1 Slides for the book: Probabilistic Robotics Authors: Sebastian Thrun Wolfram Burgard Dieter Fox Publisher: MIT Press, Web site for the book & more.

Bayes Filters Pieter Abbeel UC Berkeley EECS Many slides adapted from Thrun, Burgard and Fox, Probabilistic Robotics TexPoint fonts used in EMF. Read the.

Planning under Uncertainty

POMDPs: Partially Observable Markov Decision Processes Advanced AI

Real-Time Tracking of an Unpredictable Target Amidst Unknown Obstacles Cheng-Yu Lee Hector Gonzalez-Baños* Jean-Claude Latombe Computer Science Department.

Autonomous Robot Navigation Panos Trahanias ΗΥ475 Fall 2007.

1 Target Tracking u u Real Time Tracking of an Unpredictable Target Amidst Unknown Obstacles by Cheng Yu Lee, Hector Gonzalez-Banos and Jean Claude Latombe.

Decision Making Under Uncertainty Russell and Norvig: ch 16, 17 CMSC421 – Fall 2005.

Making Decisions under Probabilistic Uncertainty (Where an agent optimizes what it gets on average, but it may get more... or less ) R&N: Chap. 17, Sect.

Probabilistic Robotics Introduction Probabilities Bayes rule Bayes filters.

Markov Decision Processes

CS 326 A: Motion Planning Target Tracking and Virtual Cameras.

Reinforcement Learning Yishay Mansour Tel-Aviv University.

Decision Making Under Uncertainty Russell and Norvig: ch 16, 17 CMSC421 – Fall 2003 material from Jean-Claude Latombe, and Daphne Koller.

CS B 659: I NTELLIGENT R OBOTICS Planning Under Uncertainty.

CS Reinforcement Learning1 Reinforcement Learning Variation on Supervised Learning Exact target outputs are not given Some variation of reward is.

1 Endgame Logistics  Final Project Presentations  Tuesday, March 19, 3-5, KEC2057  Powerpoint suggested ( to me before class)  Can use your own.

Planning with Non-Deterministic Uncertainty. Recap Uncertainty is inherent in systems that act in the real world Last lecture: reacting to unmodeled disturbances.

Simultaneous Localization and Mapping Presented by Lihan He Apr. 21, 2006.

1 Robot Environment Interaction Environment perception provides information about the environment’s state, and it tends to increase the robot’s knowledge.

Model-based Bayesian Reinforcement Learning in Partially Observable Domains by Pascal Poupart and Nikos Vlassis (2008 International Symposium on Artificial.

Agents CPSC 386 Artificial Intelligence Ellen Walker Hiram College.

1 Distributed and Optimal Motion Planning for Multiple Mobile Robots Yi Guo and Lynne Parker Center for Engineering Science Advanced Research Computer.

Computer Vision Group Prof. Daniel Cremers Autonomous Navigation for Flying Robots Lecture 5.1: State Estimation Jürgen Sturm Technische Universität München.

Reinforcement Learning Yishay Mansour Tel-Aviv University.

yG7s#t=15 yG7s#t=15.

Real-Time Simultaneous Localization and Mapping with a Single Camera (Mono SLAM) Young Ki Baik Computer Vision Lab. Seoul National University.

Deciding Under Probabilistic Uncertainty Russell and Norvig: Sect ,Chap. 17 CS121 – Winter 2003.

Learning to Navigate Through Crowded Environments Peter Henry 1, Christian Vollmer 2, Brian Ferris 1, Dieter Fox 1 Tuesday, May 4, University of.

Lecture 3: 18/4/1435 Searching for solutions. Lecturer/ Kawther Abas 363CS – Artificial Intelligence.

State Estimation and Kalman Filtering

Robotics Club: 5:30 this evening

Decision Theoretic Planning. Decisions Under Uncertainty  Some areas of AI (e.g., planning) focus on decision making in domains where the environment.

Decision Making Under Uncertainty CMSC 471 – Spring 2041 Class #25– Tuesday, April 29 R&N, material from Lise Getoor, Jean-Claude Latombe, and.

Probabilistic Robotics

1 Chapter 17 2 nd Part Making Complex Decisions --- Decision-theoretic Agent Design Xin Lu 11/04/2002.

Smart Sleeping Policies for Wireless Sensor Networks Venu Veeravalli ECE Department & Coordinated Science Lab University of Illinois at Urbana-Champaign.

Planning Under Uncertainty. Sensing error Partial observability Unpredictable dynamics Other agents.

Probabilistic Robotics Probability Theory Basics Error Propagation Slides from Autonomous Robots (Siegwart and Nourbaksh), Chapter 5 Probabilistic Robotics.

Solving problems by searching Chapter 3. Types of agents Reflex agent Consider how the world IS Choose action based on current percept Do not consider.

Solving problems by searching

Intelligent Agents (Ch. 2)

Analytics and OR DP- summary.

Reinforcement Learning in POMDPs Without Resets

ECE 448 Lecture 4: Search Intro

Schedule for next 2 weeks

Problem Solving as Search

Simultaneous Localization and Mapping

Intelligent Agents (Ch. 2)

Markov ó Kalman Filter Localization

Markov Decision Processes

Locomotion of Wheeled Robots

Robust Belief-based Execution of Manipulation Programs

Navigation In Dynamic Environment

Markov Decision Processes

Announcements Homework 3 due today (grace period through Friday)

Real-Time Motion Planning

Probabilistic Map Based Localization

October 6, 2011 Dr. Itamar Arel College of Engineering

Solving problems by searching

CS 416 Artificial Intelligence

Reinforcement Learning Dealing with Partial Observability

Deciding Under Probabilistic Uncertainty

Solving problems by searching

Presentation transcript:

CS b659: Intelligent Robotics Planning Under Uncertainty

Offline learning (e.g., calibration) Prior knowledge (often probabilistic) Sensors Perception (e.g., filtering, SLAM) Decision-making (control laws, optimization, planning) Actions Actuators

Dealing with Uncertainty Sensing uncertainty Localization error Noisy maps Misclassified objects Motion uncertainty Noisy natural processes Odometry drift Imprecise actuators Uncontrolled agents (treat as state)

Motion Uncertainty s obstacle obstacle g obstacle

Dealing with Motion Uncertainty Hierarchical strategies Use generated trajectory as input into a low-level feedback controller Reactive strategies Approach #1: re-optimize when the state has diverged from the planned path (online optimization) Approach #2: precompute optimal controls over state space, and just read off the new value from the perturbed state (offline optimization) Proactive strategies Explicitly consider future uncertainty Heuristics (e.g., grow obstacles, penalize nearness) Markov Decision Processes Online and offline approaches

Proactive Strategies to handle Motion Uncertainty obstacle obstacle g obstacle

Dynamic collision avoidance assuming worst-case behaviors Optimizing safety in real- time under worst case behavior model T=1 T=2 TTPF=2.6 Robot

Markov Decision Process Approaches Alterovitz et al 2007

Dealing with Sensing Uncertainty Reactive heuristics often work well Optimistic costs Assume most-likely state Penalize uncertainty Proactive strategies Explicitly consider future uncertainty Active sensing Reward actions that yield information gain Partially Observable Markov Decision Processes (POMDPs)

Assuming no obstacles in the unknown region and taking the shortest path to the goal

Assuming no obstacles in the unknown region and taking the shortest path to the goal

Assuming no obstacles in the unknown region and taking the shortest path to the goal Works well for navigation because the space of all maps is too huge, and certainty is monotonically nondecreasing

Assuming no obstacles in the unknown region and taking the shortest path to the goal Works well for navigation because the space of all maps is too huge, and certainty is monotonically nondecreasing

What if the sensor was directed (e.g., a camera)?

What if the sensor was directed (e.g., a camera)?

What if the sensor was directed (e.g., a camera)?

What if the sensor was directed (e.g., a camera)?

What if the sensor was directed (e.g., a camera)?

What if the sensor was directed (e.g., a camera)?

What if the sensor was directed (e.g., a camera)?

What if the sensor was directed (e.g., a camera)?

What if the sensor was directed (e.g., a camera)?

What if the sensor was directed (e.g., a camera)?

What if the sensor was directed (e.g., a camera)? At this point, it would have made sense to turn a bit more to see more of the unknown map

Another Example ? Locked? Key? Key?

Active Discovery of Hidden Intent No motion Perpendicular motion

Main Approaches to Partial Observability Ignore and react Doesn’t know what it doesn’t know Model and react Knows what it doesn’t know Model and predict Knows what it doesn’t know AND what it will/won’t know in the future Better decisions More components in implementation Harder computationally

Uncertainty models Model #1: Nondeterministic uncertainty f(x,u) -> a set of possible successors Model #2: Probabilistic uncertainty P(x’|x,u): a probability distribution over successors x’, given state x, control u Markov assumption

Nondeterministic Uncertainty : Reasoning with sets x’ = x + e,   [-a,a] t=0 t=1 -a a t=2 -2a 2a Belief State: x(t)  [-ta,ta]

Uncertainty with Sensing Plan = policy (mapping from states to actions) Policy achieves goal for every possible sensor result in belief state Observations should be chosen wisely to keep branching factor low Move 1 2 Sense 3 4 Outcomes Special case: fully observable state

Target Tracking target robot The robot must keep a target in its field of view The robot has a prior map of the obstacles But it does not know the target’s trajectory in advance

Target-Tracking Example robot Time is discretized into small steps of unit duration At each time step, each of the two agents moves by at most one increment along a single axis The two moves are simultaneous The robot senses the new position of the target at each step The target is not influenced by the robot (non-adversarial, non-cooperative target)

Time-Stamped States (no cycles possible) ([i,j], [u,v], t) ([i+1,j], [u,v], t+1) ([i+1,j], [u-1,v], t+1) ([i+1,j], [u+1,v], t+1) ([i+1,j], [u,v-1], t+1) ([i+1,j], [u,v+1], t+1) right State = (robot-position, target-position, time) In each state, the robot can execute 5 possible actions : {stop, up, down, right, left} Each action has 5 possible outcomes (one for each possible action of the target), with some probability distribution [Potential collisions are ignored for simplifying the presentation]

Rewards and Costs The robot must keep seeing the target as long as possible Each state where it does not see the target is terminal The reward collected in every non-terminal state is 1; it is 0 in each terminal state [ The sum of the rewards collected in an execution run is exactly the amount of time the robot sees the target] No cost for moving vs. not moving

Expanding the state/action tree ... horizon 1 horizon h

Assigning rewards Terminal states: states where the target is not visible Rewards: 1 in non-terminal states; 0 in others But how to estimate the utility of a leaf at horizon h? ... horizon 1 horizon h

Estimating the utility of a leaf target robot d ... Compute the shortest distance d for the target to escape the robot’s current field of view If the maximal velocity v of the target is known, estimate the utility of the state to d/v [conservative estimate] horizon 1 horizon h

Selecting the next action Compute the optimal policy over the state/action tree using estimated utilities at leaf nodes Execute only the first step of this policy Repeat everything again at t+1… (sliding horizon) ... horizon 1 horizon h

Pure Visual Servoing FOR EXAMPLE< ALWAYS KEEP TARGET AT A FIXED DISTANCE DIRECTLY IN FORNT OF OBSERVER NOT ENOUGH TO JUST USE PURE SERVOING KNOW TARGET’S NXT MOVE AND PLAN MOTION

Computing and Using a Policy ALSO ANALYSIS FOR TARGET MOTION AND PLAN MOTION

Next week More algorithm details

Final information Final presentations Final reports due 5/3 20 minutes + 10 minutes questions Final reports due 5/3 Must include a technical report Introduction Background Methods Results Conclusion May include auxiliary website with figures / examples / implementation details I will not review code