Probabilistic Planning Jim Blythe

Slide 2: Some ‘classical planning’ assumptions
- Atomic time
- All effects are immediate
- Deterministic effects
- Omniscience
- Sole agent of change is the planning agent
- Goals of attainment

Slide 3: Some ‘classical planning’ assumptions (dates mark the classes in which each assumption was relaxed)
- Atomic time (10/02/03)
- All effects are immediate (10/02/03)
- Deterministic effects
- Omniscience (10/28/03)
- Sole agent of change (10/16/03)
- Goals of attainment (11/13/03)

Slide 4: Sources of uncertainty
When we try to execute plans, uncertainty from several different sources can affect success. First, we might have uncertainty about the state of the world.

Slide 5: Sources of uncertainty
Actions we take might have uncertain effects, even when we know the world state.

Slide 6: Sources of uncertainty
External agents might be changing the world while we execute our plan.

Slide 7: Dealing with uncertainty: re-planning
- Make a plan assuming nothing bad will happen
- Monitor for problems during execution
- Build a new plan if a problem is found:
  - Either re-plan to the goal state
  - Or try to patch the existing plan
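The loop behind this strategy is easy to sketch. Below is a minimal illustration in Python; `make_plan`, `execute` and `detect_problem` are hypothetical hooks standing in for a planner, an executor and an execution monitor (none of this code is from the lecture):

```python
def run_with_replanning(make_plan, execute, detect_problem, state, goal):
    """Re-planning loop: plan optimistically, monitor, re-plan on trouble.

    make_plan(state, goal) -> list of steps (assumes nothing bad happens)
    execute(step, state)   -> new world state after taking the step
    detect_problem(state, plan, goal) -> True if the remaining plan
                                         no longer looks executable
    """
    plan = make_plan(state, goal)
    while plan:
        state = execute(plan.pop(0), state)
        if detect_problem(state, plan, goal):
            # Re-plan to the goal from the current state; a plan-repair
            # variant would instead try to patch the remaining steps.
            plan = make_plan(state, goal)
    return state
```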

Slide 8: Dealing with uncertainty: conditional planning
- Deal with contingencies (bad outcomes) at planning time, before they occur.
- Universal planning might be viewed as conditional planning where every possible contingency is covered (somehow) in the policy.

Slide 9: Tradeoffs in strategies for uncertainty
- My re-planner housemate: “Why are you taking an umbrella? It’s not raining!”
  - Can’t find plans that require steps taken before the contingency is discovered.
- My conditional-planner housemate: “Why are you leaving the house? Class may be cancelled. It might rain. You might have won the lottery. Was that an earthquake?...”
  - Impossible to plan for every contingency.
- We need a representation that captures these tradeoffs.

Slide 10: Probabilistic planning lets us explore the middle ground
- Partial knowledge about our uncertainties: different contingencies have different probabilities of occurring.
- Plan ahead for likely contingencies that may need steps taken before they occur.
- Use probability theory to judge plans that address some contingencies: seek a plan that is above some minimum probability of success.

Slide 11: Some issues to think about
- How do we figure out the probability of a plan succeeding? Is it expensive to do?
- How do we know what the most likely contingencies are?
- Can we distinguish bad outcomes (not holding the cup) from really bad outcomes (broken the cup, spilled the sulfuric acid...)?

Slide 12: More issues
- Why not just do all this with MDPs/POMDPs?
  - We’ll look at MDP-based approaches in a later class.
  - With MDPs, need to control potential state-space explosion.
  - Most approaches (of both kinds) use compact action representations.
  - MDP-based approaches can find optimal solutions and deal elegantly with costs.

Slide 13: Representing actions with uncertain outcomes

Slide 14: Reminder: the POP algorithm

POP((A, O, L), agenda, PossibleActions):
1. If the agenda is empty, return (A, O, L).
2. Pick a pair (Q, Ac) from the agenda.
3. Choose an action Ad (new, or already in A) that adds Q. If no such action exists, fail.
4. Add the causal link Ad --Q--> Ac to L and the ordering Ad < Ac to O. If Ad is new, add it to A.
5. Remove (Q, Ac) from the agenda. If Ad is new, for each of its preconditions P, add (P, Ad) to the agenda.
6. For every action At that threatens any causal link Ap --Q--> Ac: choose to add At < Ap (demotion) or Ac < At (promotion) to O. If neither choice is consistent, fail.
7. Recurse: POP((A, O, L), agenda, PossibleActions).
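To make the causal-link bookkeeping concrete, here is a minimal Python sketch of the plan data structures and the threat test (my own illustration, not code from the lecture; `Action`, `CausalLink` and `threatens` are assumed names):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    preconds: frozenset
    adds: frozenset
    deletes: frozenset

@dataclass(frozen=True)
class CausalLink:
    producer: Action   # Ad: the action chosen to achieve Q
    condition: str     # Q
    consumer: Action   # Ac: the action whose precondition Q supports

def threatens(at: Action, link: CausalLink) -> bool:
    """At threatens Ad --Q--> Ac if it deletes Q and could be ordered
    between Ad and Ac (the ordering-consistency check is omitted here)."""
    return at is not link.producer and at is not link.consumer \
        and link.condition in at.deletes

# POP resolves a threat by demotion (adding At < Ad to O) or promotion
# (adding Ac < At); if neither ordering is consistent, it backtracks.
```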

Slide 15: Buridan (an SNLP-based planner)
An SNLP-based planner might come up with the following plan for a deterministic action representation: [plan figure omitted from the transcript]

Slide 16: A plan that works 70% of the time... [plan figure omitted from the transcript]

Slide 17: Modifications to the UCPOP algorithm
- Allow more than one causal link for each condition in the plan.
- Confront a threat by decreasing the probability that it will happen, by adding conditions that negate the trigger of the threat.
- Terminate when sufficient probability is reached (the plan may still have threats).

Slide 18: Computing the probability of plan success 1: forward projection
- Simulate the plan, keeping track of the possible states and their probabilities; finally, sum the probabilities of the states that satisfy the goal.
- Here, the china is packed in the initial state with probability 0.5 (and is not packed with probability 0.5).
- What is the worst-case time complexity of this algorithm?
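As an illustration (my own sketch, not code from the lecture), forward projection threads a probability distribution over world states through the plan's actions. The `drive` action and its numbers below are invented so the example reproduces the 70% success figure from slide 16:

```python
def forward_project(initial_dist, plan, goal):
    """Push a distribution over states through a sequence of
    probabilistic actions, then sum the mass of goal states.

    initial_dist: dict mapping state (frozenset of literals) -> prob
    plan: list of actions; action(state) -> [(prob, next_state), ...]
    goal: predicate over states
    """
    dist = dict(initial_dist)
    for action in plan:
        new_dist = {}
        for state, p in dist.items():
            for q, nxt in action(state):
                new_dist[nxt] = new_dist.get(nxt, 0.0) + p * q
        dist = new_dist
    return sum(p for state, p in dist.items() if goal(state))

# Hypothetical action: driving keeps the china intact with certainty
# if it was packed, and with probability 0.4 otherwise.
def drive(state):
    if "packed" in state:
        return [(1.0, state | {"china-ok"})]
    return [(0.4, state | {"china-ok"}), (0.6, state)]

initial = {frozenset(): 0.5, frozenset({"packed"}): 0.5}
print(forward_project(initial, [drive], lambda s: "china-ok" in s))  # 0.7
```

On the slide's complexity question: each action with k outcomes can multiply the number of distinct states tracked, so after n steps the distribution can hold O(k^n) states; the worst case is exponential in plan length.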

Slide 19: Computing the probability of success 2: Bayes nets
- Compile the plan into a Bayesian network with two kinds of nodes: time-stamped literal nodes and action-outcome nodes. [network figure omitted from the transcript]
- What is the worst-case time complexity of this algorithm?

Slide 20: Tradeoffs in computing the probability of success
- The belief-net approach is often faster because it ignores irrelevant differences in the state.
- Neither approach is guaranteed to be faster.
- Often, the time to compute the probability of success dominates the planning time.

Slide 21: Conditional planning in this framework: CNLP and C-Buridan
- It is tricky to represent conditional branches in partially-ordered plans.
- Actions can produce “observation labels” as well as effects, e.g. “the weather is good”.
- After introducing an action with observation labels, the possible values can be used as “context labels” assigned to actions ordered after the observation step.
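As a rough illustration of labels and contexts (my own encoding, not C-Buridan's actual data structures; `Step` and `ObserveStep` are hypothetical names), using the drive-around-the-mountain example from the next slide:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Step:
    name: str
    # Context labels: this step executes only when observations made
    # earlier in the plan match these labels.
    context: Tuple[str, ...] = ()

@dataclass
class ObserveStep(Step):
    # Observation labels this step can produce at execution time.
    labels: Tuple[str, ...] = ()

# The observation step can report good or bad weather; the two later
# steps are ordered after it and carry opposite context labels, so
# only one of them runs on any execution branch.
observe = ObserveStep("observe-weather", labels=("weather-good", "weather-bad"))
fly = Step("fly-over-mountain", context=("weather-good",))
drive = Step("drive-around-mountain", context=("weather-bad",))
```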

Slide 22: Example: drive around the mountain [figure omitted from the transcript]

Slide 23: DRIPS (Decision-theoretic Refinement Planner)
- Considers plan utility, taking into account action costs and the benefits of different states.
- Searches for a plan with Maximum Expected Utility (MEU), not just one above a probability threshold.
- A skeletal planner: it uses ranges of utility of abstract plans in order to search efficiently.
- Prunes abstract plans whose utility range is completely below the range of some alternative (dominated plans).
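The pruning test is simple interval dominance; here is a small illustrative sketch (the plan names and utility ranges are invented, not taken from DRIPS):

```python
def prune_dominated(plans):
    """Keep only abstract plans that are not dominated: a plan is
    dominated if its entire utility range [lo, hi] lies below the
    lower bound of some alternative's range.

    plans: dict mapping plan name -> (utility_lo, utility_hi)
    """
    return {name: (lo, hi)
            for name, (lo, hi) in plans.items()
            if not any(hi < other_lo
                       for other, (other_lo, _) in plans.items()
                       if other != name)}

abstract_plans = {"pack-china-then-drive": (3.0, 8.0),
                  "drive-unpacked": (-2.0, 2.5)}
print(prune_dominated(abstract_plans))  # drive-unpacked is pruned
```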

Slide 24: Abstract action for moving china [figure omitted from the transcript]
Utility ranges are used to compute dominance between alternative plans.

Slide 25: Abstract plan [figure omitted from the transcript]

Slide 26: MAXPLAN
- Inspired by SATPLAN: compile the planning problem to an instance of E-MAJSAT.
- E-MAJSAT: given a boolean formula with variables that are either choice variables or chance variables, find an assignment to the choice variables that maximizes the probability that the formula is true.
- Choice variables: we can control them (e.g. which action to use).
- Chance variables: we cannot control them (e.g. the weather, the outcome of each action, ...).
- Then use a standard algorithm to compute and maximize the probability of success.
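To pin the definition down, here is a tiny brute-force E-MAJSAT solver (an illustrative sketch only; MAXPLAN itself uses a DPLL-based algorithm, as the next slides note, and the toy instance below is invented):

```python
from itertools import product

def emajsat(choice_vars, chance_vars, clauses):
    """Brute-force E-MAJSAT.

    choice_vars: list of variable names we control
    chance_vars: list of (name, prob_true) pairs we do not control
    clauses: CNF as a list of clauses; each clause is a list of
             (variable, polarity) literals
    Returns the choice assignment that maximizes the probability,
    over the chance variables, that every clause is satisfied.
    """
    best, best_p = None, -1.0
    for choice_bits in product([False, True], repeat=len(choice_vars)):
        choice = dict(zip(choice_vars, choice_bits))
        p_sat = 0.0
        for chance_bits in product([False, True], repeat=len(chance_vars)):
            assign = dict(choice)
            weight = 1.0
            for (var, p_true), bit in zip(chance_vars, chance_bits):
                assign[var] = bit
                weight *= p_true if bit else 1.0 - p_true
            if all(any(assign[v] == pol for v, pol in clause)
                   for clause in clauses):
                p_sat += weight
        if p_sat > best_p:
            best, best_p = choice, p_sat
    return best, best_p

# Toy instance: choosing to pack the china (choice variable) protects
# it against a bumpy road (chance variable): "packed OR NOT bumpy".
print(emajsat(["packed"], [("bumpy", 0.6)],
              [[("packed", True), ("bumpy", False)]]))
# -> ({'packed': True}, 1.0)
```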

Slide 27: Example operators with chance variables [figure omitted from the transcript]

Slide 28: Clauses with chance variables [figure omitted from the transcript]
Solved with the Davis-Putnam-Logemann-Loveland (DPLL) algorithm.

Slide 29: Thinking about MAXPLAN
- As it stands, does MAXPLAN build conditional plans?
- How could we make MAXPLAN build conditional plans?

Slide 30: Other approaches that have been used
- Graphplan (pointers to Weld and Smith’s work in the paper)
- Prodigy (more in the next class)
- HTN planning (Cypress)
- Markov decision problems (more in the class after next)

Slide 31: With all approaches, we must consider the same issues
- Tractability
  - Plans can have many possible outcomes.
  - How do we reason about when to add sensing?
- Plan utility
  - Is probability of success enough?
  - What measures of cost and benefit can be used tractably?
  - Can operator costs be summed? What difference do time-based utilities like deadlines make?
- Observability and conditional planning
  - Classical planning is “open-loop”, with no sensing.
  - A “policy” assumes we can observe everything.
  - Can we model limited observability, noisy sensors, bias, ...?