Continuous Time and Resource Uncertainty CSE 574 Lecture Spring ’03 Stefan B. Sigurdsson.


(Big Mars Rover Picture)

Lecture Overview
Context
– Classical planning
– The Mars Rover domain
– Relaxing the assumptions
– Q: What's so different?
Innovation
Discussion

(Shakey Picture) Slide shamelessly lifted from

STRIPS-Like Planning
World description
– Propositional logic
– Closed world assumption
– Finite and static
– Complete knowledge
– Discrete time
– No exogenous effects
Goal description
– Attainment – "win or lose"
– Conjunctions of positive literals
Actions
– STRIPS operators: conjunctive precondition, conjunctive effect (add/delete)
– Instantaneous
– Sequential
– Deterministic
→ Plan…

(Big Mars Rover Picture)

The Mars Rover Domain
Robot control, with…
– Positioning and navigation
– Complex choices (goals and actions)
– Rich utility model
– Continuous time and concurrency
– Uncertain resource consumption
– Metric quantities
– Very high stakes!
But alone in a finite, static universe

Resources? Metric Quantities? What Are Those?
Various flavors:
– Exclusive (camera arm)
– Shared (OS scheduling)
– Metric quantity (fuel, power, disk space)
Uncertainty

Alright, What's It Really Mean?

Is This Really A Planning Problem?
Better suited to OR/DT-type scheduling?
– Time, resources, metric quantities, concurrency, complicated goals/rewards…
Complex, inter-dependent activities
– Select, calibrate, use, reuse, recalibrate sensors
– OR-type scheduling can't handle rich choices
Insight: Maybe we can borrow some tricks?

Can Planners Scale Up?
Large plans
– Sequences of ~100 actions
Where do we start?
– POP? (Branch factors are too big)
– MDP? (Complete policy is too large)
– Graph/SATplan? (Discrete representations)

Which Extensions First?
Metric quantities
– Time
– Resources
Resource uncertainty
Concurrency
→ What about non-determinism?
→ Reasonable for Graphplan?

A (Very Incomplete) Research Timeline
1971 STRIPS (Fikes/Nilsson)
1989 ADL (Pednault)
1991 PEDESTAL (McDermott)
1992 UCPOP (Penberthy/Weld); SENSp (Etzioni et al.); CNLP (Peot/Smith)
1993 Buridan (Kushmerick et al.)
1994 C-Buridan (Draper et al.); JIC Scheduling (Drummond et al.); HSTS (Muscettola); Zeno (Penberthy/Weld); Softbots (Weld/Etzioni); MDP (Williamson/Hanks)
1995 DRIPS (Haddawy et al.); IxTeT (Laborie/Ghallab)
1997 IPP (Koehler et al.)
1998 PGraphplan (Blum/Langford); Weaver (Blythe); PUCCINI (Golden); CGP (Smith/Weld); SGP (Weld et al.)
1999 Mahinur (Onder/Pollack); ILP-PLAN (Kautz/Walser); TGP (Smith/Weld); LPSAT (Wolfman/Weld)
2000 T-MDP (Boyan/Littman); HSTS/RA (Jónsson et al.)
Since then?
(Slide margin labels: not implemented, ADL implemented, sensing, conformant, contingent, planning + scheduling, metric time/resources, safe planning, decision-theoretic goals, uncertain utility, shared resources, uncertain/dynamic, resources)

Domain Assumptions
Dimensions: expressive logic, non-determinism, observation, goal model, plan utility, durative actions, complex concurrency, continuous time, metric quantities, branching factor, resource uncertainty, resource constraints, goal selection, safe planning, exogenous events
Systems, from classical to bleeding edge: STRIPS, UCPOP, CGP, CNLP, SENSp, Buridan, Weaver, C-Buridan, MDP, PO-MDP, S-MDP, T-MDP, F-MDP, LPSAT, Mars Rover
→ Select contingencies? Serialized goals?

Brain-teaser: Domain Spec
State space S
– Cartesian product of continuous and discrete axes (time, position, achievements, energy…)
Initial state s_i
– Probability distribution
Domain theory
– Concurrent, non-deterministic, uncertain
→ What else? (S, s_i, …)
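The slide's tuple might be fleshed out as follows; every name in this sketch is an assumption for illustration, not drawn from any paper:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)
class State:
    time: float              # continuous axis
    energy: float            # continuous resource axis
    achievements: frozenset  # discrete axis: goals attained so far

@dataclass
class Action:
    name: str
    # Non-deterministic, uncertain outcomes: sampling the same action
    # from the same state may yield different successor states.
    sample_outcome: Callable[[State], State]

@dataclass
class Domain:
    # The initial state s_i is itself a probability distribution,
    # represented here as a sampler.
    sample_initial_state: Callable[[], State]
    actions: List[Action]
```

Representing both s_i and action outcomes as samplers keeps the continuous distributions opaque, which matches the Monte Carlo flavor of the approach discussed later.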

Brain-teaser: Kalman Filters
Curiously missing from the paper we read (?)
1983 Kalman filters paper: Voyager enters Jupiter orbit through a 30-second window after 11 years in space
Hugh Durrant-Whyte's robots
Why not for the Mars Rover?
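For context, the predict/update cycle the slide alludes to is tiny in the one-dimensional case; the noise variances below are illustrative values, not rover parameters:

```python
def kalman_step(x, P, u, z, Q=0.1, R=0.5):
    """One predict/update cycle of a 1-D Kalman filter.

    x, P : current state estimate and its variance
    u    : control input (commanded motion)
    z    : noisy measurement of the new state
    Q, R : process and measurement noise variances (illustrative values)
    """
    # Predict: apply the motion model; variance grows by process noise.
    x_pred = x + u
    P_pred = P + Q
    # Update: blend prediction and measurement by the Kalman gain.
    K = P_pred / (P_pred + R)
    x_new = x_pred + K * (z - x_pred)
    P_new = (1 - K) * P_pred
    return x_new, P_new
```

Each update shrinks the variance, which is exactly the kind of uncertainty tracking a resource-uncertain planner might borrow.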

Context Summary
Complex, exciting domain
Pushes the planning envelope
– Expression
– Scaling
→ Where do we start?

Lecture Overview
Context
Innovation
– Just-in-case planning
– Incremental contingency planning
Discussion

Just-In-Case Planning
Motivated by domain characteristics
– Metric quantities
– Large branch factors
Implications
– Neither a single plan nor a full policy: an incrementally expanded plan
→ What about concurrency?

Branch Heuristics
– Most probable failure point (scheduling)
– Highest-utility branch point (planning)
→ What is the intrinsic difference?
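The contrast between the two heuristics can be shown with a toy example; the fields and numbers below are made up:

```python
# Each candidate branch point carries an estimated failure probability
# and an estimated utility gain from adding a contingency there.
def most_probable_failure(points):
    """Scheduling view: branch where things are most likely to go wrong."""
    return max(points, key=lambda p: p["p_fail"])

def highest_utility_branch(points):
    """Planning view: branch where a contingency buys the most utility."""
    return max(points, key=lambda p: p["utility_gain"])
```

The two criteria can disagree: a likely failure may be cheap to absorb, while a rare failure at a high-stakes step may be worth a contingency.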

When To Execute A Contingency?

Incremental Contingency Planning Algorithm
Input: domain description and master plan
Output: highest-utility branch point
Algorithm:
– Compute value and estimate resources along the master plan
– Approximate branch point utilities
– Select the highest-utility branch point
– Solve with new initial and goal conditions
– Repeat while necessary
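The loop above can be sketched as follows; the plan representation and the four callables (candidate branch points, branch-utility approximation, the underlying planner, and plan splicing) are placeholders standing in for the paper's machinery, not its actual interfaces:

```python
def incremental_contingency_planning(plan, branch_points, branch_utility,
                                     plan_contingency, add_branch,
                                     max_branches=3):
    """Greedily splice the highest-utility contingency into `plan`."""
    for _ in range(max_branches):
        # Approximate the utility gain of branching at each candidate point.
        candidates = [(branch_utility(plan, pt), pt)
                      for pt in branch_points(plan)]
        if not candidates:
            break
        gain, point = max(candidates, key=lambda c: c[0])
        if gain <= 0:          # repeat only while a branch still helps
            break
        # Solve a new planning problem from the branch point's conditions,
        # then splice the contingency into the plan and iterate.
        plan = add_branch(plan, point, plan_contingency(plan, point))
    return plan
```

Note the greedy structure: one branch is committed per iteration, which is exactly what the later "Incremental Contingencies…" slides criticize.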

Branch Utility Approximation
…without constructing a plan:
– Construct a plan graph
– Back-propagate utility functions through the plan graph, instead of regression searching
– Compute branch point utilities throughout the input plan
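One way to picture the back-propagation step: regressing a time-indexed utility function through an action with uncertain duration amounts to taking an expectation over that duration. A minimal sketch, with a discrete duration distribution assumed purely for illustration:

```python
def backprop_through_action(U, durations):
    """Back-propagate a utility function through one action.

    U         : maps the action's completion time to downstream utility
    durations : discrete duration distribution [(duration, probability), ...]
    Returns U'(t), the expected utility of *starting* the action at time t.
    """
    def U_start(t):
        return sum(p * U(t + d) for d, p in durations)
    return U_start
```

For example, with a hard deadline (full utility before it, zero after), the propagated function degrades gradually as the start time approaches the deadline, reflecting the chance that a long duration overshoots it.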

Back-Propagating Distributions Mausam: “Some parts of the paper are tersely written, which make it a little harder to understand. I got quite confused in the discussion of utility propagation. It would have been nicer had they given some theorems about the soundness of their method.” Well, me too

Back-Propagating Distributions A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ 1 5 Back-Propagating Distributions

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ 1 5 Back-Propagating Distributions 5 15

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ Back-Propagating Distributions

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ Back-Propagating Distributions

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r 1 2 t Back-Propagating Distributions

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t Back-Propagating Distributions

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t t Back-Propagating Distributions 5 25

15 A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t t + Back-Propagating Distributions 1 5 t 25 6

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t 1 2 Back-Propagating Distributions t 1 5 t t

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t 1 2 Back-Propagating Distributions t 1 5 t t r +

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t 1 2 Back-Propagating Distributions t t t

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t 1 2 Back-Propagating Distributions t 1 5 t t

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t 1 2 Back-Propagating Distributions t 1 5 t t

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t 1 2 Back-Propagating Distributions t 1 5 t t

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t 1 2 Back-Propagating Distributions t 1 5 t t

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t 1 2 Back-Propagating Distributions t 1 5 t t (CDE)

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t 1 2 Back-Propagating Distributions t 1 5 t t (CDE)

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t 1 2 Back-Propagating Distributions 5 t t [(CDE) (ABDE)] [(DCE) (AB) (DABE)]

A C D B E (1, 5) (3, 3) (10, 15) (2, 2) p s q r t g g’ r r 1 2 t 1 2 Back-Propagating Distributions 5 t t (CDE, ABDE) (DCE, AB, DABE) 5

Utility Estimation
(Figure residue; recoverable content:) Combine the branch utility functions over p and s with a MAX operator: e.g. the branches (CDE, ABDE) and (DCE, AB, DABE) combine into (DCE, ABDE). (Then combine with Monte Carlo results.)
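The MAX operator can be made concrete with a tiny sketch; the utility functions and resource levels below are invented for illustration:

```python
def combine_branches(branches, resource_levels):
    """Pointwise MAX over branch utility functions.

    branches        : {branch_name: utility function of resource level}
    resource_levels : sample points at which to evaluate the combination
    Returns [(resource_level, best_branch, best_utility), ...].
    """
    table = []
    for r in resource_levels:
        utility, name = max((u(r), name) for name, u in branches.items())
        table.append((r, name, utility))
    return table
```

The combined function records not only the best achievable utility at each resource level but also which branch achieves it, which is what branch-point selection needs.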

Lecture Overview
Context
Innovation
Discussion
– Q: Evaluation? Inference?

Evaluation Optimal branch selection? (Greedy…)

Incremental Contingencies…
Sometimes adding one contingency at a time is non-optimal
→ Examples?

Incremental Contingencies…
(Example figure: a branch on Rain vs. Shine among the activities Work, Go climbing, and Exercise.) Sometimes adding one contingency at a time is non-optimal.

Evaluation Optimal branch selection? What else?

Inference
Where can we take these ideas? What can we add to them?
– Optimal branch selection
– Optimistic branching
– Mutexes in the plan graph
– Noisy/costly sensors

Review – Alex Yates “I don't quite understand why this planning problem is called contingent planning, since they assume full observability. That conflicts with the previous notion of contingent planning that we've seen. The reason for branches in these plans aren't because there is uncertainty in what state the planner is in, but because it's impossible to enumerate all of the different outcomes. Also, the paper never discusses probabilistic effects for state transitions other than the ones effecting the continuous attributes. I wasn't sure if this meant that all actions were deterministic except in their resource consumption, or if it meant that any attributes which might have probabilistic transitions would need to be treated as resources.”