Metric/Temporal Planning. Metric Temporal Planning  MTP Adds time and resources to planning Special cases: TP: Temporal planning RP: Resource Planning.


1 Metric/Temporal Planning

2 Metric Temporal Planning  MTP Adds time and resources to planning Special cases: TP: Temporal planning RP: Resource Planning

3 Issues with Time  Changes brought by the introduction of time into Planning can be grouped into two categories  Changes brought by having a metric (clock) time  I.e., there is a clock with respect to which we can specify events  Changes brought by durations to actions  I.e., actions are not instantaneous  Without metric time, a plan has just a beginning and ending point. Metric time allows us to talk about all time points (and intervals) during the execution of the plan. Changes brought by metric time include  Exogenous events  Special case: Timed initial literals (in the initial state we can state that some fluent becomes true at a specific time in the future during execution)  Deadline goals  We can state that different goals need to be made true by different deadline times (instead of all goals being true at the end)  Durative goals  We can state that a certain fluent must have a specific value over an entire interval

4 Issues with Time (contd.)  Durations of actions may be static or “dynamic”  duration depends on the context, e.g. the time to fill your gas tank depends on how empty the tank is to begin with  Advanced issues: Uncertain durations…  With instantaneous actions, an action has just “before” and “after”: preconditions must hold “before” and effects will hold “after”. Durative actions have “before” and “after” as well as “during”. With metric time (i.e., an external clock), we can refer to all these points. We can now ask:  When are preconditions needed?  Are they needed at a single point or over a duration?  When are effects given? Are they point effects or “durative” effects (which are guaranteed over a certain duration)?  Note that because actions have durations, they can have multiple effects on a single fluent at different times  E.g. the action can make fluent P true at start, false after 10 sec, true again after another 10 sec, etc.  A default assumption is that all preconditions are needed at the beginning and must hold during the action’s entire duration, and that all effects will be available at the end of the action  E.g. consider the “Grading homeworks” action: when are the homeworks needed? When are the grades available? What does your teacher tell you?

5 Issues with Time (contd.)  Durative actions bring more pointed meaning to “concurrency”.  Concurrency is not just a luxury (to reduce makespan), but is often a necessity (e.g. burn a match, and cross the dark corridor while the match is burning)  Suppose I tell you that a plan P contains actions A1…A10, with durations d1…d10; then what is the makespan (execution duration) of P?  Makespan(P) >= max(d1…d10)  If Makespan(P) = Sum(d1…d10), then it is a strictly serial plan  If Makespan(P) > Sum(d1…d10), then there is idle time in the plan  If Makespan(P) < Sum(d1…d10), then there is concurrency  Actions don’t need to start right after the preceding action  Think of the bank teller gossiping with his colleague in between servicing each customer  Planned idle/slack time may not always be a bad thing: it can sometimes improve the robustness of the plan  Think of three travel plans involving connections in Minneapolis: Plan 1 schedules 5 min for connection time; plan 2 schedules 1 hour; plan 3 schedules 2 days. Which one is better (all else being equal)?
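
The makespan comparison on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `classify_plan` and the representation of a plan as parallel lists of start times and durations are assumptions, not part of the slides.

```python
def classify_plan(start_times, durations, eps=1e-9):
    """Compare a plan's makespan with the sum of its action durations.

    Hypothetical helper: start_times[i] and durations[i] describe action i.
    Returns (makespan, label) following the slide's three cases.
    """
    end_times = [s + d for s, d in zip(start_times, durations)]
    makespan = max(end_times) - min(start_times)
    total = sum(durations)
    if abs(makespan - total) < eps:
        label = "serial"        # Makespan(P) = Sum(d1..dn)
    elif makespan > total:
        label = "idle time"     # Makespan(P) > Sum(d1..dn)
    else:
        label = "concurrent"    # Makespan(P) < Sum(d1..dn)
    return makespan, label


# Burn a match (15) and cross the cellar (10), both starting at time 0.
print(classify_plan([0, 0], [15, 10]))
```

In every case the makespan is at least the longest single duration, which matches the lower bound Makespan(P) >= max(d1…d10).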

6 Issues with Resources (continuous quantities)  Resources: Actions may consume/produce (continuous quantity) “resources”  The main consequence is that we have numeric state variables, instead of just boolean (or multi-valued) ones (multi-valued does not mean numeric: a variable can take red, blue, green as values).  Actions can update a numeric state variable (whereas they just assign a non-numeric one)  Resource qty after action := Some-function-of(Resource qty before action, action parameters)  Updates can be linear OR non-linear  When combined with durative actions, updates can be discrete (i.e., happen all at once at the end of the action) OR continuous (i.e., happen at some given rate during the action)  Planning issues: How to efficiently reason with continuous quantities during planning
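
As a small illustration of the difference between discrete and continuous (linear) updates of a numeric state variable, consider the sketch below. The function names and the fuel numbers are made up for the example.

```python
def discrete_update(qty, delta):
    """Apply the whole change at once (e.g. at the end of the action)."""
    return qty + delta


def continuous_update(qty, rate, elapsed):
    """Apply a linear change at a given rate for `elapsed` time units."""
    return qty + rate * elapsed


fuel = 100.0
# An assumed 10-unit-long action that burns 20 units of fuel in total.
print(discrete_update(fuel, -20.0))          # value visible only at the end
print(continuous_update(fuel, -2.0, 5.0))    # value halfway through the action
```

Both models agree at the end of the action; they differ in what other actions can observe while it is executing.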

7 PDDL 2.1 Standard: Summary  Durations  Static and dynamic durations allowed  Also allows duration inequalities  Preconditions  Can be “at start” or “over all” (throughout the duration)  Doesn’t model preconditions being needed for arbitrary durations in the middle  Effects  Can be “at start” or “at end”  This makes effects “discrete”  Numeric quantities  Can be present in the preconditions or effects  Presence in the effects can be “discrete” (“at start”/”at end”) or continuous  Continuous change specified by giving a “rate” at which the quantity changes  Non-linear rate harder

8 PDDL 2.1 (Level 2) Pure Durative Actions
(:durative-action burn_match
  :parameters ()
  :duration (= ?duration 15)
  :condition (and (at start have_match) (at start have_strikepad))
  :effect (and (at start have_light) (at end (not have_light))))
(:durative-action cross_cellar
  :parameters ()
  :duration (= ?duration 10)
  :condition (and (at start have_light) (over all have_light) (at start at_steps))
  :effect (and (at start (not at_steps)) (at start crossing) (at end at_fuse_box)))
[Figure: BURN_MATCH (dur: 15) needs have_match and have_strikepad, gives have_light at start and ~have_light at end. CROSS_CELLAR (dur: 10) needs have_light and at_steps, gives ~at_steps and crossing at start and at_fuse_box at end.]

9 PDDL 2.1 Level 3: Durative actions and numeric quantities (but discrete effects)  The entire energy to be consumed is “encumbered” at the very beginning (even though it gets consumed slowly over the full duration).

10 PDDL 2.1 Level 4: Durative actions and numeric quantities (with continuous effects)

11 Issues in modeling continuous change by discrete vs. continuous effects  Consider the action of boiling a pan of water  The quantity “temperature of water” changes continuously over the duration of the action  We can ignore continuous effects by specifying that the temperature is 100 °C at the end  Easy to handle; can only access the temperature at the end of the action; reduces concurrency (what if we also put a blow torch to the pan to “hasten” the process?)  Or we can specify that the temperature of the water rises at a linear rate until it becomes 100 °C  Harder to handle; but allows more concurrency (the total rate of increase is the sum of all the individual rates of increase)
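
To make the "rates add up" point concrete, here is a minimal sketch. The helper `time_to_reach` and the heating rates (2 and 6 degrees per minute) are illustrative assumptions, not values from the slides.

```python
def time_to_reach(target, start, rates):
    """Time for a linearly changing quantity to go from `start` to `target`
    when several concurrent actions each contribute a constant rate."""
    total_rate = sum(rates)  # concurrent continuous effects are summed
    if total_rate <= 0:
        raise ValueError("the quantity never reaches the target")
    return (target - start) / total_rate


# Stove alone, then stove plus blow torch, heating water from 20 to 100 degrees C.
print(time_to_reach(100.0, 20.0, [2.0]))
print(time_to_reach(100.0, 20.0, [2.0, 6.0]))
```

With a discrete "temperature is 100 at the end" effect, the second heat source could not shorten the boiling action; with rate-based effects it can.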

12 Compiling durative actions into instantaneous ones  A durative action A that has only at-start, at-end and over-all conditions can be modeled in terms of two coupled instantaneous actions A_s and A_e  A_s gets all the at-start conditions and effects  A_e gets all the at-end conditions and effects  An “invariant” (think of it as an Interval Preservation Constraint) holds from A_s to A_e for all the “over all” preconditions  [Figure: action A with start conditions p_s, over-all conditions p_o, end conditions p_e, start effects +e_s and end effects +e_e, shown split into A_s (p_s, +e_s) and A_e (p_e, +e_e) with p_o protected in between.]
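
The compilation described on this slide can be sketched as follows. The class names `Durative` and `Instant`, the `~p` notation for negative effects, and the tuple used for the invariant are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Durative:
    name: str
    duration: float
    at_start: frozenset   # conditions needed at the start point
    over_all: frozenset   # conditions that must hold throughout
    at_end: frozenset     # conditions needed at the end point
    eff_start: frozenset  # effects that occur at the start
    eff_end: frozenset    # effects that occur at the end


@dataclass(frozen=True)
class Instant:
    name: str
    pre: frozenset
    eff: frozenset


def compile_durative(a):
    """Split a durative action into A_s, A_e and an invariant linking them."""
    a_s = Instant(a.name + "_start", a.at_start, a.eff_start)
    a_e = Instant(a.name + "_end", a.at_end, a.eff_end)
    # Invariant: over-all conditions are protected from A_s to A_e,
    # and A_e must occur exactly `duration` after A_s.
    invariant = (a_s.name, a_e.name, a.over_all, a.duration)
    return a_s, a_e, invariant


cross = Durative(
    "cross_cellar", 10,
    at_start=frozenset({"have_light", "at_steps"}),
    over_all=frozenset({"have_light"}),
    at_end=frozenset(),
    eff_start=frozenset({"~at_steps", "crossing"}),
    eff_end=frozenset({"at_fuse_box"}),
)
a_s, a_e, inv = compile_durative(cross)
print(a_s, a_e, inv, sep="\n")
```

The two instantaneous actions are "coupled": a planner using this encoding still has to enforce the invariant and the fixed temporal distance between them.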

13 Plan representation  An executable plan must provide -- the actions that need to be executed -- the start times for each of the actions  Or a set of simple temporal constraints on the set of actions (S.T.C.s are a generalization of partial orders)  E.g. A1 --[4,5]--> A2 (means 4 <= ST(A2) - ST(A1) <= 5)  Plan views: PERT and Gantt charts  The Gantt chart is what is shown on the right; PERT shows the causal links  [Figure: plan with actions A1, A2, A3; labels Drive(cityA,cityB), Q, At(truck,B).]
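
A simple temporal constraint such as A1 --[4,5]--> A2 bounds the difference between two start times. The sketch below checks a set of concrete start times against such constraints; the function name and the tuple layout `(a, b, lo, hi)` are assumptions for illustration.

```python
def satisfies_stc(start_times, constraints):
    """Check lo <= ST(b) - ST(a) <= hi for every constraint (a, b, lo, hi)."""
    return all(
        lo <= start_times[b] - start_times[a] <= hi
        for a, b, lo, hi in constraints
    )


constraints = [("A1", "A2", 4, 5)]
print(satisfies_stc({"A1": 0, "A2": 4.5}, constraints))
print(satisfies_stc({"A1": 0, "A2": 6}, constraints))
```

A plain ordering constraint "A1 before A2" is the special case with lo = 0 and hi = infinity, which is why simple temporal constraints generalize partial orders.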

14

15 Problem Representation  Achievement goals are specified as a list of pairs (p_i, t_i), where p_i needs to hold by time t_i  t_i is the deadline by which the goal must hold. It can be metric time (e.g. make clear(b) true by 2pm).  If t_i is omitted we will assume that the goal is a non-deadline goal (it must be true by the time the plan is done).  “Persist goals” are specified as a condition and an interval over which it must hold  A persist goal may be supported by different actions for the different parts of the duration (“goal reduction” a la ZENO)  E.g. striking multiple matches to have light over a duration

16 Plan Quality Measures  Makespan: Clock time for the execution of the plan (more concurrency => lower makespan)  Slack: The difference between the deadline for a goal and the time by which the plan achieves it  Tardiness is negative slack  Optimize max/min/average slack/tardiness measures  Cost: Sum of costs of all the actions  Can be split into multiple dimensions, one corresponding to each resource  [Figure: plan with actions A1, A2, A3; labels Drive(cityA,cityB), Q, At(truck,B).]
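
The slack and tardiness measures can be computed directly from deadlines and achievement times. In the sketch below, the function names and the sample goals are illustrative; clamping tardiness at zero is an assumption borrowed from common scheduling usage, since the slide only says that tardiness is negative slack.

```python
def slack(deadline, achieved_at):
    """Deadline minus the time at which the plan achieves the goal."""
    return deadline - achieved_at


def summarize_slack(goals):
    """goals: list of (deadline, achieved_at) pairs."""
    slacks = [slack(d, a) for d, a in goals]
    return {
        "min_slack": min(slacks),
        "max_slack": max(slacks),
        "avg_slack": sum(slacks) / len(slacks),
        # Assumption: tardiness is reported as max(0, -slack).
        "max_tardiness": max(max(0, -s) for s in slacks),
    }


# One goal met 2 time units early, one met 5 time units late.
print(summarize_slack([(10, 8), (20, 25)]))
```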

17 Concurrency  Two actions are concurrent if their execution durations overlap in time  A plan is concurrent if it has concurrently executing actions  If the makespan of a plan is less than the sum of the durations of the actions in the plan, then the plan has concurrency  A problem requires concurrency if every solution plan for the problem is concurrent  Note that a problem may have sequential solutions, but for optimality reasons we may still have to go for concurrent solutions  A domain requires concurrency if any of its problems requires concurrency  One distinguishing feature of temporal planning domains is that they may have problems that require concurrency.  Interesting factoid: Several of the planners that won the temporal planning competition could not actually solve problems requiring concurrency  Another interesting factoid: Most of the benchmark domains actually didn’t have problems that required concurrency [Cushing et al. IJCAI-07; ICAPS-07]
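
The first two definitions on this slide translate into a short interval-overlap test. The representation of a plan as `(start, duration)` pairs and the function names are assumptions for this sketch.

```python
from itertools import combinations


def overlaps(a, b):
    """True if two (start, duration) actions overlap in time.
    Actions that merely touch at an end point are not counted as overlapping."""
    a_start, a_dur = a
    b_start, b_dur = b
    return a_start < b_start + b_dur and b_start < a_start + a_dur


def is_concurrent(plan):
    """A plan is concurrent if some pair of its actions overlaps."""
    return any(overlaps(a, b) for a, b in combinations(plan, 2))


print(is_concurrent([(0, 15), (0, 10)]))  # burn match while crossing the cellar
print(is_concurrent([(0, 5), (5, 5)]))    # strictly back-to-back
```

Whether a problem requires concurrency is a property of all its solutions, so it cannot be decided by inspecting a single plan this way.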

18 Looking at STRIPS Actions from PDDL2.1 Vantage Point  How best to view non-durative actions?  Instantaneous  Makes it hard to provide physical semantics (no change is instantaneous)  epsilon duration with only Overall preconditions and At-end post-conditions  We can show that domains with this type of actions can never have problems that require concurrency

19 TGP-style durative actions  A PDDL-2.1 action is a TGP-style durative action if  All preconditions are “Overall” preconditions  All effects are “at-end” effects  It can be shown that domains in which all actions are TGP-style will not require concurrency  Concurrency may still be needed for makespan optimization

20 Temporal Gap  A PDDL-2.1 style action is said to have temporal gap if there is no single time-point in the action where all the preconditions and effects of the actions must hold  Epsilon duration STRIPS actions have no temporal gap  TGP-style actions have no temporal gap  All the preconditions and effects must hold together at the end point of the action  If none of the actions in a domain have temporal gap, then that domain cannot have problems with required concurrency  “Duration” is like a cost measure

21 Add…  The issue of time—dense vs. integer  Rintanen’s complexity issue—R.C. with the same action..  Non-RC plans can be compiled 1-1  A huge modeling jump

22 Ended here..

23 Some Brand Names  Planners that can handle similar types of temporal and resource constraints:  TLPlan, HSTS, IxTeT, Zeno, SAPA, LPG  TLPlan, SAPA are progression-based planners  HSTS, IxTeT, Zeno are partial-order-based planners  TLPlan, HSTS are domain-customized planners; the rest are domain independent  Planners that can handle a subset of constraints:  Only temporal: TGP, TPG, LPGP  Only resources: LPSAT, GRT-R, Kautz-Walser, Metric-FF  Subset of temporal and resource constraints: TP4, Resource-IPP  LPGP and LPSAT are “loosely-coupled” systems. LPSAT connects SAT and LP solvers; LPGP connects Graphplan and an LP solver  The issue is how “tight” the loose connection is.  TGP, TPG, LPGP are Graphplan-based  LPSAT is based on SAT encodings being sent to LP solvers  Kautz-Walser is based solely on LP encodings

24 State of the Art (as of IPC2002) (revised for IPC 2004)  At IPC 2002; PDDL 2.1 standard had three levels  Level 1: STRIPS/ADL  Level 2: +Durative Actions  FF, LPG, SAPA, SGPlan (extends LPG)  Level 3: +Numeric quantities discrete change  Sapa, LPG, SGPlan (extends LPG)  Level 4: +Continuous change  None at IPC  Some planners can handle it “in theory” but none are scalable

25 Approaches for MTP  In theory, pretty much every one of the approaches we saw for classical planning can be (and have been) extended to MTP (with varying degrees of scalability)  There are some interesting tradeoffs  PO planners are easiest to extend to support the concurrency needed for durative actions  Have harder time handling resources (because resource consumption depends on exactly what actions occurred before this time point)  Progression planners easiest to extend to support resource consuming actions  But harder time handling concurrency (need to consider “advancing clock” as a separate option in addition to applying one of the actions)

26 Our Road Map  Will focus on conjunctive planning approaches— with special attention to Sapa  action models  Using PDDL2.1 standard  how to model the search  Progression; Regression; PO planning  how to extract good heuristics Done

27 Action Representation
Example action: Flying
-- Preconditions: (in-city ?airplane ?city1), (fuel ?airplane) > 0
-- Effects: ¬(in-city ?airplane ?city1), (in-city ?airplane ?city2), consume (fuel ?airplane)
Durative, with E_A = S_A + D_A
Instantaneous effects e at time t_e = S_A + d, 0 ≤ d ≤ D_A
Preconditions need to be true at the starting point, and protected during a period of time d, 0 ≤ d ≤ D_A
An action can consume or produce a continuous amount of some resource
Action conflicts:
-- Consuming the same resource
-- One action’s effect conflicting with another’s precondition or effect

28 Search Methods and Heuristics  Progression: Sapa (TLPlan; FF)  Regression: TP4  Partial order: Zeno (IxTET)

29 Reading List  (3/27) Papers on Metric Temporal Planning  Paper on PDDL-2.1 standard (read up to, but not including, section 6)  Paper on SAPA  Paper on Temporal TLPlan (see Section 3 for a slightly longer description of the progression search used in SAPA)  Paper on TP4 (regression search for Temporal Planning)  Paper on Zeno (plan-space search for Temporal Planning)

30 Digression: Concurrent vs. Parallel plans  The main difference with temporal planning is that we need to produce concurrent plans  In the context of classical planning, concurrent plans are akin to parallel plans (a la Graphplan)  This analogy is not complete of course. For every solvable problem in classical planning, there is guaranteed to be a sequential plan. This guarantee does not hold for temporal planning (which means we have to search in the space of concurrent plans)  Progression planners that we have seen until now produce sequential plans (FF does not produce parallel plans!)  FF is still complete because in classical planning, there is always a sequential plan for every solvable problem  So, we can start by asking what we need to do to make progression produce parallel plans.

31 Digression: How to produce parallel plans with progression?  The naïve idea is to project over subsets of non-interfering actions (rather than single actions).  Problem: Exponential branching factor  A better idea: Consider “fattening” as well as “lengthening” the current partial plan as two options.  We start by representing the state of a partial plan prefix as [S, {A1…Ak}] where S is the current state, and {A1…Ak} are the mutually non-interfering actions that we have already committed to applying at S.  Notice that this is just a generalization of the normal progression state, in which the action set {A1…Ak} will be a singleton  Given a state [S, {A1…Ak}] to expand, we have two (backtrackable) choices:  Fatten: Consider applying another action B in state S [One branch for each possible action B]  For this to be feasible, B should be applicable in S and B should not interfere with A1…Ak. The resulting state will be [S, {A1…Ak, B}]  Lengthen: Consider applying an action C in the state S’ which is obtained by applying actions {A1…Ak} in S [One branch for each applicable action]  For this to be feasible, C should be applicable in S’. The resulting state is [S’, {C}]  Notice that  Fattening is only done at the current state (once lengthening is done, the current state changes, so any new fattening will be done at the new state).  Normal progression always selects “Lengthen”. The addition needed to support parallel plans is the “Fatten” branch.
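
The fatten/lengthen expansion can be sketched for STRIPS-style actions as below. The `Action` class, the interference test (one action deleting something the other needs or adds), and the toy actions a, b, c are assumptions made for illustration; the slides do not fix these details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset


def interferes(a, b):
    """Assumed interference test: one action deletes what the other needs or adds."""
    return bool(a.delete & (b.pre | b.add)) or bool(b.delete & (a.pre | a.add))


def apply_all(state, actions):
    """Apply a set of mutually non-interfering actions to a state."""
    for a in actions:
        state = (state - a.delete) | a.add
    return state


def successors(state, committed, actions):
    """Expand the search node [state, committed] by fattening and lengthening."""
    result = []
    # Fatten: commit to one more non-interfering action B at the same state S.
    for b in actions:
        if (b not in committed and b.pre <= state
                and not any(interferes(b, a) for a in committed)):
            result.append(("fatten", state, committed | {b}))
    # Lengthen: apply the committed set, then start a new set {C} at S'.
    next_state = apply_all(state, committed)
    for c in actions:
        if c.pre <= next_state:
            result.append(("lengthen", next_state, frozenset({c})))
    return result


none = frozenset()
a = Action("a", frozenset({"p"}), frozenset({"q"}), none)
b = Action("b", frozenset({"p"}), frozenset({"r"}), none)
c = Action("c", frozenset({"q", "r"}), frozenset({"g"}), none)
for kind, s, acts in successors(frozenset({"p"}), frozenset({a}), [a, b, c]):
    print(kind, sorted(s), sorted(x.name for x in acts))
```

Each fatten branch adds a single action, so the branching factor stays linear in the number of actions instead of exponential in the number of non-interfering subsets.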

32 Digression: Generating concurrent plans is similar to generating parallel plans…  To generate concurrent plans using progression, we start with the idea of generating parallel plans with progression  For parallel plans, the “state of the partial plan” is represented by [S, {A1…Ak}]  For temporal concurrent plans, we need to generalize this to consider the fact that (1) each action may have a different duration, and (2) actions may have effects that are realized at different time points in the future. This means that some actions that we have committed to applying at previous states may wind up posting their effects now.  The solution is to start thinking in terms of a “current time stamp”, plus information about the set of durative actions that we have committed to apply whose effects have not yet been realized.  We can either add additional non-interfering actions at the current time stamp  OR advance the time stamp (to the nearest future time where new effects of already committed actions can be realized).

33 State-Space Search: Search is through time-stamped states
Search states should have information about
-- what conditions hold at the current time slice (P, M below)
-- what actions we have already committed to put into the plan (Π, Q below)
S = (P, M, Π, Q, t)
P: Set of predicates p_i and the time of their last achievement t_i < t.
M: Set of functions representing resource values.
Π: Set of protected persistent conditions (could be binary or resource conditions).
Q: Event queue (contains resource as well as binary fluent events).
t: Time stamp of S.
In the initial state, P and M are non-empty; Q is non-empty if we have exogenous events.
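
One possible in-memory layout for such a time-stamped state is sketched below. The class name `SearchState`, the field types, and the encoding of events as `(time, fluent, value)` tuples are assumptions for illustration, not the data structures of any particular planner.

```python
from dataclasses import dataclass, field


@dataclass
class SearchState:
    P: dict = field(default_factory=dict)   # predicate -> time of last achievement
    M: dict = field(default_factory=dict)   # resource function -> numeric value
    Pi: set = field(default_factory=set)    # protected persistent conditions
    Q: list = field(default_factory=list)   # event queue: (time, fluent, value)
    t: float = 0.0                          # time stamp of the state


# State after committing to a 15-unit match-burning action at time 0:
# light is on now, and an event in Q will turn it off at time 15.
s = SearchState(
    P={"have_light": 0, "at_steps": 0},
    Q=[(15, "have_light", False)],
    t=0,
)
print(s)
```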

34 Let current state S be P:{have_light@0; at_steps@0}; Q:{~have_light@15}; t: 0 (presumably after doing the light-match action)  Applying cross_cellar to this state gives S’ = P:{have_light@0; crossing@0}; Π:{have_light, }; Q:{at_fuse_box@10; ~have_light@15}; t: 0  [Figure: timeline with Light-match (duration 15) and Cross-cellar (duration 10) against the time stamp.]

35 “Advancing” the clock as a device for concurrency control  To support concurrency, we need to consider advancing the clock  How far to advance the clock?  One shortcut is to advance the clock to the time of the next earliest event in the event queue, since this is the least advance needed to make changes to P and M of S.  At this point, all the events happening at that time point are transferred from Q to P and M (to signify that they have happened)  This strategy will find “a” plan for every problem, but will have the effect of enforcing concurrency by making the concurrent actions “align on the left end”  In the match/cellar example, we will find plans where the cross-cellar action starts right when the light-match action starts  If we need slack in the start times, we will have to post-process the plan  If we want plans with arbitrary slacks on start times to appear in the search space, we will have to consider advancing the clock by arbitrary amounts (even if it changes nothing in the state other than the clock time itself).  [Figure: Light-match (duration 15) and Cross-cellar (duration 10) timelines, with ~have-light at 15.]  In the cellar plan above, the clock, if advanced, will be advanced to 15, where an event (~have-light) will occur. This means cross-cellar can be done either at 0 or at 15 (and the latter makes no sense).
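
The "advance to the next earliest event" shortcut can be sketched as follows. The function name, the `(time, fluent, value)` event encoding, and the restriction to boolean fluents (no resource events, no release of protected conditions) are simplifying assumptions.

```python
def advance_clock(P, Q, t):
    """Advance to the earliest event time in Q and apply every event at that time.

    P maps each true fluent to the time it was last achieved;
    Q is a list of (time, fluent, value) events. Returns (P', Q', t').
    """
    if not Q:
        return P, Q, t  # nothing is pending, so the clock stays put
    t_next = min(time for time, _, _ in Q)
    new_P = dict(P)
    remaining = []
    for time, fluent, value in Q:
        if time == t_next:
            if value:
                new_P[fluent] = time      # fluent becomes true now
            else:
                new_P.pop(fluent, None)   # fluent becomes false now
        else:
            remaining.append((time, fluent, value))
    return new_P, remaining, t_next


# State after applying cross_cellar at time 0 in the match/cellar example.
P = {"have_light": 0, "crossing": 0}
Q = [(10, "at_fuse_box", True), (15, "have_light", False)]
P1, Q1, t1 = advance_clock(P, Q, 0)
print(P1, Q1, t1)
```

Because the clock only ever jumps to event times, new actions can start only at those times, which is the "align on the left end" behavior noted on the slide.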

36 Search Algorithm (cont.)  S = (P, M, Π, Q, t)  [TLPlan; Sapa; 2001]
Goal Satisfaction: S = (P, M, Π, Q, t) ⊨ G if ∀⟨p_i, t_i⟩ ∈ G either:
-- ∃⟨p_i, t_j⟩ ∈ P, t_j < t_i and no event in Q deletes p_i.
-- ∃ e ∈ Q that adds p_i at time t_e < t_i.
Action Application: Action A is applicable in S if:
-- All instantaneous preconditions of A are satisfied by P and M.
-- A’s effects do not interfere with Π and Q.
-- No event in Q interferes with persistent preconditions of A.
-- A does not lead to concurrent resource change.
When A is applied to S:
-- P is updated according to A’s instantaneous effects.
-- Persistent preconditions of A are put in Π.
-- Delayed effects of A are put in Q.
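
The goal-satisfaction test can be sketched for boolean fluents as below. The function name and the encodings (P as a dict from fluent to achievement time, Q as `(time, fluent, value)` events, goals as `(p_i, t_i)` pairs with `float("inf")` for non-deadline goals) are assumptions for this illustration; resource goals are left out.

```python
def satisfies_goals(P, Q, goals):
    """Check every (p_i, t_i) goal against the current facts P and pending events Q."""
    for p, deadline in goals:
        # Case 1: p_i already holds, was achieved before the deadline,
        # and no pending event in Q deletes it.
        holds = (
            p in P and P[p] < deadline
            and not any(f == p and not v for _, f, v in Q)
        )
        # Case 2: some pending event adds p_i strictly before the deadline.
        will_hold = any(f == p and v and te < deadline for te, f, v in Q)
        if not (holds or will_hold):
            return False
    return True


P = {"have_light": 0, "crossing": 0}
Q = [(10, "at_fuse_box", True), (15, "have_light", False)]
print(satisfies_goals(P, Q, [("at_fuse_box", 12)]))
print(satisfies_goals(P, Q, [("have_light", 20)]))
```

The check looks into Q, so a state can count as a goal state before its pending events have actually happened.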

37

