Forward-Chaining Partial-Order Planning Amanda Coles, Andrew Coles, Maria Fox and Derek Long (to appear, ICAPS 2010)

Summary
- Forward-chaining planning eliminates the threat resolution of POP, at the price of over-commitment.
- Issues arise in temporal planning, because needless ordering constraints lead to backtracking.
- Can modify a forward-chaining approach to construct a partial order, avoiding this.
- Further, can modify a TRPG heuristic to encourage search to find lower-makespan plans.
- Implemented and evaluated in the planner POPF.

Overview
- (Temporal) Forward-Chaining Planning
- Issues with using a Total Order
- Reducing Commitment
- Heuristic Guidance for Lower Makespan Plans
- Evaluation
- Conclusions

Forward Chaining Temporal Planning
A state S is a tuple of:
- Propositional facts
- Values of task variables
- A queue of actions that have not yet finished
- The plan P to reach S
- The constraints on the steps in P
The plan consists of the starts and ends of actions: A_start and A_end denote the start and end of A, respectively.

Simple Example
Actions: light_match match1 (effects: light m1, then ¬light m1) and mend_fuse fuse1 match1.
Plan steps:
0: light_match_start match1
1: mend_fuse_start fuse1 match1
2: mend_fuse_end fuse1 match1
3: light_match_end match1
[Figure: temporal network over lms, mfs1, mfe1 and lme, with edge labels 8.0 / -8.0, 5.0 / -5.0 and -0.01 (epsilon separation, 0.01).]

Issues with Using a Total Order
To resolve threats, forward-chaining planning uses a total order. When applying an action A:
- A cannot violate the preconditions of earlier actions, as it comes after them (demotion);
- Subsequent actions cannot delete its preconditions, as A comes sooner (promotion).
The drawback is that needless ordering constraints are added:
- Even if A does not interfere with the preceding step, it still must come after it.
This motivates partial-order lifting, but that first needs a solution to be found.

Total Orders of Start/End Actions
Two actions, A and B:
- B is longer than A;
- No interaction between A and B;
- But B_end must precede A_end.
The planner chooses a (partial) plan: A_start, B_start, B_end

[Figure: temporal network over A_start, B_start, B_end and A_end in a total order; edge labels 2 / -2, 5 / -5 and -0.01.]
Because A was added to the plan before B, they are ordered as shown (in a total order). But A_end will not be applicable until after B_end. The planner will have to backtrack over all the intermediate decisions and add B to the plan earlier than A.

Reducing Commitment
- Record additional information at each state concerning which steps achieve / delete / depend on each fact.
- Use this information to commit to fewer ordering constraints.
- Still resolve threats based on the intuition of forward-chaining expansion: new actions cannot threaten the preconditions of earlier actions.

Extending the State: Propositional
To capture ordering information we add:
- F+, F-, where F+(p) (F-(p)) is the index of the step that most recently added (deleted) p;
- FP, where FP(p) is a set of pairs ⟨i,d⟩:
  - ⟨i,ε⟩ denotes that step i has an instantaneous condition on p (at start or at end);
  - ⟨i,0⟩ denotes that step i marks the end of an action with an over all condition on p.

Starting an Action A at Step i
- For each at start condition p: t(F+(p)) + ε ≤ t(i)
- For each at start delete effect p: assign F-(p) = i, t(F+(p)) + ε ≤ t(i), and for all ⟨j,d⟩ in FP(p), t(j) + d ≤ t(i)
- For each at start add effect p: assign F+(p) = i, and if F-(p) ≠ i, t(F-(p)) + ε ≤ t(i)
- For each over all condition p: if F+(p) ≠ i, t(F+(p)) ≤ t(i)
(To apply the end of an action: a similar process, but without over all conditions.)
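The rules for starting an action can be sketched as a function that updates the annotations and returns the ordering constraints to add. This is a minimal illustration, not the POPF implementation: the names Annotations and start_action are hypothetical, facts with no recorded adder (e.g. true in the initial state) are assumed to impose no constraint, a step is assumed not to be ordered against its own FP entry, and recording ⟨i,ε⟩ in FP for each at start condition is inferred from the definition of FP rather than stated on the slide.

```python
from dataclasses import dataclass, field

EPS = 0.01  # epsilon separation


@dataclass
class Annotations:
    f_plus: dict = field(default_factory=dict)   # p -> step that last added p
    f_minus: dict = field(default_factory=dict)  # p -> step that last deleted p
    fp: dict = field(default_factory=dict)       # p -> set of (step, d) pairs


def start_action(ann, i, conds=(), dels=(), adds=(), over_alls=()):
    """Apply the start of an action as step i.

    Returns constraints (j, gap, i), each meaning t(j) + gap <= t(i).
    """
    cons = []
    for p in conds:                      # at start conditions
        if p in ann.f_plus:
            cons.append((ann.f_plus[p], EPS, i))
        # Assumption: remember that step i needs p, so later deleters wait.
        ann.fp.setdefault(p, set()).add((i, EPS))
    for p in dels:                       # at start delete effects
        if p in ann.f_plus:
            cons.append((ann.f_plus[p], EPS, i))
        for j, d in sorted(ann.fp.get(p, ())):
            if j != i:                   # assumption: skip our own entry
                cons.append((j, d, i))
        ann.f_minus[p] = i
    for p in adds:                       # at start add effects
        if p in ann.f_minus and ann.f_minus[p] != i:
            cons.append((ann.f_minus[p], EPS, i))
        ann.f_plus[p] = i
    for p in over_alls:                  # over all conditions
        if p in ann.f_plus and ann.f_plus[p] != i:
            cons.append((ann.f_plus[p], 0.0, i))
    return cons


ann = Annotations()
c0 = start_action(ann, 0, adds=["light"])                        # adds light
c1 = start_action(ann, 1, conds=["light"], over_alls=["light"])  # needs light
c2 = start_action(ann, 2, dels=["light"])                        # deletes light
```

In this usage, step 1 is ordered only after step 0 (the adder of light), and step 2 is ordered after both the adder and the step that needed light; a step touching unrelated facts would receive no constraints at all.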

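[Figure: temporal network over A_start, A_end, B_start and B_end with the reduced ordering constraints; edge labels 2 / -2, 5 / -5 and -0.01.]
Resulting plan:
0.00: (action B) [5.00]
3.01: (action A) [2.00]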
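The difference between the two orderings of A and B can be checked with a small simple-temporal-network solver. The sketch below is an illustration under stated assumptions, not the planner's code: earliest_times and duration are hypothetical helpers, times are integers in hundredths of a time unit to avoid rounding, and the constraint that B_end precedes A_end by epsilon is read off the example (A lasts 2, B lasts 5).

```python
def earliest_times(steps, constraints):
    """Earliest time of each step, or None if the constraints are inconsistent.

    Each constraint (u, v, gap) means t(v) >= t(u) + gap.
    Longest-path relaxation in the style of Bellman-Ford.
    """
    t = {s: 0 for s in steps}
    for _ in range(len(steps)):
        changed = False
        for u, v, gap in constraints:
            if t[u] + gap > t[v]:
                t[v] = t[u] + gap
                changed = True
        if not changed:
            return t
    return None  # still changing after len(steps) passes: positive cycle


EPS = 1  # 0.01, in hundredths of a time unit


def duration(start, end, d):
    """Fix t(end) - t(start) to exactly d."""
    return [(start, end, d), (end, start, -d)]


steps = ["A_start", "A_end", "B_start", "B_end"]
common = (duration("A_start", "A_end", 200)
          + duration("B_start", "B_end", 500)
          + [("B_end", "A_end", EPS)])

# Total order: A_start was added first, so it must precede B_start.
total_order = common + [("A_start", "B_start", EPS)]
# Partial order: no constraint between the two starts.
partial_order = common

total_result = earliest_times(steps, total_order)
partial_result = earliest_times(steps, partial_order)
```

Here total_result is None, reflecting the dead end that forces backtracking, while partial_result places B_start at 0 and A_start at 301 (that is, 3.01), matching the plan shown above.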

Extending the State: Numeric
For numbers we are a little more strict:
- V_eff, where V_eff(v) is the step of the action to most recently have an effect on v;
- VP, where VP(v) contains the steps that depend on the value of v, i.e. each step i such that:
  - i has a precondition on v, or is the start of an action whose duration constraint contains v; or,
  - i has an effect that depends on v;
- VI, where VI(v) is a set of pairs (s,e), marking the start/end indices of actions in the event queue (Q) with an over all condition depending on v.
(Also, V_cts to handle linear continuous numeric change – see paper for details.)

Starting an Action A at Step i:
- For each variable v relevant to at start conditions, effects, or the action's duration: t(V_eff(v)) + ε ≤ t(i)
- For each v on which A has an at start effect, apply the effect to V, and: for all (s,e) in VI(v), t(s) + ε ≤ t(i) and t(i) + ε ≤ t(e)
- For each variable v relevant to an over all condition, add (i,j) to VI(v), and if v was not relevant to the start of A: t(V_eff(v)) + ε ≤ t(i)
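The numeric rules can be sketched in the same style as the propositional ones. This is a hedged illustration only: start_action_numeric is a hypothetical name, j is assumed to be an identifier reserved for the action's future end step, variables with no recorded effect are assumed to impose no constraint, and setting V_eff(v) = i after an effect is inferred from the definition of V_eff. VP and V_cts are left out.

```python
EPS = 0.01  # epsilon separation


def start_action_numeric(v_eff, vi, i, j, cond_vars=(), effect_vars=(),
                         duration_vars=(), over_all_vars=()):
    """Apply the start of an action as step i; j names its future end step.

    Returns constraints (a, gap, b), each meaning t(a) + gap <= t(b).
    v_eff maps a variable to the last step with an effect on it;
    vi maps a variable to a set of (start, end) pairs of open over all intervals.
    """
    cons = []
    relevant = set(cond_vars) | set(effect_vars) | set(duration_vars)
    # Come after the last effect on every variable the start refers to.
    for v in sorted(relevant):
        if v in v_eff:
            cons.append((v_eff[v], EPS, i))
    # An effect may not fall inside an open over all interval on v.
    for v in effect_vars:
        for s, e in sorted(vi.get(v, ()), key=str):
            cons.append((s, EPS, i))
            cons.append((i, EPS, e))
        v_eff[v] = i  # assumption: follows from the definition of V_eff
    # Open an over all interval; order after the last effect if not done yet.
    for v in over_all_vars:
        if v not in relevant and v in v_eff:
            cons.append((v_eff[v], EPS, i))
        vi.setdefault(v, set()).add((i, j))
    return cons


v_eff, vi = {}, {}
n0 = start_action_numeric(v_eff, vi, 0, "end0", over_all_vars=["fuel"])
n1 = start_action_numeric(v_eff, vi, 1, "end1", effect_vars=["fuel"])
n2 = start_action_numeric(v_eff, vi, 2, "end2", cond_vars=["fuel"])
```

In this usage, step 1 changes fuel while step 0's over all condition on fuel is still open, so it is constrained relative to both ends of that interval; step 2 only has to follow the most recent effect.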

Heuristic Guidance
- Have seen how the search space can be modified to reduce excessive ordering constraints;
- There is still no pressure to prefer choices that lead to a partial order with a lower makespan:
  - Could use partial-order lifting a posteriori for similar quality results?
- Given we know the makespan implications of action choices, how can we factor this into the decision making during search?

Revisiting the Temporal RPG
The Temporal RPG consists of time-stamped fact and action layers. To evaluate a state S:
- Fact layer f=0.0 contains the facts in S;
- Action layer a=0.00 contains actions whose preconditions are satisfied in f=0.0;
- Effects of actions appear in the next layer; the end of an action A is delayed until dur(A) after A_start first appears.
What about the extra information we now have in S?
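The layer construction can be illustrated with a much-simplified relaxed reachability computation. This is a sketch under loud assumptions, not the TRPG used in COLIN or POPF: it ignores delete effects, over all conditions and numeric variables, the names DurativeAction and earliest_fact_layers are hypothetical, "the next layer" is taken to be epsilon later, and the two example actions are an invented encoding loosely based on the match example. Times are integers in hundredths of a time unit.

```python
from dataclasses import dataclass

EPS = 1             # 0.01, in hundredths of a time unit
INF = float("inf")  # fact not (yet) reachable


@dataclass(frozen=True)
class DurativeAction:
    name: str
    duration: int
    start_pre: frozenset = frozenset()
    start_add: frozenset = frozenset()
    end_pre: frozenset = frozenset()
    end_add: frozenset = frozenset()


def earliest_fact_layers(state_facts, actions):
    """Earliest layer at which each reachable fact appears (relaxed)."""
    fact_layer = {p: 0 for p in state_facts}

    def when(p):
        return fact_layer.get(p, INF)

    changed = True
    while changed:
        changed = False
        for a in actions:
            start = max((when(p) for p in a.start_pre), default=0)
            if start == INF:
                continue  # start preconditions not reachable yet
            # The end waits dur(A) after the start, and for its own preconditions.
            end = max([start + a.duration] + [when(p) for p in a.end_pre])
            new_facts = [(p, start + EPS) for p in a.start_add]
            if end != INF:
                new_facts += [(p, end + EPS) for p in a.end_add]
            for p, t in new_facts:
                if t < when(p):
                    fact_layer[p] = t
                    changed = True
    return fact_layer


light_match = DurativeAction("light_match", 800,
                             start_pre=frozenset({"unused"}),
                             start_add=frozenset({"light"}))
mend_fuse = DurativeAction("mend_fuse", 500,
                           start_pre=frozenset({"light"}),
                           end_add=frozenset({"mended"}))
layers = earliest_fact_layers({"unused"}, [mend_fuse, light_match])
```

With this encoding, light first appears one epsilon after time zero, and mended appears one epsilon after mend_fuse has run for its full duration from that point.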

Bounding Preconditions and Effects on Facts
When adding actions to the partial order, for a proposition p:
- Any action requiring p to satisfy a precondition will need to come after t(F+(p)) and t(F-(p));
- Any action with an add (delete) effect on p will need to come after t(F-(p)) (t(F+(p)), resp.).
From checking temporal constraints, we have a lower bound on each step, t_min(i). Thus, the earliest point we can use p is:
l(p) = max { t_min(F+(p)), t_min(F-(p)) + ε }

Bounding (continued)
Similarly, each numeric precondition/effect referring to a variable set vars cannot be used until:
L(vars) = max over v in vars of t_min(V_eff(v))
With these bounds, for any state S, we can build a TRPG starting at time zero:
- Delay fact p until layer l(p);
- Delay numeric preconditions/effects until L(vars) for their respective variable sets.
Then, actions which do not interfere with existing choices will appear sooner in the TRPG.
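The two bounds are straightforward to compute once t_min is known for each step. A minimal sketch follows, with hypothetical function names and two assumptions: facts or variables with no recorded step (for example, untouched since the initial state) get a bound of zero, and times are integers in hundredths of a time unit.

```python
EPS = 1  # 0.01, in hundredths of a time unit


def fact_bound(p, f_plus, f_minus, t_min):
    """l(p) = max { t_min(F+(p)), t_min(F-(p)) + eps }, or 0 if p is untouched."""
    candidates = [0]
    if p in f_plus:
        candidates.append(t_min[f_plus[p]])
    if p in f_minus:
        candidates.append(t_min[f_minus[p]] + EPS)
    return max(candidates)


def numeric_bound(variables, v_eff, t_min):
    """L(vars) = max over v in vars of t_min(V_eff(v)), or 0 if none recorded."""
    return max((t_min[v_eff[v]] for v in variables if v in v_eff), default=0)


# Example: step 0 has lower bound 0.00, step 1 has lower bound 3.01.
t_min = {0: 0, 1: 301}
f_plus = {"p": 1}
f_minus = {"p": 0, "q": 1}
v_eff = {"fuel": 1, "load": 0}

l_p = fact_bound("p", f_plus, f_minus, t_min)
l_q = fact_bound("q", f_plus, f_minus, t_min)
l_r = fact_bound("r", f_plus, f_minus, t_min)
L_fuel_load = numeric_bound(["fuel", "load"], v_eff, t_min)
```

A fact such as r, which no step in the plan has touched, keeps a bound of zero, so actions depending only on such facts are not delayed in the TRPG.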

Evaluation
Planner POPF, based on the code for COLIN (IJCAI09).
First test:
- Control: run COLIN, then apply partial-order lifting to the solution;
- POPF, but using the original heuristic from COLIN.
Second test, also considering domains with deadlines:
- COLIN, then partial-order lifter;
- POPF, new heuristic.

Test 1: Time Taken

Test 1: Makespan

Test 2: Time Taken

Test 2: Makespan

Test 2: Time Taken (Deadlines)

Conclusions
- Have shown how a partial order can be expanded in a forwards direction;
- Adapting the heuristic allows one to trade time performance for a reduction in makespan;
- In domains with deadlines, performance is substantially improved (fivefold improvement in coverage in the Satellite variants).
- In the paper: the approach also works with domains containing linear-continuous change.
