Presentation is loading. Please wait.

Presentation is loading. Please wait.

RePOP: Reviving Partial Order Planning

Similar presentations


Presentation on theme: "RePOP: Reviving Partial Order Planning"— Presentation transcript:

1 RePOP: Reviving Partial Order Planning
XuanLong Nguyen & Subbarao Kambhampati Yochan Group Arizona State University

2 return with Re(vived)POP
The good times return with Re(vived)POP Then it was cruelly UnPOPped In a way this paper describes a morality play.. A triumph of good over not-so-good. In the beginning, all planners were partial order planners. They came in great many varieties. They weren’t exactly nimble on their feet, but they were nice. You could write papers by making one solve a slightly larger blocks world problem. In short, there was peace. Then, cruel fate, bearing an uncanny resemblence to Drew McDermott intervened. The peaceful era of POP was cruelly ripped open by heuristic planners. These planners were from the other side of the tracks. They used--aaugh… it pains me to say this.. STATE SPACE SEARCH--something that everyone thought was put to rest by the sacerdotal piety of Earl. There was an implication that only State Space planners can be controlled with heuristics. The days of POP, they said, were over. NOT SO FAST… say we. We cleverly copt the same heuristics to repair, revive and REPOP the partial order planners. In the beginning it was all POP.

3 A recent (turbulent) history of planning
UCPOP, Zeno [Penberthy &Weld] IxTeT [Ghallab et al] The whole world believed in POP and was happy to stack 6 blocks! UCPOP 1995 Advent of CSP style compilation approach: Graphplan [Blum & Furst] SATPLAN [Kautz & Selman] Use of reachability analysis and Disjunctive constraints Domination of heuristic state search approach: HSP/R [Bonet & Geffner] UNPOP [McDermott]: POP is dead! Importance of good Domain-independent heuristics 1997 UNPOP 2000 - Hoffman’s FF – a state search planner won the AIPS-00 competition! … but NASA’s highly publicized RAX still a POP dinosaur! POP believed to be good framework to handle temporal and resource planning [Smith et al, 2000] RePOP

4 RePOP: A revival for partial order planning
Outline RePOP: A revival for partial order planning To show that POP can be made very efficient by exploiting the same ideas that scaled up state search and Graphplan planners Effective heuristic search control Use of reachability analysis Handling of disjunctive constraints RePOP, implemented on top of UCPOP Dramatically better than all known partial-order planners Outperforms Graphplan and competitive with state search planners in many (parallel) domains

5 Partial plan representation
POP background Partial plan representation P = (A,O,L,OC,UL) A: set of action steps in the plan S0 ,S1 ,S2 …,Sinf O: set of action ordering Si < Sj ,… L: set of causal links OC: set of open conditions (subgoals remain to be satisfied) UL: set of unsafe links where p is deleted by some action Sk I={q1 ,q2 } G={g1 ,g2 } p q1 S1 S3 g1 p Si Sj g2 S0 Sinf g2 p Si Sj oc1 oc2 S2 ~p Flaw: Open condition OR unsafe link Solution plan: A partial plan with no remaining flaw Every open condition must be satisfied by some action No unsafe links should exist (i.e. the plan is consistent)

6 Algorithm POP background g1 g2 S0 Sinf 1. Let P be an initial plan p
2. Flaw Selection: Choose a flaw f (either open condition or unsafe link) 3. Flaw resolution: If f is an open condition, choose an action S that achieves f If f is an unsafe link, choose promotion or demotion Update P Return NULL if no resolution exist 4. If there is no flaw left, return P else go to 2. 2. Plan refinement (flaw selection and resolution): p q1 S1 S3 g1 S0 Sinf g2 g2 oc1 oc2 S2 ~p Choice points Flaw selection (open condition? unsafe link?) Flaw resolution (how to select (rank) partial plan?) Action selection (backtrack point) Unsafe link selection (backtrack point)

7 Our approach (main ideas)
State-space idea of distance heuristic 1. Ranking partial plans: use an effective distance-based heuristic estimator . 2. Exploit reachability analysis: use invariants to discover implicit conflicts in the plan. 3. Unsafe links are resolved by posting disjunctive ordering constraints into the partial plan: avoid unnecessary and exponential multiplication of failures due to promotion/demotion splitting CSP ideas of consistency enforcement

8 1. Ranking partial plans using distance-based heuristic
O’ Ranking Function: f(P) = g(P) + w h(P) g(P): number of actions in P h(P): estimate of number of new actions needed to refine P to become a solution plan w: increase the greediness of the heuristic search 2. Estimating h(P) h(P)  |O’|  Estimating |O’| Difficulty: How to account for positive and negative - Interactions among actions in O’ - Interactions among actions in P - Interactions between O’ and P S3 q1 S1 p g1 S0 S5 g2 Sinf g2 q r S4 S2 ~p The basic idea is to relax certain constraints, and keep only the most important constraints in this estimation. Since h(P) is directly connected to the set of open conditions, and less so to the set of unsafe link, we would relax the set of unsafe link. h(P)  |O’| = 2

9 Estimating h(P) P O’ S3 q1 p S1 g1 S0 S5 g2 Sinf g2 q S4 S2 r ~p
Assumption: Negative effects of actions are relaxed (which are to be dealt with later in unsafe link set) P has no unsafe link flaws no negative interactions among actions in P no negative interactions between O’ and P |O’| ~ cost(S) needed to achieve the set of open conditions S from the initial state Any state-space distance heuristic can be adapted Informedness of heuristic estimate can be improved by using weaker relaxation assumption O’ S3 q1 S1 p g1 S0 S5 g2 Sinf g2 q r S4 S2 ~p The basic idea is to relax certain constraints, and keep only the most important constraints in this estimation. Since h(P) is directly connected to the set of open conditions, and less so to the set of unsafe link, we would relax the set of unsafe link. Open condition set S={p,q,r,..}

10 Distance-based heuristic estimate (adapted from state-space heuristics extracted from planning graphs [Nguyen & Kambhampati 2000], [Hoffman 2000],…) p a 1 2 3 S S+Prec(a)-Eff(a) Estimate h(P) = cost(S) 1. Build a planning graph PG from the initial state. 2. Cost(S) := 0 if all subgoals in S are in level 0. 3. Let p be a subgoal in S that appears last in PG. 4. Pick an action a in the graph that first achieves p 5. Update cost(S) := cost(a) + cost(S+Prec(a) – Eff(a)) where cost(a) = 0 if a P, and 1 otherwise 6. Replace S = S+Prec(a) – Eff(a), goto 2 You would need to decide what you plan to say here….***

11 2. Handling unsafe link flaws
p Si Sj p 1. For each unsafe link threatened by another step Sk: Add disjunctive constraint to O Sk < Si V Si < Sj 2. Whenever a new ordering constraint is introduced to O, perform the constraint propagations: S1 < S2 V S3 < S4 ^ S4< S3  S1 < S2 S1 < S2 ^ S2 < S3  S1 < S3 S1 < S2 ^ S2 < S1  False Si Sj ~p q Prec(a) Sk Avoid the unnecessary exponential multiplication of failing partial plans

12 3. Detecting indirect conflicts using reachability analysis
Reachability analysis to detect inconsistency on(a,b) and clear(b) How to get state information in a partial plan? 3. Cutset: Set of literals that must be true at some point during execution of plan For each action a, pre-C(Sk) = Prec(Sk) U {p | is a link and Si < Sk < Sj } post-C(Sk) = Eff(Sk) U {p | 4. If there exists a cutset that violates of an invariant the partial plan is invalid and should be pruned p Si Sj q Sm Sn Prec(Sk) Sk Eff(Sk) p Si Sj Prec(Sk) + p + q Eff(Sk) + p + q p Si Sj Disadvantage: Inconsistency checking is passive and maybe expensive

13 Detecting indirect conflicts using reachability analysis
Generalizing unsafe link: Sk threatens iff p is mutually exclusive (mutex) with either Prec(Sk) or Eff(Sk) Unsafe link is resolved by posting disjunctive constraints (as before) Sk < Si V Si < Sj p p Si Sj Si Sj q Sm Sn Prec(Sk) Sk Eff(Sk) Detects indirect conflicts early Derives more disjunctive constraints to be propagated

14 Experiments on RePOP RePOP is implemented on top of UCPOP planner using the three ideas presented Written in Lisp, runs on Linux, 500MHz, 250MB RePOP deals with set of totally instantiated actions thus avoids binding constraints Compared RePOP against UCPOP, Graphplan and AltAlt in a number of benchmark domains Performance metrics Time Solution quality

15 Comparing planning time (time in seconds)
Repop vs. UCPOP Graphplan AltAlt Comparing planning time (time in seconds) Problem UCPOP RePOP Graphplan AltAlt Gripper-8 - 1.01 66.82 .43 Gripper-10 2.72 47min 1.15 Gripper-20 81.86 15.42 Rocket-a 8.36 75.12 1.02 Rocket-b 8.17 77.48 1.29 Logistics-a 3.16 306.12 1.59 Logistics-b 2.31 262.64 1.18 Logistics-c 22.54 4.52 Logistics-d 91.53 20.62 Bw-large-a 45.78 (5.23) - 14.67 4.12 Bw-large-b (18.86) - 122.56 14.14 Bw-large-c (137.84) - 116.34

16 Comparing planning time (summary)
Repop vs. UCPOP Graphplan AltAlt RePOP is very good in parallel domains (gripper, logistics, rocket, parallel blocks world) Completely dominates UCPOP Outperforms Graphplan in many domains Competitive with AltAlt RePOP still inefficient in serial domains: Travel, Grid, 8-puzzle **Put summary after the table**

17 Some solution quality metrics
Repop vs. UCPOP Graphplan AltAlt 1. Number of actions 2. Makespan: minimum completion time (number of time steps) 3. Flexibility: Average number of actions that do not have ordering constraints with other actions 1 3 Num_act=4 Makespan=2 Flex = 1 2 4 1 3 Num_act=4 Makespan=2 Flex = 2 2 4 1 2 3 4 Num_act=4 Makespan=4 Flex = 0

18 Comparing solution quality
Number of actions/ time steps Flexibility degree Problem RePOP Graphplan AltAlt Gripper-8 21/ 15 23/ 15 21/ 21 .57 .69 Gripper-10 27/ 19 29/ 19 27/ 27 .59 .61 Gripper-20 59/ 39 - 59/ 59 .68 Rocket-a 35/ 16 40/ 7 36/ 36 2.46 7.15 Rocket-b 34/15 30/ 7 34/ 34 7.29 4.80 Logistics-a 52/ 13 80/ 11 64/ 64 20.54 6.58 Logistics-b 42/ 13 79/ 13 53/ 53 20.0 5.34 Logistics-c 50/ 15 70/ 70 16.92 Logistics-d 69/ 33 85/ 85 22.84 Bw-large-a (8/5) - 11/ 4 9/ 9 2.75 2.0 Bw-large-b (11/8) - 18/ 5 11/ 11 3.28 2.67 Bw-large-c (17/ 10) - 19/ 19 5.06

19 Comparing solution quality (summary)
RePOP generates partially ordered plans Number of actions: RePOP typically returns shortest plans Number of time steps (makespan): Graphplan produces optimal number of time steps (strictly when all actions have the same durations) RePOP comes close Flexibility: RePOP typically returns the most flexible plans **Put summary after the table** ***SHOULDN”T RePOP be giving better makespans sometimes--espeically if the actions can have non-uniform durations??

20 Ablation studies CE: Consistency enforcement techniques (reachability analysis and disjunctive constraint handling HP: Distance-based heuristic Problem UCPOP + CE + HP +CE+HP (RePOP) Gripper-8 * 6557/ 3881 1299/ 698 Gripper-10 11407/ 6642 2215/ 1175 Gripper-12 17628/ 10147 3380/ 1776 Gripper-20 11097/ 5675 Rocket-a 30110/ 17768 7638/ 4261 Rocket-b 85316/ 51540 28282/ 16324 Logistics-a 411/ 191 847/ 436 Logistics-b 920/ 436 542/ 271 Logistics-c 4939/ 2468 7424/ 4796 Logistics-d 16572/ 10512 ***NEED SOME LESSONS OF ABLATION STUDIES--WRITE THEM NEXT TO THE TABLE …

21 Conclusion Developed effective techniques for improving partial-order planners: Ranking partial plan heuristics, Disjunctive representation for unsafe links, Use of reachability analysis Presented and evaluated RePOP Brings POP to the realm of effective planning algorithms Can now exploit the flexibility of POP without too much efficiency penalty Moral? State-space vs. CSP vs. POP

22 Future Work Improve the efficiency of RePOP in serial domains
Serial domains may be an inherent weakness of POP Thankfully, Real-world domains tend to admit partially ordered plans (or there wouldn’t be any scheduling separate from planning!) Devise effective admissible heuristics for POP Extend RePOP to deal with partially instantiated actions time and resource constraints ReBuridan? ReZeno? ReIxTeT? ****THE slide’s organization can be changed… Future work should talk about near term first and then the longer term.. It seems two and three are more near term than the extensions…no? Also, I am not sure what you are trying to say about three--devise effective admissible heuristics for POP… Is the key word here admissibility? If so, do we care? It seems something about handling negative interactions better may be an interesting thing to talk about.


Download ppt "RePOP: Reviving Partial Order Planning"

Similar presentations


Ads by Google