Slide 2
A Unified Brand-name-Free Introduction to Planning
Subbarao Kambhampati -- 11th Feb

Slide 3
[Figure: one-level planning graph for the two-block example -- proposition level 0 (onT-A, onT-B, cl-A, cl-B, he); action level 1 (Pick-A, Pick-B, plus noops); proposition level 1 (onT-A, onT-B, cl-A, cl-B, he, h-A, h-B, ~cl-A, ~cl-B, ~he)]

Slide 4
[Figure: the planning graph grown one more level -- action level 2 adds St-A-B, St-B-A, Ptdn-A, Ptdn-B (along with Pick-A, Pick-B and noops); proposition level 2 adds on-A-B and on-B-A to the level-1 literals]

Slide 5
Using the planning graph to estimate the cost of single literals:
1. We can say that the cost of a single literal is the index of the first proposition level in which it appears.
   -- If the literal does not appear in any level of the currently expanded planning graph, then its cost is:
      -- l+1, if the graph has been expanded to l levels but has not yet leveled off
      -- infinity, if the graph has been expanded until it leveled off (the literal cannot be achieved from the current initial state)
Examples: h({~he}) = 1; h({On(A,B)}) = 2; h({he}) = 0

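The rule above is easy to state operationally. Below is a minimal Python sketch (not from the slides) that assumes the expanded graph has been summarized as a dict `first_level` mapping each literal to the index of the first proposition level where it appears, together with the number of expanded levels and a leveled-off flag; these names are assumptions chosen for illustration.

```python
import math

def literal_cost(literal, first_level, num_levels, leveled_off):
    """Cost of a single literal = index of its first proposition level."""
    if literal in first_level:
        return first_level[literal]
    if leveled_off:
        return math.inf          # can never be achieved from the initial state
    return num_levels + 1        # graph simply hasn't been grown far enough yet

# Mirroring the slide's blocksworld example (graph grown to 2 levels):
first_level = {"he": 0, "~he": 1, "On(A,B)": 2}
print(literal_cost("he", first_level, 2, False))        # 0
print(literal_cost("~he", first_level, 2, False))       # 1
print(literal_cost("On(A,B)", first_level, 2, False))   # 2
```
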
Slide 6
Estimating the cost of a set of literals (e.g. a state in regression search)

Idea 1. [Max Heuristic] h_max({p,q,r,...}) = max{h(p), h(q), ...}
  Admissible, but very weak in practice.

Idea 2. [Sum Heuristic] Make the subgoal independence assumption:
  h_ind({p,q,r,...}) = h(p) + h(q) + h(r) + ...
  Much better than the set-difference heuristic in practice.
  -- Ignores +ve interactions: h({~he, h-A}) = h(~he) + h(h-A) = 1 + 1 = 2.
     But we can achieve both literals with a single action, Pickup(A), so the real cost is 1.
  -- Ignores -ve interactions: h({~cl(B), he}) = 1 + 0 = 1.
     But there is really no plan that can achieve these two literals together in this problem, so the real cost is infinity!

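Given the per-literal levels from the previous sketch, both ideas are one-liners. The snippet below (again using the assumed `first_level` summary) reproduces the slide's examples and shows where the sum heuristic over- and under-estimates.

```python
def h_max(literals, first_level):
    """Max heuristic: admissible but weak."""
    return max(first_level.get(p, float("inf")) for p in literals)

def h_sum(literals, first_level):
    """Sum heuristic: assumes subgoal independence (inadmissible)."""
    return sum(first_level.get(p, float("inf")) for p in literals)

first_level = {"he": 0, "~he": 1, "h-A": 1, "~cl-B": 1}
print(h_max(["~he", "h-A"], first_level))   # 1
print(h_sum(["~he", "h-A"], first_level))   # 2, but a single Pickup(A) achieves both
print(h_sum(["~cl-B", "he"], first_level))  # 1, but the real cost is infinity
```
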
Slide 7
Positive Interactions
We can do a better job of accounting for +ve interactions in two ways:
1. Define the cost of a set of literals in terms of the level:
   h_lev({p,q,r}) = the index of the first level of the PG where p, q, r appear together.
   So h({~he, h-A}) = 1.
2. Compute the length of a "relaxed plan" supporting all the literals in the set S, and use it as the heuristic (**): h_relax(S) >= h_lev(S).
Interestingly, h_lev is an admissible heuristic, even though h_ind is not! (Prove.) How about h_relax?

Slide 8
Finding a "relaxed plan"
Suppose you want to find a relaxed plan supporting literals g1...gm on a k-length PG. You do it this way:
  - Start at the kth level. Pick one action supporting each gi (the actions don't have to be distinct -- one action can support more than one goal). Let the actions chosen be {a1...aj}.
  - Take the union of the preconditions of a1...aj. Let these be the set p1...pv.
  - Repeat the two steps above for p1...pv; continue until you reach the initial proposition list.
The plan is called "relaxed" because you are assuming that sets of actions can be done together without negative interactions.
The optimal relaxed plan is the shortest relaxed plan.
  - Finding the optimal relaxed plan is NP-complete.
  - Greedy strategies can find close-to-shortest relaxed plans.
The length of the relaxed plan supporting S is often longer than the level of S, because the former counts actions separately while the latter only counts levels (with potentially more than one action present at each level).
  - Of course, if we say that no more than one action can be done per level, then the relaxed-plan length will not be any higher than the level.
    - But doing this basically involves putting mutex relations between actions.

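Here is a rough Python sketch of this backward extraction, under an assumed encoding in which `supporters[level][g]` yields a greedily chosen (action, preconditions) pair for achieving g at that level, with noops included and named with a "noop" prefix. It illustrates the procedure described above, not the code of any particular planner.

```python
def relaxed_plan_length(goals, supporters, k):
    """Extract a relaxed plan for `goals` from a k-level PG; return its length."""
    total_actions = 0
    subgoals = set(goals)
    for level in range(k, 0, -1):
        chosen, next_subgoals = set(), set()
        for g in subgoals:
            action, preconds = supporters[level][g]   # one supporter per subgoal
            chosen.add(action)                        # duplicates collapse in the set
            next_subgoals.update(preconds)
        # noops (assumed to be named "noop...") are not counted as plan steps
        total_actions += sum(1 for a in chosen if not a.startswith("noop"))
        subgoals = next_subgoals                      # recurse on the preconditions
    return total_actions
```
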
Slide 9
Use of the PG in Progression vs. Regression
Progression:
  - Needs to compute a PG for each child state.
    - As many PGs as there are leaf nodes! Much higher cost for heuristic computation.
    - Can try exploiting overlap between different PGs.
  - However, the states in progression are consistent...
    - So handling negative interactions is not that important.
    - Overall, the PG gives better guidance.
Regression:
  - Needs to compute the PG only once, for the given initial state.
    - Much lower cost of computing the heuristic.
  - However, states in regression are "partial states" and can thus be inconsistent.
    - So taking negative interactions into account using mutexes is important (costlier PG construction).
    - Overall, the PG's guidance is not as good unless higher-order mutexes are also taken into account.
Historically, the heuristic was first used with progression planners. Then it was used with regression planners. Then progression planners were found to do better. Then it was found that combining them is even better. Remember the altimeter metaphor...

Slide 10
Progression vs. Regression
A PG-based heuristic can give two things:
  1. Goal-directedness
  2. Consistency
Progression needs (1) more -- so it can get by without mutex propagation.
Regression needs (2) more -- so it may need even higher consistency information than is provided by the normal PG.

Slide 11
Negative Interactions
To better account for -ve interactions, we need to start looking into the feasibility of subsets of literals actually being true together in a proposition level.
Specifically, in each proposition level we want to mark not just which individual literals are feasible, but also which pairs, which triples, which quadruples, and which n-tuples are feasible. (It is quite possible that two literals are independently feasible in level k, but not feasible together in that level.)
  - The idea then is to say that the cost of a set of literals S is the index of the first level of the planning graph where no subset of S is marked infeasible.
  - The full-scale markup is very costly, and makes the cost of planning-graph construction equal to the cost of enumerating the full progression search tree.
  - Since we only want estimates, it is okay to talk of the feasibility of only up to k-tuples.
  - For the special case k=2 (2-sized subsets), there are some very efficient marking and propagation procedures.
This is the idea of marking and propagating mutual exclusion (mutex) relations.

Slide 12
Rule 1. Two actions a1 and a2 are mutex if
  (a) both of the actions are non-noop actions, or
  (b) a1 is any action supporting P, and a2 either needs ~P or gives ~P, or
  (c) some precondition of a1 is marked mutex with some precondition of a2.
Rule 2. Two propositions P1 and P2 are marked mutex if all actions supporting P1 are pairwise mutex with all actions supporting P2.

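A compact Python sketch of the two rules, over an assumed action record (a dict with `is_noop`, `preconds`, and `adds` fields, with negated literals written with a leading `~` as in the slides). A real Graphplan-style implementation would also cache the results level by level.

```python
def negate(p):
    return p[1:] if p.startswith("~") else "~" + p

def actions_mutex(a1, a2, prop_mutex_prev):
    # Rule 1(a): two non-noop actions (serial planning graph)
    if not a1["is_noop"] and not a2["is_noop"]:
        return True
    # Rule 1(b): one action supports P while the other needs ~P or gives ~P
    for x, y in ((a1, a2), (a2, a1)):
        for p in x["adds"]:
            if negate(p) in y["preconds"] or negate(p) in y["adds"]:
                return True
    # Rule 1(c): some pair of preconditions was mutex at the previous level
    return any(frozenset((p, q)) in prop_mutex_prev
               for p in a1["preconds"] for q in a2["preconds"])

def props_mutex(p1, p2, supporters, action_mutexes):
    # Rule 2: every supporter of p1 is pairwise mutex with every supporter of p2
    return all(frozenset((s1, s2)) in action_mutexes
               for s1 in supporters[p1] for s2 in supporters[p2])
```
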
Slide 13
[Figure: the one-level planning graph again -- proposition level 0 (onT-A, onT-B, cl-A, cl-B, he); action level 1 (Pick-A, Pick-B and noops); proposition level 1 (onT-A, onT-B, cl-A, cl-B, he, h-A, h-B, ~cl-A, ~cl-B, ~he)]

Slide 14
[Figure: the two-level planning graph again -- action level 2 (St-A-B, St-B-A, Ptdn-A, Ptdn-B, Pick-A, Pick-B and noops); proposition level 2 adds on-A-B and on-B-A]

Slide 15
Here is how it goes. We know that at every time step we are really only going to do one non-noop action. So, at the first level, either Pick-A or Pick-B is done. If one of them is done, the other can't be. So we put red arrows to signify that these actions are mutually exclusive.
Now we can PROPAGATE the mutex relations to the proposition levels.
Rule 1. Two actions a1 and a2 are mutex if
  (a) both of the actions are non-noop actions, or
  (b) a1 is a noop action supporting P, and a2 either needs ~P or gives ~P, or
  (c) some precondition of a1 is marked mutex with some precondition of a2.
By this rule Pick-A is mutex with Pick-B. Similarly, the noop action for he is mutex with Pick-A.
Rule 2. Two propositions P1 and P2 are marked mutex if all actions supporting P1 are pairwise mutex with all actions supporting P2.
By this rule, h-A and h-B are mutex in level 1, since the only action giving h-A is mutex with the only action giving h-B.
~cl(B) and he are mutex in the first level, but are not mutex in the second level: in level 2, ~cl(B) is supported by a noop and Stack-A-B (among others), and he is supported by Stack-A-B and a noop (among others). At least one action supporting the first -- Stack-A-B -- is non-mutex with one action supporting the second -- Stack-A-B itself.

Slide 16
Some observations about the structure of the PG:
1. If an action a is present in level l, it will be present in all subsequent levels.
2. If a literal p is present in level l, it will be present in all subsequent levels.
3. If two literals p, q are not mutex in level l, they will never be mutex in subsequent levels.
   -- Mutex relations relax monotonically as we grow the PG.
Observations 1-3 imply that a PG can be represented efficiently in a bi-level structure: one level for propositions and one level for actions. For each proposition/action, we just track the first time instant it entered the PG; for each mutex relation, we track the first time instant it went away.
A PG is said to have leveled off if there are no differences between two consecutive proposition levels in terms of propositions or mutex relations.
  -- Even if you grow it further, no more changes can occur.

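A minimal sketch of the level-off test implied by these observations, assuming the graph is summarized per level as a set of propositions and a set of mutex pairs (illustrative names only):

```python
# props_at[l]: set of literals present at proposition level l
# mutexes_at[l]: set of frozenset({p, q}) pairs still mutex at level l

def has_leveled_off(props_at, mutexes_at, level):
    """True if levels `level` and `level-1` agree on propositions and mutex pairs."""
    return (level >= 1
            and props_at[level] == props_at[level - 1]
            and mutexes_at[level] == mutexes_at[level - 1])
```
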
Slide 17
Level-based heuristics on a planning graph with mutex relations
We now modify the h_lev heuristic as follows:
h_lev({p1, ..., pn}) = the index of the first level of the PG where p1, ..., pn appear together and no pair of them is marked mutex. (If there is no such level, then h_lev is set to l+1 if the PG has been expanded to l levels, and to infinity if it has been expanded until it leveled off.)
This heuristic is admissible. With it, we have a much better handle on both +ve and -ve interactions. In our example, it gives the following reasonable costs:
  h({~he, cl-A}) = 1
  h({~cl-B, he}) = 2
  h({he, h-A}) = infinity (because they remain marked mutex even in the final level of the leveled PG)
Works very well in practice.

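The mutex-aware set-level heuristic is then a direct scan over proposition levels. The sketch below reuses the assumed per-level summaries (`props_at`, `mutexes_at`) from the earlier snippets.

```python
import itertools, math

def h_lev(S, props_at, mutexes_at, leveled_off):
    """First level where all of S appears and no pair of S is marked mutex."""
    for level in range(len(props_at)):
        if all(p in props_at[level] for p in S) and not any(
                frozenset(pair) in mutexes_at[level]
                for pair in itertools.combinations(S, 2)):
            return level
    # S never appears mutex-free: l+1 if grown to l levels (len(props_at) == l+1),
    # infinity if the graph has leveled off
    return math.inf if leveled_off else len(props_at)
```
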
Slide 18
But Level doesn't always win over Sum
[Chart: solution length and time in seconds per problem; experiments run with 256 MB memory on a 500 MHz machine]

Slide 19
13th Feb

Slide 20
Questions on the PG?
Consider a set of subgoals {p,q,r,s}:
  - If the set appears at level 12 without any pair being mutex, is there guaranteed to be a 12-step plan to achieve {p,q,r,s}?
  - If {p,q} appear at level 12 without being mutex, is there a guaranteed 12-step plan to achieve {p,q}?
  - If {p} appears at level 12, is there a guaranteed 12-step plan to achieve {p}?
The PG only does approximate reachability analysis.

Slide 21
Massaging good (inadmissible) heuristics from h_sum
h_sum sometimes does better than h_lev (and vice versa).
Here are some heuristics that do better than both (they can be thought of as adjusting the sum heuristic):
  - Combo (add them both: h_sum + h_lev)
  - h_sum + a negative-interaction penalty
    - Degree of interaction: δ(p,q) = lev({p,q}) - max{lev(p), lev(q)}
      δ(p,q) = ∞ if p and q are statically mutex
      0 < δ(p,q) < ∞ if p and q are level-specific mutex
      δ(p,q) = 0 otherwise (p and q are non-interacting)
  - Relaxed-plan length + a negative-interaction penalty
So, start with h_sum and adjust it to account for the interactions it ignores.

Slide 22
Adjusted Sum Heuristic
Adjust the sum heuristic to take positive and negative interactions into account:
  HAdjSum2M(S) = length(RelaxedPlan(S)) + max_{p,q in S} δ(p,q)
The relaxed plan is computed by ignoring all the mutex relations.
The second component is an approximation of the penalty induced by ignoring the negative interactions.
Degree of interaction:
  δ(p,q) = lev({p,q}) - max{lev(p), lev(q)}
  δ(p,q) = ∞ if p and q are statically mutex
  0 < δ(p,q) < ∞ if p and q are level-specific mutex
  δ(p,q) = 0 otherwise (p and q are non-interacting)
[Nguyen & Kambhampati, AAAI 2000] -- shown to be quite effective.

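A sketch of the adjusted-sum computation, taking the relaxed-plan length and two precomputed level tables as inputs; the function and argument names are illustrative, not the AltAlt source.

```python
import itertools

def h_adjsum2m(S, relaxed_plan_len, lev_single, lev_pair):
    """Relaxed-plan length plus the worst pairwise interaction penalty delta(p,q)."""
    def delta(p, q):
        # lev_pair holds the first mutex-free level of {p,q}; inf if statically mutex
        return lev_pair[frozenset((p, q))] - max(lev_single[p], lev_single[q])
    penalty = max((delta(p, q) for p, q in itertools.combinations(S, 2)), default=0)
    return relaxed_plan_len + penalty
```
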
Slide 23
But Level doesn't always win over Sum
[Chart: solution length and time in seconds per problem; experiments run with 256 MB memory on a 500 MHz machine]

Slide 24
A family of "Adjusted" heuristics

Slide 25
AltAlt uses a hybrid of the Level and Sum heuristics:
  -- sacrifices admissibility
  -- uses a partial PG to keep the heuristic cost down

Slide 26
AltAlt Optimizations
The cost of computing the heuristic can be high:
  - Bi-level planning graph representation
  - Partial expansion of the PG
The branching factor can still be quite high:
  - Use the PG to prune unpromising choices
  - Select actions at lev(S) vs. at levels-off ("paction": pruning-actions strategy)

Slide 27
Bi-Level Planning Graph
Two layers of the graph, divided into ranks:
  - a Facts array
  - an Actions array
Increases graph-construction efficiency and reduces storage requirements.
  - Also quite useful for temporal planning problems... [see TGP]
[Figure: example arrays with pointers and level information. Facts: At(0,0), Key(0,1) at rank 0; At(0,1), At(1,0), At(1,1), HaveKey at later ranks. Actions: the noops for At(0,0) and Key(0,1), Move(0,0,0,1), Move(0,0,1,0) and their noops, Move(0,1,1,1), Move(1,0,1,1), Pick_Key(0,1).]

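One way to realize this bi-level layout in Python, with illustrative field names; the example entries follow the grid-and-key figure on the slide.

```python
from dataclasses import dataclass, field

@dataclass
class Fact:
    name: str
    first_rank: int                                   # level at which the fact enters
    supporters: list = field(default_factory=list)    # indices into `actions`

@dataclass
class Action:
    name: str
    first_rank: int                                   # level at which the action enters
    preconds: list = field(default_factory=list)      # indices into `facts`

facts = [Fact("At(0,0)", 0), Fact("Key(0,1)", 0), Fact("At(0,1)", 1)]
actions = [Action("At(0,0)_Noop", 1, preconds=[0]),
           Action("Move(0,0,0,1)", 1, preconds=[0])]
facts[2].supporters.append(1)      # At(0,1) is first given by Move(0,0,0,1)
```
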
Slide 28
Partial Expansion
Grow the graph only to level k (k < the level-off level):
  - Level(S) is then taken to be k+1, instead of infinity, if S is not present in the graph without mutex.
This limits the space and time resources expended on computing the planning graph, by trading away some heuristic quality.

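The fallback is a one-line wrapper around the set-level computation; here `set_level` is assumed to return None when S never appears mutex-free within the partially expanded graph.

```python
def partial_h_lev(S, set_level, k):
    """k is the level the PG was grown to (short of level-off)."""
    level = set_level(S)               # first mutex-free level of S, or None
    return level if level is not None else k + 1
```
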
Slide 29
PGs for reducing actions
If you just use the action instances at the final action level of a leveled PG, you are guaranteed to preserve completeness.
  - Reason: any action that can be done in a state that is even possibly reachable from the initial state is in that last level.
  - This cuts down the branching factor significantly.
  - Sometimes you take riskier gambles: if you are considering the goals {p,q,r,s}, just look at the actions that appear in the level preceding the first level where {p,q,r,s} appear together without mutex.

Slide 30
Limiting the Branching Factor using Planning Graphs
PACTION strategy: pick actions at lev(S), instead of at the last (leveled-off) level.
  - Rationale: lev(S) comprises the most significant actions for achieving the state S from the initial state.
  - Reduces the branching factor (albeit the strategy is incomplete).
[Figure: a regression state S with subgoals {C, D}; the PG levels from lev(S) out to levels-off, with candidate actions A1...An and mutexes marked]

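A sketch of the PACTION filter, treating "actions at lev(S)" loosely as "actions that have entered the graph by level lev(S)"; `action_first_level` and the lev(S) value are the assumed summaries from the earlier sketches.

```python
def paction_candidates(actions, action_first_level, lev_S):
    """Keep only actions present in the PG by level lev(S), not all leveled-off actions."""
    return [a for a in actions if action_first_level[a] <= lev_S]
```
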
Slide 31
AIPS-00 Schedule Domain
Do PG expansion only up to the level l at which all the top-level goals come in without being mutex.

Slide 32
Empirical Evaluation: Logistics domain
HAdjSum2M heuristic; problems and domains from the AIPS-2000 Planning Competition. (AltAlt was approximately in the top four.)