EBL & DDB for Graphplan (Planning Graph as Dynamic CSP: Exploiting EBL & DDB and other CSP Techniques in Graphplan) Subbarao Kambhampati, Arizona State University.

EBL & DDB for Graphplan (Planning Graph as Dynamic CSP: Exploiting EBL & DDB and other CSP Techniques in Graphplan) Subbarao Kambhampati, Arizona State University, http://rakaposhi.eas.asu.edu/yochan.html

Motivation
-- Graphplan has become quite influential
   -- 3 of 4 participants at the AIPS-98 competition used it
   -- It is worth understanding and improving the algorithm
-- (Backward) search of the planning graph is a big bottleneck for Graphplan
   -- The planning graph is very closely related to Dynamic CSP (which in turn is related to CSP)
   -- Exploit CSP search techniques to improve planning-graph search

Overview
-- Connections between Planning Graph and CSP
-- Review and critique of inefficiencies of backward search on planning graph
-- EBL and DDB for improving memoization in Graphplan
   -- The idea
   -- Empirical evaluation
      -- Up to 1000x speedup
      -- Effectiveness on random-restart search
      -- Utility of memos
-- Augmenting with FC, DVO & sticky values
-- Conclusions and further directions

Constructing Planning Graph (Graphplan review)
[Figure: planning graph with initial propositions I1-I3, actions A5-A11, propositions P1-P6, actions A1-A4 and goals G1-G4; three entries are marked X.]

Planning Graph as a Dynamic CSP
-- Propositions become DCSP variables; actions are the values; mutex constraints are normal constraints.
-- Action preconditions are activation constraints.
Solving DCSP [Mittal & Falkenhainer, 1990]:
   V := set of initially active variables
   Loop until all active variables are assigned
      Assign currently active variables
      Set V := variables that become active
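The solving loop above can be sketched as a plain backtracking search. This is a minimal illustration, not Graphplan's actual code: the function name `solve_dcsp` and the data structures `domains`, `activates` and `mutex` are hypothetical, and the tiny example at the end is made up.

```python
def solve_dcsp(initial, domains, activates, mutex):
    """Return {variable: value} for all active variables, or None.

    initial   -- variables that are active at the start (hypothetical input)
    domains   -- variable -> list of candidate values (actions)
    activates -- value -> variables that become active when it is chosen
    mutex     -- set of frozenset({value_a, value_b}) pairs that cannot co-occur
    """
    def consistent(value, assignment):
        # A value is usable if it is not mutex with any value already chosen.
        return all(frozenset((value, other)) not in mutex
                   for other in assignment.values())

    def search(pending, assignment):
        if not pending:                  # every active variable is assigned
            return assignment
        var, rest = pending[0], pending[1:]
        if var in assignment:            # already handled via another activation
            return search(rest, assignment)
        for value in domains[var]:
            if consistent(value, assignment):
                # Variables activated by this value are queued after the
                # currently active ones, so the current set is assigned first.
                newly_active = [v for v in activates.get(value, ())
                                if v != var and v not in assignment
                                and v not in rest]
                result = search(rest + newly_active,
                                {**assignment, var: value})
                if result is not None:
                    return result
        return None                      # no value works: backtrack

    return search(list(initial), {})


# Made-up example: A1 and A2 are mutex, so G2 must be supported by A3.
example_domains = {"G1": ["A1"], "G2": ["A2", "A3"], "P1": ["A5"], "P2": ["A6"]}
example_activates = {"A1": ["P1"], "A2": ["P2"], "A3": []}
example_mutex = {frozenset(("A1", "A2"))}
print(solve_dcsp(["G1", "G2"], example_domains, example_activates, example_mutex))
```

Because newly activated variables go to the back of the queue, the sketch assigns all currently active variables before moving on, which mirrors the level-by-level behaviour described on the slide.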

Converting a Dynamic CSP to a normal CSP
-- Introduce a new null value into the domain of every variable.
-- Inactive variables have the null value.
-- Activation constraints become "can't have null value" constraints.
-- Useful for interpreting the mutex propagation step as a partial directed 2-consistency enforcement procedure.
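A minimal sketch of this conversion, under assumed data structures: `domains` maps each variable to its values and `activates` maps a value to the variables it activates. The names `NULL`, `to_static_csp` and `activation_ok` are hypothetical and only illustrate the idea.

```python
NULL = None  # assumed marker for "this variable is inactive"


def to_static_csp(domains, activates):
    """Add a null value to every domain and build the activation check."""
    static_domains = {var: list(values) + [NULL]
                      for var, values in domains.items()}

    def activation_ok(assignment):
        # An activation constraint is violated when a chosen value activates
        # a variable that has nevertheless been given the null value.
        for value in assignment.values():
            if value is NULL:
                continue
            for child in activates.get(value, ()):
                if child in assignment and assignment[child] is NULL:
                    return False
        return True

    return static_domains, activation_ok


static_domains, activation_ok = to_static_csp(
    {"G1": ["A1"], "P1": ["A5"]}, {"A1": ["P1"]})
print(static_domains["P1"])                        # ['A5', None]
print(activation_ok({"G1": "A1", "P1": NULL}))     # False
```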

Backward Search & its Problems (Graphplan review)
[Figure: the planning graph from the construction slide, with initial propositions I1-I3, actions A5-A11, propositions P1-P6, actions A1-A4 and goals G1-G4; three entries are marked X.]
A naïve implementation of DCSP search

Explaining Failures with Conflict Sets
-- Whenever P can't be given a value v because it conflicts with the assignment of Q, add Q to P's conflict set.
-- Conflict set for P4 = {P4, P2, P1}
[Figure: propositions P1-P6 with supporting actions A5-A11; three entries are marked X.]

DDB & Memoization (EBL) with Conflict Sets
[Figure: propositions P1-P6 with supporting actions A5-A11; three entries are marked X.]
When we reach a variable V with conflict set C during backtracking:
-- Skip other values of V if V is not in C (DDB)
-- Absorb C into the conflict set of V if V is in C
-- Store C as a memo if V is the first variable at this level
Example:
-- Conflict set for P3 = {P3, P2}
-- Conflict set for P4 = {P4, P2, P1}
   -- Skip over P3 when backtracking from P4
-- Conflict set for P2 = {P4, P2, P1}
   -- Absorb the conflict set being passed up
-- Conflict set for P1 = {P4, P2, P1, P3}
-- Store {P1, P2, P3, P4} as a memo
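The three backtracking rules can be sketched as follows. This is a simplified illustration that searches a single level only (the recursion into the next level and the regression of its conflict set are left out); `search_level` and `conflicts_with` are hypothetical names, and the example data is made up rather than taken from the figure.

```python
def search_level(variables, domains, conflicts_with):
    """Return (assignment, None) on success or (None, conflict_set) on failure.

    conflicts_with(a, b) is an assumed predicate: True if values a and b
    cannot be used together (e.g. the two actions are mutex).
    """
    def assign(index, assignment):
        if index == len(variables):
            return dict(assignment), None
        var = variables[index]
        conflict = {var}                       # a variable explains its own failure
        for value in domains[var]:
            # Find the earliest assigned variable whose value rules this one out.
            culprit = next((other for other, chosen in assignment.items()
                            if conflicts_with(value, chosen)), None)
            if culprit is not None:
                conflict.add(culprit)
                continue
            assignment[var] = value
            solution, sub_conflict = assign(index + 1, assignment)
            del assignment[var]
            if solution is not None:
                return solution, None
            if var not in sub_conflict:
                # DDB: var played no part in the failure, so its other
                # values cannot help; pass the conflict set straight up.
                return None, sub_conflict
            conflict |= sub_conflict           # absorb the conflict set
        return None, conflict                  # at index 0 this is the memo

    return assign(0, {})


# Made-up example: both values of P4 clash with P1 or P2, never with P3.
pairs = {frozenset(("a4", "a1")), frozenset(("b4", "a2"))}
domains = {"P1": ["a1"], "P2": ["a2"], "P3": ["a3", "b3"], "P4": ["a4", "b4"]}
solution, memo = search_level(["P1", "P2", "P3", "P4"], domains,
                              lambda x, y: frozenset((x, y)) in pairs)
print(solution, sorted(memo))
```

In this example the second value of P3 is never tried, and the conflict set returned from the first variable leaves P3 out, which is what makes the stored memo smaller than the full goal set.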

Regressing Conflict Sets
[Figure: propositions P1-P6 at one level; goals G1-G4 supported by actions A1-A4 at the previous level.]
-- {P1, P2, P3, P4} regresses to {G1, G2}
   -- P1 could have been regressed to G4, but G1 was assigned earlier
   -- We can skip over G4 & G3 (DDB)
-- Regression: what is the minimum set of goals at the previous level whose chosen action supports generate a sub-goal set that covers the memo?
   -- Minimal set
   -- When there is a choice, choose a goal that has been assigned earlier
   -- Supports more DDB
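One way to approximate the regression step is a greedy pass that maps each memo proposition to the earliest-assigned goal whose chosen action requires it. This is a sketch under assumptions: the function name `regress` and its inputs are hypothetical, the greedy choice is not guaranteed to give a minimum set, and the precondition lists in the example are invented for illustration because the figure is not preserved here.

```python
def regress(memo, goal_order, chosen_action, preconditions):
    """Map a failed sub-goal set back to goals at the previous level.

    memo          -- propositions in the conflict set / memo
    goal_order    -- previous-level goals, in the order they were assigned
    chosen_action -- goal -> action currently supporting it
    preconditions -- action -> set of propositions it requires
    """
    regressed = set()
    for prop in memo:
        # Prefer the earliest-assigned goal, which allows longer backjumps.
        for goal in goal_order:
            if prop in preconditions[chosen_action[goal]]:
                regressed.add(goal)
                break
    return regressed


# Invented preconditions: P1 is needed by both A1 and A4.
chosen = {"G1": "A1", "G2": "A2", "G3": "A3", "G4": "A4"}
precs = {"A1": {"P1", "P2"}, "A2": {"P3", "P4"}, "A3": {"P5"}, "A4": {"P6", "P1"}}
print(sorted(regress({"P1", "P2", "P3", "P4"},
                     ["G1", "G2", "G3", "G4"], chosen, precs)))  # ['G1', 'G2']
```

With these invented preconditions P1 regresses to G1 rather than G4, so G3 and G4 are absent from the regressed set and can be skipped when backtracking.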

Using EBL Memos
-- If any stored memo is a subset of the current goal set, backtrack immediately
   -- Return the memo as the conflict set
-- Smaller memos are more general and thus prune more failing branches
-- Costlier memo-matching strategy
   -- Clever indexing techniques available: Set Enumeration Trees [Rymon, KRR92]; UBTrees [Hoffmann & Koehler, IJCAI-99]
-- Allows generation of more effective memos at higher levels; not possible with normal memoization
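The subset test can be sketched with plain Python sets. The function name `matching_memo` is hypothetical, and the linear scan stands in for the indexing structures cited on the slide, which are what would make the check cheap in practice.

```python
def matching_memo(goal_set, memos):
    """Return a stored memo contained in goal_set, or None if there is none."""
    goals = frozenset(goal_set)
    for memo in memos:
        if memo <= goals:       # the memo is a subset of the current goals
            return memo         # fail now and use the memo as the conflict set
    return None


memos = [frozenset({"P1", "P4"}), frozenset({"P2", "P5", "P6"})]
print(matching_memo({"P1", "P2", "P3", "P4"}, memos))   # the memo {P1, P4}
print(matching_memo({"P2", "P3", "P5"}, memos))         # None
```

The first call matches even though the goal set has four members and the memo only two, which is why smaller memos prune more branches than full goal sets would.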

Speedup provided by EBL/DDB
-- Times for GP+EBL include GC times, while those for Graphplan do not, so these are conservative estimates of the possible speedups
-- Experiments were run on a 500 MHz Linux machine with Harman Kardon speakers

Speedups are correlated with memo-length reduction

Subset memoization alone does not help...
-- Since EBL stores only the parts of the goal set that cause the failure, it requires the ability to check whether a stored memo is a subset of the current goal set
-- However, subset memo-checking is not by itself enough to bring impressive savings in Graphplan performance

Utility issues with Graphplan Memos
-- EBL strategies typically suffer from significant utility problems
   -- Cost of storing no-goods; cost of matching the no-goods
      -- Solver needs to selectively forget learned no-goods (size-based learning, relevance-based learning, etc.)
   -- Why is this not a significant issue with GP+EBL?
      -- Reason: memos correspond to a very conservative form of no-good learning

Memoization as a very conservative form of no-good learning
-- No-goods are compound assignments that cannot be part of a solution
-- Memos are subsets of variables at some level i that cannot be active together
   -- Each m-sized memo corresponds to the conjunction of $d^m$ no-goods
   -- There are $O((d+2)^n)$ no-goods but only $O(l \cdot 2^{n/l})$ memos
   -- Only the memos from the current level are checked during search
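To get a feel for the gap between the two bounds, the expressions can simply be evaluated. The slide does not define its symbols, so this sketch assumes d is the domain size, n the total number of variables, l the number of levels and m the memo size; the function names and the sample numbers are made up.

```python
def nogood_bound(d, n):
    """Evaluate (d + 2)^n, the slide's bound on the number of no-goods."""
    return (d + 2) ** n


def memo_bound(n, levels):
    """Evaluate l * 2^(n/l); assumes n is divisible by the number of levels."""
    return levels * 2 ** (n // levels)


def nogoods_per_memo(d, m):
    """Evaluate d^m, the number of no-goods one m-sized memo stands for."""
    return d ** m


# Assumed sample sizes: 12 variables over 4 levels, 3 values per variable.
print(nogood_bound(3, 12))     # 244140625
print(memo_bound(12, 4))       # 32
print(nogoods_per_memo(3, 2))  # 9
```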

Sticky Values as a partial antidote to the conservatism of memos
-- Idea: Whenever we skip over a variable V during DDB, we record the current value u of V. When we come back down to V (after having re-assigned its ancestors), we first try the value u for V
-- Leads to up to 4x further speedup over and above EBL/DDB
-- Problem: Memoization ignores no-goods of the type
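The sticky-value idea amounts to a small change in value ordering. A minimal sketch, assuming a dictionary `sticky` that the search fills in whenever DDB skips over a variable; the names `remember_skipped` and `ordered_values` are hypothetical.

```python
def remember_skipped(sticky, var, current_value):
    """Record the value var held when DDB skipped over it."""
    sticky[var] = current_value


def ordered_values(var, domain, sticky):
    """Return var's values with the remembered (sticky) value first, if any."""
    remembered = sticky.get(var)
    if remembered in domain:
        return [remembered] + [v for v in domain if v != remembered]
    return list(domain)


sticky = {}
remember_skipped(sticky, "P3", "A8")
print(ordered_values("P3", ["A7", "A8", "A9"], sticky))  # ['A8', 'A7', 'A9']
print(ordered_values("P4", ["A10", "A11"], sticky))      # ['A10', 'A11']
```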

Utility of FC & DVO
-- DVO and FC did not lead to significant improvements
   -- Many problems that are solvable for EBL are still unsolvable for FC and DVO
   -- Currently considering ordering heuristics based on distance metrics

Adding FC/DVO to EBL/DDB
-- Adding DVO to EBL/DDB is straightforward
-- Adding FC is tricky: conflict sets must contain the variables whose values lead to FC pruning
-- FC/DVO can give up to a further 2x speedup over EBL

EBL/DDB & Randomized Search
-- Random-restart systematic searches place a limit on the number of backtracks and a limit on the number of restarts [Gomes, Selman, Kautz; AAAI-99]
   -- Whenever the backtrack limit is exceeded, the search restarts from the top of the search tree
   -- Randomization is used so that a different part of the search tree is explored on different restarts
-- Implemented random-restart strategy for Graphplan
   -- Limit the number of inter-level backtracks
   -- Randomize the order in which the actions are considered for supporting a goal
-- EBL/DDB can help by getting more mileage out of the given backtrack/restart limits
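The restart scheme can be sketched on a generic backtracking search. This is an illustration only: it counts every dead end rather than inter-level backtracks, it uses 4-queens as a stand-in problem, and the names `restart_search`, `BacktrackLimit` and `queens_ok` are hypothetical.

```python
import random


class BacktrackLimit(Exception):
    """Raised when the current run has used up its backtrack budget."""


def restart_search(variables, domains, consistent,
                   max_backtracks, max_restarts, seed=0):
    rng = random.Random(seed)
    for _ in range(max_restarts):
        budget = [max_backtracks]          # remaining backtracks for this run

        def extend(assignment, index):
            if index == len(variables):
                return dict(assignment)
            var = variables[index]
            values = list(domains[var])
            rng.shuffle(values)            # randomized value (action) ordering
            for value in values:
                if consistent(var, value, assignment):
                    assignment[var] = value
                    result = extend(assignment, index + 1)
                    del assignment[var]
                    if result is not None:
                        return result
            budget[0] -= 1                 # dead end: one backtrack used
            if budget[0] < 0:
                raise BacktrackLimit
            return None

        try:
            return extend({}, 0)           # solution, or None if exhausted
        except BacktrackLimit:
            continue                       # restart from the top of the tree
    return None                            # restart limit reached


def queens_ok(row, col, assignment):
    # Queens in different rows must not share a column or a diagonal.
    return all(col != c and abs(col - c) != abs(row - r)
               for r, c in assignment.items())


rows = list(range(4))
print(restart_search(rows, {r: range(4) for r in rows}, queens_ok,
                     max_backtracks=1000, max_restarts=10))
```

A search that learns memos or backjumps wastes fewer of the allotted backtracks per run, which is the sense in which EBL/DDB gets more mileage out of a fixed limit.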

Evaluating the utility of EBL/DDB in Randomized Systematic Search
-- EBL/DDB allows Graphplan with randomized search to achieve significantly better solvability as well as higher-quality (shorter) plans

Exploiting the déjà vu property of Graphplan backward search [with Zimmerman, AAAI-99]
-- Graphplan's backward search in successive levels has a lot of symmetry and redundancy; only a part of this is exploited by EBL memos
-- Idea: store a larger trace of the search tree at level k, and replay it at level k+1
-- EBL/DDB help make the trace much smaller

Abstracting Resources (Teasing apart Planning and Scheduling) [with Srivastava, ECP-99]
-- Most planners thrash by addressing planning and scheduling considerations together
   -- E.g., Blocks World with multiple robot hands
-- Idea: abstract resources away during planning
   -- Make the assumption of infinite resources
   -- Do a post-planning resource allocation phase
   -- Re-plan if needed

Conclusions
-- The planning graph can be seen as a CSP
   -- The Dynamic CSP model corresponds closely to Graphplan's backward search
-- Adding EBL/DDB strategies to Graphplan's backward search results in impressive speedups
   -- EBL enables Graphplan to learn smaller and more useful memos
   -- EBL/DDB capabilities are more useful than FC/DVO capabilities for Graphplan
   -- Memos are a very conservative form of no-good learning
      -- Sticky values help offset some disadvantages of this conservatism
   -- EBL/DDB can also help in the context of random-restart search
