RePOP: Reviving Partial Order Planning

Slides:

Advertisements

Similar presentations

REVIEW : Planning To make your thinking more concrete, use a real problem to ground your discussion. –Develop a plan for a person who is getting out of.

Advertisements

Informed search algorithms

Constraint Satisfaction Problems Russell and Norvig: Chapter

Causal-link Planning II José Luis Ambite. 2 CS 541 Causal Link Planning II Planning as Search State SpacePlan Space AlgorithmProgression, Regression POP.

Fragment-Based Conformant Planning James Kurien Palo Alto Research Center Pandu NayakStratify, Inc. David E. SmithNASA Ames Research Center.

CLASSICAL PLANNING What is planning ?  Planning is an AI approach to control  It is deliberation about actions  Key ideas  We have a model of the.

1 Constraint Satisfaction Problems A Quick Overview (based on AIMA book slides)

Artificial Intelligence Constraint satisfaction problems Fall 2008 professor: Luigi Ceccaroni.

1 Graphplan José Luis Ambite * [* based in part on slides by Jim Blythe and Dan Weld]

Solving Problem by Searching

Effective Approaches for Partial Satisfaction (Over-subscription) Planning Romeo Sanchez * Menkes van den Briel ** Subbarao Kambhampati * * Department.

Graph-based Planning Brian C. Williams Sept. 25 th & 30 th, J/6.834J.

Planning Graphs * Based on slides by Alan Fern, Berthe Choueiry and Sungwook Yoon.

Utilizing Problem Structure in Local Search: The Planning Benchmarks as a Case Study Jőrg Hoffmann Alberts-Ludwigs-University Freiburg.

Planning Copyright, 1996 © Dale Carnegie & Associates, Inc. Chapter 11.

Constraint Satisfaction CSE 473 University of Washington.

Ryan Kinworthy 2/26/20031 Chapter 7- Local Search part 1 Ryan Kinworthy CSCE Advanced Constraint Processing.

CSE 5731 Lecture 21 State-Space Search vs. Constraint- Based Planning CSE 573 Artificial Intelligence I Henry Kautz Fall 2001.

A: A Unified Brand-name-Free Introduction to Planning Subbarao Kambhampati Jan 28 th My lab was hacked and the systems are being rebuilt.. Homepage is.

18 th Feb Using reachability heuristics for PO planning Planning using Planning Graphs.

Constraint Satisfaction Problems

Planning Copyright, 1996 © Dale Carnegie & Associates, Inc. Chapter 11.

Reviving Integer Programming Approaches for AI Planning: A Branch-and-Cut Framework Thomas Vossen Leeds School of Business University of Colorado at Boulder.

Chapter 5 Outline Formal definition of CSP CSP Examples

RePOP: Reviving Partial Order Planning

Jonathon Doran. The Planning Domain A domain describes the objects, facts, and actions in the universe. We may have a box and a table in our universe.

Dana Nau: Lecture slides for Automated Planning Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License:

Heuristics in Search-Space CSE 574 April 11, 2003 Dan Weld.

Informed search algorithms Chapter 4. Best-first search Idea: use an evaluation function f(n) for each node –estimate of "desirability"  Expand most.

Planning as Propositional Satisfiabililty Brian C. Williams Oct. 30 th, J/6.834J GSAT, Graphplan and WalkSAT Based on slides from Bart Selman.

Review: Tree search Initialize the frontier using the starting state While the frontier is not empty – Choose a frontier node to expand according to search.

For Friday No reading Homework: –Chapter 11, exercise 4.

Automated Planning and Decision Making Prof. Ronen Brafman Automated Planning and Decision Making Partial Order Planning Based on slides by: Carmel.

June 6 th, 2005 ICAPS-2005 Workshop on Constraint Programming for Planning and Scheduling 1/12 Stratified Heuristic POCL Temporal Planning based on Planning.

Chapter 5 Constraint Satisfaction Problems

AI Lecture 17 Planning Noémie Elhadad (substituting for Prof. McKeown)

RePOP: Reviving Partial Order Planning XuanLong Nguyen & Subbarao Kambhampati Yochan Group:

Best-first search Idea: use an evaluation function f(n) for each node –estimate of "desirability"  Expand most desirable unexpanded node Implementation:

© Daniel S. Weld 1 Logistics Travel Wed class led by Mausam Week’s reading R&N ch17 Project meetings.

Graphplan: Fast Planning through Planning Graph Analysis Avrim Blum Merrick Furst Carnegie Mellon University.

Graphplan CSE 574 April 4, 2003 Dan Weld. Schedule BASICS Intro Graphplan SATplan State-space Refinement SPEEDUP EBL & DDB Heuristic Gen TEMPORAL Partial-O.

Temporal Planning with Continuous Change J.Scott Penbrethy Daniel S. Weld Presented by - Parag.

IBM Labs in Haifa © 2005 IBM Corporation Assumption-based Pruning in Conditional CSP Felix Geller and Michael Veksler.

Heuristic Search Planners. 2 USC INFORMATION SCIENCES INSTITUTE Planning as heuristic search Use standard search techniques, e.g. A*, best-first, hill-climbing.

EBL & DDB for Graphplan (P lanning Graph as Dynamic CSP: Exploiting EBL&DDB and other CSP Techniques in Graphplan) Subbarao Kambhampati Arizona State University.

1 Chapter 6 Planning-Graph Techniques. 2 Motivation A big source of inefficiency in search algorithms is the branching factor  the number of children.

Review: Tree search Initialize the frontier using the starting state

CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12

Inference and search for the propositional satisfiability problem

Planning as Satisfiability

Planning as Search State Space Plan Space Algorihtm Progression

Classical Planning via State-space search

CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12

Constraint Satisfaction

CSPs: Search and Arc Consistency Computer Science cpsc322, Lecture 12

Class #17 – Thursday, October 27

Graph-based Planning Slides based on material from: Prof. Maria Fox

Graphplan/ SATPlan Chapter

Informed search algorithms

Informed search algorithms

Class #19 – Monday, November 3

Chapter 6 Planning-Graph Techniques

Heuristics Local Search

CO Games Development 1 Week 8 Depth-first search, Combinatorial Explosion, Heuristics, Hill-Climbing Gareth Bellaby.

Constraint satisfaction problems

Graphplan/ SATPlan Chapter

Graphplan/ SATPlan Chapter

Constraint Satisfaction Problems

[* based in part on slides by Jim Blythe and Dan Weld]

Reading: Chapter 4.5 HW#2 out today, due Oct 5th

Presentation transcript:

RePOP: Reviving Partial Order Planning XuanLong Nguyen & Subbarao Kambhampati {xuanlong,rao}@asu.edu Yochan Group: http://rakaposhi.eas.asu.edu/yochan

In the beginning it was all POP. The good times return with RePOP Then it was cruelly UnPOPped In a way this paper describes a morality play.. A triumph of good over not-so-good. In the beginning, all planners were partial order planners. They came in great many varieties. They weren’t exactly nimble on their feet, but they were nice. You could write papers by making one solve a slightly larger blocks world problem. In short, there was peace. Then, cruel fate, bearing an uncanny resemblence to Drew McDermott intervened. The peaceful era of POP was cruelly ripped open by heuristic planners. These planners were from the other side of the tracks. They used--aaugh… it pains me to say this.. STATE SPACE SEARCH--something that everyone thought was put to rest by the sacerdotal piety of Earl. There was an implication that only State Space planners can be controlled with heuristics. The days of POP, they said, were over. NOT SO FAST… say we. We cleverly copt the same heuristics to repair, revive and REPOP the partial order planners. In the beginning it was all POP.

A recent (turbulent) history of planning 1995 1997 2000 - UCPOP [Penberthy &Weld] IxTeT [Ghallab et al] The whole world believed in POP and was happy to stack 6 blocks! Advent of CSP style compilation approach: Graphplan [Blum & Furst] SATPLAN [Kautz & Selman] Use of reachability analysis and Disjunctive constraints Domination of heuristic state search approach: HSP/R [Bonet & Geffner] UNPOP [McDermott]: POP is dead! Importance of good Domain-independent heuristics Hoffman’s FF – a state search planer swept through AIPS-00 competition! NASA’s highly publicized RAX still a POP dinosaur POP believed to be good framework to handle temporal and resource planning [Smith et al, 2000] UCPOP UNPOP RePOP

RePOP: A revival for partial order planning Outline RePOP: A revival for partial order planning To show that POP can be made very efficient by exploiting the same ideas that scaled up state search and Graphplan planners Effective heuristic search control Use of reachability analysis Handling of disjunctive constraints RePOP, implemented on top of UCPOP Dramatically better than all known partial-order planners Outperforms Graphplan and competitive with state search planners in many (parallel) domains

Partial plan representation POP background Partial plan representation P = (A,O,L,OC,UL) A: set of action steps in the plan S0 ,S1 ,S2 …,Sinf O: set of action ordering Si < Sj ,… L: set of causal links OC: set of open conditions (subgoals remain to be satisfied) UL: set of unsafe links where p is deleted by some action Sk I={q1 ,q2 } G={g1 ,g2 } p q1 S1 S3 g1 p Si Sj g2 S0 Sinf g2 p Si Sj oc1 oc2 S2 ~p Flaw: Open condition OR unsafe link Solution plan: A partial plan with no remaining flaw Every open condition must be satisfied by some action No unsafe links should exist (i.e. the plan is consistent)

Algorithm POP background g1 g2 S0 Sinf 1. Let P be an initial plan p 2. Flaw Selection: Choose a flaw f (either open condition or unsafe link) 3. Flaw resolution: If f is an open condition, choose an action S that achieves f If f is an unsafe link, choose promotion or demotion Update P Return NULL if no resolution exist 4. If there is no flaw left, return P else go to 2. 2. Plan refinement (flaw selection and resolution): p q1 S1 S3 g1 S0 Sinf g2 g2 oc1 oc2 S2 ~p Choice points Flaw selection (open condition? unsafe link?) Flaw resolution (how to select (rank) partial plan?) Action selection (backtrack point) Unsafe link selection (backtrack point)

Our approach (main ideas) State-space idea of distance heuristic 1. Ranking partial plans: use an effective distance-based heuristic estimator . 2. Exploit reachability analysis: use invariants to discover implicit conflicts in the plan. 3. Unsafe links are resolved by posting disjunctive ordering constraints into the partial plan: avoid unnecessary and exponential multiplication of failures due to promotion/demotion splitting CSP ideas of consistency enforcement

1. Ranking partial plans using distance-based heuristic h(P) = 2 1. Ranking Function: f(P) = g(P) + w h(P) g(P): number of actions in P h(P): estimate of number of new actions needed to refine P to become a solution plan w: increase the greediness of the heuristic search 2. Estimating h(P) h(P) is estimated by relaxing some constraints present in the partial plan P Negative effects of actions are relaxed P has no unsafe link flaws h(P) becomes the number of actions (cost(S) ) needed to achieve the set of open condition S from the initial state p S3 q1 S1 g1 S0 S5 g2 Sinf g2 oc1 oc2 S4 S2 ~p The basic idea is to relax certain constraints, and keep only the most important constraints in this estimation. Since h(P) is directly connected to the set of open conditions, and less so to the set of unsafe link, we would relax the set of unsafe link.

Distance-based heuristic estimate + Any state-space heuristic can be adapted + Relaxing negative effects makes the estimate inaccurate in serial domains. Estimate cost(S) 1. Build a planning graph PG from the initial state. 2. Cost(S) := 0 if all subgoals in S are in level 0. 3. Let p be a subgoal in S that appears last in PG. 4. Pick an action a in the graph that first achieves p 5. Update cost(S) := 1 + cost(S+Prec(a) – Eff(a)) 6. Replace S = S+Prec(a) – Eff(a), goto 2 p a 1 2 3 S S+Prec(a)-Eff(a)

2. Handling unsafe link flaws p Si Sj p 1. For each unsafe link threatened by another step Sk: Add disjunctive constraint to O Sk < Si V Si < Sj 2. Whenever a new ordering constraint is introduced to O, perform the constraint propagations: S1 < S2 V S3 < S4 ^ S4< S3  S1 < S2 S1 < S2 ^ S2 < S3  S1 < S3 S1 < S2 ^ S2 < S1  False Si Sj ~p q Prec(a) Sk Avoid the unnecessary exponential multiplication of failing partial plans

3. Detecting indirect conflicts using reachability analysis Reachability analysis to detect invariant: on(a,b) and clear(b) How to get state information in a partial plan 3. Cutset: Set of literals that must be true at some point during execution of plan For each action a, pre-C(Sk) = Prec(Sk) U {p | is a link and Si < Sk < Sj } post-C(Sk) = Eff(Sk) U {p | 4. If exists a cutset that violates of a variant, the partial plan is invalid and should be pruned p Si Sj q Sm Sn Prec(Sk) Sk Eff(Sk) p Si Sj Prec(Sk) + p + q Eff(Sk) + p + q p Si Sj Disadvantage: Inconsistency checking is passive and maybe expensive

Detecting indirect conflicts using reachability analysis Generalizing unsafe link: Sk threatens iff p is mutually exclusive (mutex) with either Prec(Sk) or Eff(Sk) Unsafe link is resolved by posting disjunctive constraints (as before) Sk < Si V Si < Sj p p Si Sj Si Sj q Sm Sn Prec(Sk) Sk Eff(Sk) Detects indirect conflicts early Derives more disjunctive constraints to be propagated

Experiments on RePOP RePOP is implemented on top of UCPOP planner using the three presented ideas Written in Lisp, runs on Linux, 500MHz, 250MB Compare RePOP against UCPOP, Graphplan and AltAlt in a number of benchmark domains Time Solution quality

Comparing planning time (summary) Repop vs. UCPOP Graphplan AltAlt RePOP is very good in parallel domains (gripper, logistics, rocket, parallel blocks world) Outperforms Graphplan in many domains Competitive with AltAlt Completely dominates UCPOP RePOP still inefficient in serial domains: Travel, Grid, 8-puzzle

Comparing planning time (time in seconds) Repop vs. UCPOP Graphplan AltAlt Comparing planning time (time in seconds) Problem UCPOP RePOP Graphplan AltAlt Gripper-8 - 1.01 66.82 .43 Gripper-10 2.72 47min 1.15 Gripper-20 81.86 15.42 Rocket-a 8.36 75.12 1.02 Rocket-b 8.17 77.48 1.29 Logistics-a 3.16 306.12 1.59 Logistics-b 2.31 262.64 1.18 Logistics-c 22.54 4.52 Logistics-d 91.53 20.62 Bw-large-a 45.78 (5.23) - 14.67 4.12 Bw-large-b (18.86) - 122.56 14.14 Bw-large-c (137.84) - 116.34

Some solution quality metrics Repop vs. UCPOP Graphplan AltAlt 1. Number of actions 2. Makespan: minimum completion time (number of time steps) 3. Flexibility: Average number of actions that do not have ordering constraints with other actions 1 3 Num_act=4 Makespan=2 Flex = 1 2 4 1 3 Num_act=4 Makespan=2 Flex = 2 2 4 1 2 3 4 Num_act=4 Makespan=4 Flex = 0

Comparing solution quality (summary) RePOP generates partially ordered plans Number of actions: RePOP typically returns shortest plans Number of time steps (makespan): Graphplan produces optimal number of time steps RePOP comes close Flexibility: RePOP typically returns the most flexible plans

Comparing solution quality Number of actions/ time steps Flexibility degree Problem RePOP Graphplan AltAlt Gripper-8 21/ 15 23/ 15 21/ 21 .57 .69 Gripper-10 27/ 19 29/ 19 27/ 27 .59 .61 Gripper-20 59/ 39 - 59/ 59 .68 Rocket-a 35/ 16 40/ 7 36/ 36 2.46 7.15 Rocket-b 34/15 30/ 7 34/ 34 7.29 4.80 Logistics-a 52/ 13 80/ 11 64/ 64 20.54 6.58 Logistics-b 42/ 13 79/ 13 53/ 53 20.0 5.34 Logistics-c 50/ 15 70/ 70 16.92 Logistics-d 69/ 33 85/ 85 22.84 Bw-large-a (8/5) - 11/ 4 9/ 9 2.75 2.0 Bw-large-b (11/8) - 18/ 5 11/ 11 3.28 2.67 Bw-large-c (17/ 10) - 19/ 19 5.06

Ablation studies CE: Consistency enforcement techniques (reachability analysis and disjunctive constraint handling HP: Distance-based heuristic Problem UCPOP + CE + HP +CE+HP (RePOP) Gripper-8 * 6557/ 3881 1299/ 698 Gripper-10 11407/ 6642 2215/ 1175 Gripper-12 17628/ 10147 3380/ 1776 Gripper-20 11097/ 5675 Rocket-a 30110/ 17768 7638/ 4261 Rocket-b 85316/ 51540 28282/ 16324 Logistics-a 411/ 191 847/ 436 Logistics-b 920/ 436 542/ 271 Logistics-c 4939/ 2468 7424/ 4796 Logistics-d 16572/ 10512

Conclusion Developed effective techniques for improving partial-order planners: Ranking partial plan heuristics, Disjunctive representation for unsafe links, Use of reachability analysis Presented and evaluated RePOP Brings POP to the realm of effective planning algorithms Can now exploit the flexibility of POP without too much efficiency penalty

Future Work Extend RePOP to deal with time and resource constraints Extend RePOP to deal with partially instantiated actions Improve the efficiency of RePOP in serial domains Serial domains inherent weakness of POP? Real-world domains tend to admit partially ordered plans Devising effective admissible heuristics for POP