Planning (AIMA Ch. 10) Planning problem defined Simple planning agent Types of plans Graph planning
What we have so far Can TELL KB about new percepts about the world KB maintains model of the current world state Can ASK KB about any fact that can be inferred from KB How can we use these components to build a planning agent? i.e., an agent that constructs plans that can achieve its goals, and that then executes these plans?
Planning Problem Find a sequence of actions that achieves a given goal when executed from a given initial world state. That is, given a set of operator descriptions (defining the possible primitive actions by the agent), an initial state description, and a goal state description or predicate, compute a plan, which is a sequence of operator instances, such that executing them in the initial state will change the world to a state satisfying the goal-state description. Goals are usually specified as a conjunction of subgoals to be achieved
Planning vs. Problem Solving Planning and problem solving methods can often solve the same sorts of problems Planning is more powerful because of the representations and methods used States, goals, and actions are decomposed into sets of sentences (usually in first-order logic) Search often proceeds through plan space rather than state space (though there are also state-space planners) Subgoals can be planned independently, reducing the complexity of the planning problem
Remember: Problem-Solving Agent function SIMPLE-PROBLEM-SOLVING-AGENT( percept) returns an action static: seq, an action sequence, initially empty state, some description of the current world state goal, a goal, initially null problem, a problem formulation state f.. UPDATE-STATE(state, percept) if seq is empty then goal f.. FORMULATE-GOAL(state) // What is the current state? // From LA to San Diego (given curr. state) problem f.. FORMULATE-PROBLEM(state, goal) seq f.. SEARCH( problem) action f.. RECOMMENDATION(seq, state) // e.g., Gas usage seq f.. REMAINDER(seq, state) return action // If fails to reach goal, update Note: This is offline problem-solving. Online problem-solving involves acting w/o complete knowledge of the problem and environment
Simple planning agent Use percepts to build model of current world state IDEAL-PLANNER: Given a goal, algorithm generates plan of action STATE-DESCRIPTION: given percept, return initial state description in format required by planner MAKE-GOAL-QUERY: used to ask KB what next goal should be
A Simple Planning Agent function SIMPLE-PLANNING-AGENT(percept) returns an action static: KB, a knowledge base (includes action descriptions) p, a plan (initially, NoPlan) t, a time counter (initially 0) local variables:G, a goal current, a current state description TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t)) current STATE-DESCRIPTION(KB, t) if p = NoPlan then G ASK(KB, MAKE-GOAL-QUERY(t)) p IDEAL-PLANNER(current, G, KB) if p = NoPlan or p is empty then action NoOp else action FIRST(p) p REST(p) TELL(KB, MAKE-ACTION-SENTENCE(action, t)) t t+1 return action Like popping from a stack
Goal of Planning Choose actions to achieve a certain goal But isn’t it exactly the same goal as for problem solving? Some difficulties with problem solving: The successor function is a black box: it must be “applied” to a state to know which actions are possible in that state and what are the effects of each one
B • Suppose that the goal is HAVE(milk). Goal of Planning Choose actions to achieve a certain goal B • Suppose that the goal is HAVE(milk). ut isn’t it exactly the same goal as for oblem solving? ome difficulties with problem solving: From some initial state where HAVE(milk) is not satisfied, prthe successor function must be repeatedly applied to S eventually generate a state where HAVE(milk) is satisfied. An explicit representation of the possible actions and their effects would help the problem solver select the relevant actions. possible in that state and what are the effects of eac h one Otherwise, in the real world an agent would be overwhelmed by irrelevant actions
Goal of Planning Choose actions to achieve a certain goal But isn’t it exactly the same goal as for problem solving? Some difficulties with problem solving: The goal test is another black-box function, states are domain-specific data structures, and heuristics must be supplied for each new problem
Goal of Planning Choose actions to achieve a certain goal But isn’t it exactly the same goal as for problem solving? Some difficulties with problem solving: The goal test is another black-box function, states are domain-specific data structures, and heuristics must be supplied for each new problem • Suppose that the goal is HAVE(milk) ^ HAVE(book). • Without an explicit representation of the goal, the problem solver cannot know that a state where HAVE(milk) is already achieved is more promising than a state where neither HAVE(milk) nor HAVE(book) is achieved.
Goal of Planning Choose actions to achieve a certain goal But isn’t it exactly the same goal as for problem solving? Some difficulties with problem solving: The goal may consist of several nearly independent subgoals, but there is no way for the problem solver to know it
Goal of Planning Choose actions to achieve a certain goal But isn’t it exactly the same goal as for pro So blem solving? me difficulties with problem solvin HAVE(milk) and HAVE(book) may be achieved by two nearly independent sequences of actions g: The goal may consist of several nearly independent subgoals, but there is no way for the problem solver to know it
Representations in Planning Planning opens up the black boxes by using logic to represent: Actions States Goals Problem solving Logic representation Planning
How to represent Planning Problem Classical planning: we will start with problems that are full observable, deterministic, statics environment with single agents
PDDL PDDL = Planning Domain Definition Language ← standard encoding language for “classical” planning tasks Components of a PDDL planning task: Objects: Things in the world that interest us. Predicates: Properties of objects that we are interested in; can be true or false. Initial state: The state of the world that we start in. Goal specification: Things that we want to be true. Actions/Operators: Ways of changing the state of the world. PDDL can describe what we need for search algorithm: 1) initial state 2) actions that are available in a state 3) result of applying an action 4) and goal test
PDDL operators States: Represented as a conjunction of fluent (ground, functionless atoms). Following are not allowed: At(x, y) (non-ground), ¬Poor (negation), and At(Father(Fred), Sydney) (use function symbol) Use closed-world assumption: Any fluent that are not mentioned are false. Use unique names assumption: States named differently are distinct. *Fluents: Predicates and functions whose values vary from by time (situation). *Ground term: a term with no variables. *Literal: atomic sentence. Initial state is conjunction of ground atoms. Goal is also a conjunction (∧) of literal (positive or negative) that may contain variables.
PDDL operators cont. Actions: described by a set of action schemas that implicitly define the ACTIONS(s) and RESULT(s, a). Action schema: composed of 1) action name, 2) list of all the variable used in the schema, 3)a precondition, and 4) an effect Action(Fly(p, from, to), PRECOND: At(p, from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to) EFFECT: ¬At(p, from) ∧ At(p, to) Value substituted for the variables: Action(Fly(P1, SFO, JFK), PRECOND: At(P1, SFO) ∧ Plane(P1) ∧ Airport(SFO) ∧ Airport(JFK) EFFECT: ¬At(P1, SFO) ∧ At(p, JFK) Action a is applicable in state s if the preconditions are satisfied by s. *preconditions: Condition which makes an action possible.
PDDL operators cont. Result of executing action a in state s is defined as a state s′ which is represented by the set of fluents formed by starting with s, removing the fluents what appear as negative literals in the action’s effects. For example: in Fly(P1, SFO, JFK) we remove At(P1, SFO) and add At(P1, JFK). a ∈ Actions(s) ⇔ s |= Precond(a) Result(s, a) = (s − Del(a)) ∪ Add(a) where Del(a) is the list of literals which appear negatively in the effect of a, and Add(a) is the list of positive literals in the effect of a. * The precondition always refers to time t and the effect to time t + 1.
PDDL: Cargo transportation planning problem
Major approaches Planning graphs State space planning Situation calculus Partial order planning Hierarchical decomposition (HTN planning) Reactive planning
Basic representations for planning Classic approach first used in STRIPS planner circa 1970 States represented as a conjunction of ground literals at(Home) Goals are conjunctions of literals, but may have existentially quantified variables at(?x) ^ have(Milk) ^ have(bananas) ... Do not need to fully specify state Non-specified either don’t-care or assumed false Represent many cases in small storage Often only represent changes in state rather than entire situation Unlike theorem prover, not seeking whether the goal is true, but is there a sequence of actions to attain it
Planning Problems A C A B B Start State C Goal On(B, C) On(A, B) Sparse encoding, but complete state spec On(C, A) On(A, Table) On(B, Table) Clear(C) Clear(B) A C A B Start State B C Goal Set of goal states, only requirements specified (think unary constraints) On(B, C) On(A, B) Action schema, instantiates to give specific ground actions Which goal first? ACTION: MoveToTable(b,x) PRECONDITIONS: On(b,x), Clear(b), Block(b), Block(x), (bx) POSTCONDITIONS: On(b,Table), Clear(x) On(b,x)
Operator/action representation Operators contain three components: Action description Precondition - conjunction of positive literals Effect - conjunction of positive or negative literals which describe how situation changes when operator is applied Example: Op[Action: Go(there), Precond: At(here) ^ Path(here,there), Effect: At(there) ^ ~At(here)] All variables are universally quantified Situation variables are implicit At(here) Path(here,there) Go(there) At(there) ~At(here) preconditions must be true in the state immediately before operator is applied; effects are true immediately after
Types of Plans C Sequential Plan A B Start State Sequential Plan MoveToTable(C,A) > Move(B,Table,C) > Move(A,Table,B) On(C, A) On(A, Table) On(B, Table) Clear(C) Clear(B) Block(A) … Partial-Order Plan MoveToTable(C,A) > Move(A,Table,B) Move(B,Table,C)
Forward Search Forward (progression) state-space search: ♦ Search through the space of states, starting in the initial state and using the problem’s actions to search forward for a member of the set of goal states. ♦ Prone to explore irrelevant actions ♦ Planning problems often have large state space. ♦ The state-space is concrete, i.e. there is no unassigned variables in the states.
Forward Search C A B Start State A B C Applicable actions On(C, A) On(A, Table) On(B, Table) Clear(C) Clear(B) Block(A) MoveToBlock(C,A,B) Applicable actions
Backward Search Backward (regression) relevant-state search: ♦ Search through set of relevant states, staring at the set of states representing the goal and using the inverse of the actions to search backward for the initial state. ♦ Only considers actions that are relevant to the goal (or current state). ♦ Works only when we know how to regress from a state description to the predecessor state description. ♦ Need to deal with partially instantiated actions and state, not just ground ones. (Some variable may not be assigned values.)
Backward Search A B ? B MoveToBlock(A,Table,B) MoveToBlock(A,x’,B) C A ACTION: MoveToBlock(b,x,y) PRECONDITIONS: On(b,x), Clear(b), Clear(y), Block(b), Block(y), (bx), (by), (xy) POSTCONDITIONS: On(b,y), Clear(x) On(b,x), Clear(y) A B ? B MoveToBlock(A,Table,B) MoveToBlock(A,x’,B) C A C Goal State On(B, C) On(A, B) Relevant actions
Planning “Tree” Have=T, Ate=F {Eat} {} Have=F , Ate=T Have=T , Ate=F Start: HaveCake Have=T, Ate=F Goal: AteCake, HaveCake {Eat} {} Action: Eat Pre: HaveCake Add: AteCake Have=F , Ate=T Have=T , Ate=F Del: HaveCake {Bake} {} {Eat} {} Have=T, Ate=T Have=F, Ate=T Have=F, Ate=T Have=T, Ate=F Action: Bake Pre: HaveCake Add: HaveCake
Reachable State Sets Have=T Ate=F Have=T, Ate=F Have=T, Ate=F {Eat} {} Have=F, Ate=T Have=T Ate=F Have=F, Ate=T Have=T, Ate=F {Bake} {} {Eat} {} Have=T, Ate=T Have=F, Ate=T Have=F, Ate=T Have=T, Ate=F Have=T, Ate=T Have=F, Ate=T Have=T, Ate=F
Approximate Reachable Sets Have=T, Ate=F Have={T}, Ate={F} Have=F, Ate=T Have=T, Ate=F Have={T,F}, Ate={T,F} (Have, Ate) not (T,T) (Have, Ate) not (F,F) Have=T, Ate=T Have=F, Ate=T Have=T, Ate=F Have={T,F}, Ate={T,F} (Have,Ate) not (F,F)
Planning Graphs HaveCake HaveCake Eat HaveCake AteCake AteCake Start: HaveCake Goal: AteCake, HaveCake HaveCake HaveCake Action: Eat Pre: HaveCake Add: AteCake Del: HaveCake Eat HaveCake AteCake Action: Bake Pre: HaveCake Add: HaveCake AteCake AteCake S0 A0 S1
Mutual Exclusion (Mutex) NEGATION Literals and their negations can’t be true at the HaveCake HaveCake Eat HaveCake same time AteCake P AteCake AteCake P S0 A0 S1
Mutual Exclusion (Mutex) INCONSISTENT EFFECTS An effect of one negates the effect of the other HaveCake HaveCake Eat HaveCake AteCake AteCake AteCake S0 A0 S1
Mutual Exclusion (Mutex) INCONSISTENT SUPPORT All pairs of actions that achieve two literals are mutex HaveCake HaveCake Eat HaveCake AteCake AteCake AteCake S0 A0 S1
Planning Graph Bake HaveCake HaveCake HaveCake Eat HaveCake Eat AteCake AteCake AteCake AteCake AteCake S0 A0 S1 A1 S2
Mutual Exclusion (Mutex) COMPETITION Preconditions are mutex; cannot both hold INCONSISTENT EFFECTS An effect of one negates the effect of the other
Mutual Exclusion (Mutex) INTERFERENCE One deletes a precondition of the other
Planning Graph
Observation 1
Observation 2 Actions monotonically increase (if they applied before, they still do)
Observation 3 Proposition mutex relationships monotonically decrease p q q q A r r r … … … Proposition mutex relationships monotonically decrease
Observation 4 Action mutex relationships monotonically decrease A A A q q q q B B B … r r r C C C s s s … … … Action mutex relationships monotonically decrease
Observation 5 Claim: planning graph “levels off” After some time k all levels are identical Because it’s a finite space, the set of literals cannot increase indefinitely, nor can the mutexes decrease indefinitely Claim: if goal literal never appears, or goal literals never become non-mutex, no plan exists If a plan existed, it would eventually achieve all goal literals (and remove goal mutexes – less obvious) Converse not true: goal literals all appearing non-mutex does not imply a plan exists
Heuristics: Ignore Preconditions Relax problem by ignoring preconditions Can drop all or just some preconditions Can solve in closed form or with set-cover methods
Heuristics: No-Delete Relax problem by not deleting falsified literals Can’t undo progress, so solve with hill-climbing (non-admissible) On(C, A) On(A, Table) On(B, Table) C A B Clear(C) Clear(B) ACTION: MoveToBlock(b,x,y) PRECONDITIONS: On(b,x), Clear(b), Clear(y), Block(b), Block(y), (bx), (by), (xy) POSTCONDITIONS: On(b,y), Clear(x) On(b,x), Clear(y)
Heuristics: Independent Goals Independent subgoals? A Partition goal literals Find plans for each subset cost(all) < cost(any)? cost(all) < sum-cost(each)? B C Goal State On(B, C) On(A, B) On(A, B) On(B, C)
Heuristics: Level Costs Planning graphs enable powerful heuristics Level cost of a literal is the smallest S in which it appears Max-level: goal cannot be realized before largest goal conjunct level cost (admissible) Sum-level: if subgoals are independent, goal cannot be realized faster than the sum of goal conjunct level costs (not admissible) Set-level: goal cannot be realized before all conjuncts are non-mutex (admissible) Bake HaveCake HaveCake HaveCake Eat HaveCake Eat HaveCake AteCake AteCake AteCake AteCake AteCake S0 A0 S1 A1 S2
Graphplan Graphplan directly extracts plans from a planning graph Graphplan searches for layered plans (often called parallel plans) More general than totally-ordered plans, less general than partially-ordered plans A layered plan is a sequence of sets of actions actions in the same set must be compatible all sequential orderings of compatible actions gives same result A C ? D B B D A C Layered Plan: (a two layer plan) move(A,B,TABLE) move(C,D,TABLE) ; move(B,TABLE,A) move(D,TABLE,C)
Solution Extraction: Backward Search Search problem: Start state: goal set at last level Actions: conflict-free ways of achieving the current goal set Terminal test: at S0 with goal set entailed by initial planning state Note: may need to start much deeper than the leveling-off point! Caching, good ordering is important
An example domain: the blocks world The domain consists of: A table, a set of cubic blocks and a robot arm; Each block is either on the table or stacked on top of another block The arm can pick up a block and move it to another position either on the table or on top of another block; The arm can only pick up one block at time, so it cannot pick up a block which has another block on top. A goal is a request to build one or more stacks of blocks specified in terms of what blocks are on top of what other blocks
State descriptions Blocks are represented by constants A, B, C, . . . etc. and an additional constant T able representing the table. The following predicates are used to describe states: On(b, x) block b is on x, where x is either another block or the table Clear(x) there is a clear space on x to hold a block
PDDL: Block-world problem A B C C B A Start State Goal State Initial state: On(C, A) ∧ OnT able(A) ∧ OnT able(B) ∧ Clear(B) ∧ Clear(C) Goal: On(A, B) ∧ On(B, C) ∧ OnT able(C)
Actions schemas Actions schemas: MOVE(b, x, y): Precond: On(b, x) ∧ Clear(b) ∧ Clear(y) Effect: On(b, y) ∧ Clear(x)∧ ¬On(b, x)∧¬Clear(y) MOVE-TO-TABLE(b, x): Precond: On(b, x) ∧ Clear(b) Effect: On(b, Table) ∧ Clear(x) ∧ ¬On(b, x)
PDDL: Block-world problem A B C C B A Start State Goal State Initial state: On(C, A) ∧ OnT able(A) ∧ OnT able(B) ∧ Clear(B) ∧ Clear(C) Goal: On(A, B) ∧ On(B, C) ∧ OnT able(C)
Goal stack planning We work backwards from the goal, looking for an operator which has one or more of the goal literals as one of its effects and then trying to satisfy the preconditions of the operator. The preconditions of the operator become sub-goals that must be satisfied. We keep doing this until we reach the initial state. Goal stack planning uses a stack to hold goals and actions to satisfy the goals, and a knowledge base to hold the current state, action schemas and domain axioms Goal stack is like a node in a search tree; if there is a choice of action, we create branches
Goal stack planning pseudo-code Push the original goal on the stack. Repeat until the stack is empty: If stack top is a compound goal, push its unsatisfied sub-goals on the stack. If stack top is a single unsatisfied goal, replace it by an action that makes it satisfied and push the actions precondition on the stack. If stack top is an action, pop it from the stack, execute it and change the knowledge base by the actions effects. If stack top is a satisfied goal, pop it from the stack.
Goal stack planning 1 (We do pushing the sub-goals at the same step as the compound goal.) The order of sub-goals is up to us; we could have put On(B,C) on top On(A,B) On(B,C) OnTable(C) On(A,B) ∧ On(B,C) ∧ OnTable(C) KB = {On(C,A),OnTable(A),OnTable(B),Clear(B), Clear(C)} plan = [ ] The top of the stack is a single unsatisfied goal. We push the action which would achieve it, and its preconditions.
Goal stack planning 2 On(A, Table) Clear(B) Clear(A) On(A, Table) ∧ Clear(A) ∧ Clear(B) MOVE(A, x,B) On(A,B) On(B,C) OnTable(C) On(A,B) ∧ On(B,C) ∧ OnTable(C) KB = {On(C,A),OnTable(A),OnTable(B),Clear(B), Clear(C)} plan = [ ] The top of the stack is a satisfied goal. We pop the stack (twice).
Goal stack planning 3 Clear(A) On(A, Table) ∧ Clear(A) ∧ Clear(B) MOVE(A, Table,B) On(A,B) On(B,C) OnTable(C) On(A,B) ∧ On(B,C) ∧ OnTable(C) KB = {On(C,A),OnTable(A),OnTable(B),Clear(B), Clear(C)} plan = [ ] The top of the stack is an unsatisfied goal. We push the action which would achieve it, and its preconditions.
Goal stack planning 4 On(C,A) Clear(C) On(C,A) ∧ Clear(C) MOVE-TO-TABLE(C,A) Clear(A) On(A, Table) ∧ Clear(A) ∧ Clear(B) MOVE(A, Table,B) On(A,B) On(B,C) OnTable(C) On(A,B) ∧ On(B,C) ∧ OnTable(C) KB = {On(C,A),OnTable(A),OnTable(B),Clear(B), Clear(C)} plan = [ ] The top of the stack is a satisfied goal. We pop the stack (three times).
Goal stack planning 5 MOVE-TO-TABLE(C,A) Clear(A) On(A, Table) ∧ Clear(A) ∧ Clear(B) MOVE(A, Table,B) On(A,B) On(B,C) OnTable(C) On(A,B) ∧ On(B,C) ∧ OnTable(C) KB = {On(C,A),OnTable(A),OnTable(B),Clear(B), Clear(C)} plan = [ ] The top of the stack is an action. We execute it, update the KB with its effects, and add it to the plan.
Goal stack planning 6 Clear(A) On(A, Table) ∧ Clear(A) ∧ Clear(B) MOVE(A, Table,B) On(A,B) On(B,C) OnTable(C) On(A,B) ∧ On(B,C) ∧ OnTable(C) KB = {OnTable(C),OnTable(A),OnTable(B),Clear(A), Clear(B), Clear(C)} plan = [MOVE-TO-TABLE(C,A)] The top of the stack is a satisfied goal. We pop the stack (twice).
Goal stack planning 7 MOVE(A, Table,B) On(A,B) On(B,C) OnTable(C) On(A,B) ∧ On(B,C) ∧ OnTable(C) KB = {OnTable(C),OnTable(A),OnTable(B),Clear(A), Clear(B), Clear(C)} plan = [MOVE-TO-TABLE(C,A)] The top of the stack is an action. We execute it, update the KB with its effects, and add it to the plan.
Goal stack planning 8 On(A,B) On(B,C) OnTable(C) On(A,B) ∧ On(B,C) ∧ OnTable(C) KB = {On(A,B), OnTable(C), OnTable(B), Clear(A), Clear(C)} plan = [MOVE-TO-TABLE(C,A),MOVE(A, Table,B)] the current state is
Goal stack planning 9 If we follow the same process for the On(B,C) goal, we end up in the state On(B,C) ∧ OnTable(A) ∧ OnTable(C) The remaining goal on the stack, On(A,B) ∧ On(B,C) ∧ OnTable(C) is not satisfied. So On(A,B) will be pushed on the stack again.
Goal stack planning 10 Now we finally can move A on top of B, but the resulting plan is redundant: MOVE-TO-TABLE(C,A) MOVE(A, Table,B) MOVE-TO-TABLE(A,B) MOVE(B, Table,C) There are techniques for ‘fixing’ inefficient plans (where something is done and then undone) but it is difficult in general (when it is not straight one after another)