
1 Introduction to Artificial Intelligence – Unit 6 Planning Course 67842 The Hebrew University of Jerusalem School of Engineering and Computer Science Academic Year: 2008/2009 Instructor: Jeff Rosenschein (Chapter 11, “Artificial Intelligence: A Modern Approach”)

2 Outline  The Planning problem  STRIPS  Planning with State-space search  Partial-order planning  Planning graphs  Planning with propositional logic  Analysis of planning approaches 2

3 What is Planning  Generate sequences of actions to perform tasks and achieve objectives  States, actions and goals  Search for solution over abstract space of plans  Classical planning environment: fully observable, deterministic, finite, static and discrete  Assists humans in practical applications  design and manufacturing  military operations  games  space exploration 3

4 What is Means-End Reasoning?  Basic idea is to give an agent:  representation of goal/intention to achieve  representation of actions it can perform  representation of the environment and have it generate a plan to achieve the goal  Essentially, this is automatic programming 4

5 [Diagram: the planner takes as input a goal/intention/task, the state of the environment, and the possible actions, and produces a plan to achieve the goal]

6 Two Overarching Issues  Representation – How do we represent...  goal to be achieved  state of environment  actions available to agent  plan itself  Control (search) strategy – how do we search through:  set of possible states of the environment, or alternatively…  set of possible plans 6

7 The Blocks World  We’ll illustrate the techniques with reference to the blocks world  Contains a robot arm, 3 blocks (A, B, and C) of equal size, and a table-top

8 The Blocks World Ontology  To represent this environment, we need an ontology:
On(x, y): obj x is on top of obj y
OnTable(x): obj x is on the table
Clear(x): nothing is on top of obj x
Holding(x): the arm is holding x

9 The Blocks World  Here is a representation of the blocks world described above: Clear(A) On(A, B) OnTable(B) OnTable(C)  Use the closed world assumption: anything not stated is assumed to be false 9

10 The Blocks World  A goal is represented as a set of formulae  Here is a goal: OnTable(A) ∧ OnTable(B) ∧ OnTable(C)

11 The Blocks World  Actions are represented using a technique that was developed in the STRIPS planner  Each action has:
a name, which may have arguments
a pre-condition list: facts which must be true for the action to be executed
a delete list: facts that are no longer true after the action is performed
an add list: facts made true by executing the action
Each of these may contain variables

12 The Blocks World Operators  Example 1: The stack action occurs when the robot arm places the object x it is holding on top of object y.
Stack(x, y)
pre Clear(y) ∧ Holding(x)
del Clear(y) ∧ Holding(x)
add ArmEmpty ∧ On(x, y)

13 The Blocks World Operators  Example 2: The unstack action occurs when the robot arm picks an object x up from on top of another object y.
UnStack(x, y)
pre On(x, y) ∧ Clear(x) ∧ ArmEmpty
del On(x, y) ∧ ArmEmpty
add Holding(x) ∧ Clear(y)
Stack and UnStack are inverses of one another.

14 The Blocks World Operators  Example 3: The pickup action occurs when the arm picks up an object x from the table.
Pickup(x)
pre Clear(x) ∧ OnTable(x) ∧ ArmEmpty
del OnTable(x) ∧ ArmEmpty
add Holding(x)
 Example 4: The putdown action occurs when the arm places the object x onto the table.
Putdown(x)
pre Holding(x)
del Holding(x)
add Clear(x) ∧ OnTable(x) ∧ ArmEmpty
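The four operators can be sketched as ground STRIPS triples in Python. This is a minimal illustration, not part of the original STRIPS system: the helper names are made up, and ArmEmpty and Clear(C) are added to slide 9's initial state (which omits them) so that UnStack is applicable.

```python
# The four blocks-world operators as (pre, del, add) triples of ground
# literals. Helper names are illustrative; ArmEmpty and Clear(C) are
# assumptions added to the slides' initial state.

def stack_op(x, y):
    return {"pre": {f"Clear({y})", f"Holding({x})"},
            "del": {f"Clear({y})", f"Holding({x})"},
            "add": {"ArmEmpty", f"On({x},{y})"}}

def unstack_op(x, y):
    return {"pre": {f"On({x},{y})", f"Clear({x})", "ArmEmpty"},
            "del": {f"On({x},{y})", "ArmEmpty"},
            "add": {f"Holding({x})", f"Clear({y})"}}

def pickup_op(x):
    return {"pre": {f"Clear({x})", f"OnTable({x})", "ArmEmpty"},
            "del": {f"OnTable({x})", "ArmEmpty"},
            "add": {f"Holding({x})"}}

def putdown_op(x):
    return {"pre": {f"Holding({x})"},
            "del": {f"Holding({x})"},
            "add": {f"Clear({x})", f"OnTable({x})", "ArmEmpty"}}

def apply_op(state, op):
    """STRIPS execution: check preconditions, delete, then add."""
    assert op["pre"] <= state, "preconditions not satisfied"
    return (state - op["del"]) | op["add"]

# Slide 9's state (A on B; B and C on the table), with the arm facts added:
state = {"Clear(A)", "On(A,B)", "OnTable(B)", "OnTable(C)",
         "Clear(C)", "ArmEmpty"}
state = apply_op(state, unstack_op("A", "B"))   # arm lifts A off B
state = apply_op(state, putdown_op("A"))        # A placed on the table
```

After these two actions the state satisfies the goal of slide 10: all three blocks are on the table.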

15 A Plan  What is a plan? A sequence (list) of actions, with variables replaced by constants. [Diagram: a chain of actions a1, a17, a142 leading from initial state I to goal G]

16 The STRIPS approach  The original STRIPS system used a goal stack to control its search  The system has a database and a goal stack, and it focuses attention on solving the top goal (which may involve solving subgoals, which are then pushed onto the stack, etc.) 16

17 The Basic STRIPS Idea  Place goal Goal1 on goal stack  Considering top Goal1, place onto it its subgoals GoalS1-1 and GoalS1-2  Then try to solve subgoal GoalS1-2 (now on top), and continue…

18 Stack Manipulation Rules, STRIPS
If on top of goal stack … then do:
- Compound or single goal matching the current state description → Remove it.
- Compound goal not matching the current state description → 1. Keep original compound goal on stack; 2. List the unsatisfied component goals on the stack in some new order.
- Single-literal goal not matching the current state description → Find rule whose instantiated add-list includes the goal, and: 1. Replace the goal with the instantiated rule; 2. Place the rule’s instantiated precondition formula on top of stack.
- Rule → 1. Remove rule from stack; 2. Update database using rule; 3. Keep track of rule (for solution).
- Nothing → Stop.
“Underspecified”: there are decision branches here within the search tree…

19 Example. Initial state I: C on A, with A and B on the table. Goal G: On(A,C) ∧ On(C,B).
1. Place original goal on stack.
Stack: On(A,C) ∧ On(C,B)
Database: CLEAR(B), ON(C,A), CLEAR(C), ONTABLE(A), ONTABLE(B), HANDEMPTY

20 2. Since top goal is unsatisfied compound goal, list its unsatisfied subgoals on top of it:
Stack (bottom to top): On(A,C) ∧ On(C,B); On(A,C); On(C,B)
Database (unchanged): CLEAR(B), ON(C,A), CLEAR(C), ONTABLE(A), ONTABLE(B), HANDEMPTY

21 3. Since top goal is unsatisfied single-literal goal, find rule whose instantiated add-list includes the goal, and: a. Replace the goal with the instantiated rule; b. Place the rule’s instantiated precondition formula on top of stack
Stack (bottom to top): On(A,C) ∧ On(C,B); On(A,C); stack(C,B); Holding(C) ∧ Clear(B)
Database (unchanged): CLEAR(B), ON(C,A), CLEAR(C), ONTABLE(A), ONTABLE(B), HANDEMPTY

22 4. Since top goal is unsatisfied compound goal, list its subgoals on top of it:
Stack (bottom to top): On(A,C) ∧ On(C,B); On(A,C); stack(C,B); Holding(C) ∧ Clear(B); Holding(C); Clear(B)
Database (unchanged): CLEAR(B), ON(C,A), CLEAR(C), ONTABLE(A), ONTABLE(B), HANDEMPTY

23 5. Single goal on top of stack matches database, so remove it:
Stack (bottom to top): On(A,C) ∧ On(C,B); On(A,C); stack(C,B); Holding(C) ∧ Clear(B); Holding(C)
Database (unchanged): CLEAR(B), ON(C,A), CLEAR(C), ONTABLE(A), ONTABLE(B), HANDEMPTY

24 6. Since top goal is unsatisfied single-literal goal, find rule whose instantiated add-list includes the goal, and: a. Replace the goal with the instantiated rule; b. Place the rule’s instantiated precondition formula on top of stack
Stack (bottom to top): On(A,C) ∧ On(C,B); On(A,C); stack(C,B); Holding(C) ∧ Clear(B); unstack(C,A); On(C,A) ∧ Clear(C) ∧ Handempty
Database (unchanged): CLEAR(B), ON(C,A), CLEAR(C), ONTABLE(A), ONTABLE(B), HANDEMPTY

25 7. Compound goal on top of stack matches database, so remove it:
Stack (bottom to top): On(A,C) ∧ On(C,B); On(A,C); stack(C,B); Holding(C) ∧ Clear(B); unstack(C,A)
Database (unchanged): CLEAR(B), ON(C,A), CLEAR(C), ONTABLE(A), ONTABLE(B), HANDEMPTY

26 8. Top item is rule, so: a. Remove rule from stack; b. Update database using rule; c. Keep track of rule (for solution)
Stack (bottom to top): On(A,C) ∧ On(C,B); On(A,C); stack(C,B); Holding(C) ∧ Clear(B)
Database: CLEAR(B), ONTABLE(A), ONTABLE(B), HOLDING(C), CLEAR(A)
Solution: {unstack(C,A)}

27 9. Compound goal on top of stack matches database, so remove it:
Stack (bottom to top): On(A,C) ∧ On(C,B); On(A,C); stack(C,B)
Database (unchanged): CLEAR(B), ONTABLE(A), ONTABLE(B), HOLDING(C), CLEAR(A)
Solution: {unstack(C,A)}

28 10. Top item is rule, so: a. Remove rule from stack; b. Update database using rule; c. Keep track of rule (for solution)
Stack (bottom to top): On(A,C) ∧ On(C,B); On(A,C)
Database: ONTABLE(A), ONTABLE(B), HANDEMPTY, CLEAR(A), CLEAR(C), ON(C,B)
Solution: {unstack(C,A), stack(C,B)}

29 11. Since top goal is unsatisfied single-literal goal, find rule whose instantiated add-list includes the goal, and: a. Replace the goal with the instantiated rule; b. Place the rule’s instantiated precondition formula on top of stack
Stack (bottom to top): On(A,C) ∧ On(C,B); stack(A,C); Holding(A) ∧ Clear(C)
Database (unchanged): ONTABLE(A), ONTABLE(B), HANDEMPTY, CLEAR(A), CLEAR(C), ON(C,B)
Solution: {unstack(C,A), stack(C,B)}

30 12. Since top goal is unsatisfied compound goal, list its unsatisfied subgoals on top of it:
Stack (bottom to top): On(A,C) ∧ On(C,B); stack(A,C); Holding(A) ∧ Clear(C); Holding(A)
Database (unchanged): ONTABLE(A), ONTABLE(B), HANDEMPTY, CLEAR(A), CLEAR(C), ON(C,B)
Solution: {unstack(C,A), stack(C,B)}

31 13. Since top goal is unsatisfied single-literal goal, find rule whose instantiated add-list includes the goal, and: a. Replace the goal with the instantiated rule; b. Place the rule’s instantiated precondition formula on top of stack
Stack (bottom to top): On(A,C) ∧ On(C,B); stack(A,C); Holding(A) ∧ Clear(C); pickup(A); Ontable(A) ∧ Clear(A) ∧ Handempty
Database (unchanged): ONTABLE(A), ONTABLE(B), HANDEMPTY, CLEAR(A), CLEAR(C), ON(C,B)
Solution: {unstack(C,A), stack(C,B)}

32 14. Compound goal on top of stack matches database, so remove it:
Stack (bottom to top): On(A,C) ∧ On(C,B); stack(A,C); Holding(A) ∧ Clear(C); pickup(A)
Database (unchanged): ONTABLE(A), ONTABLE(B), HANDEMPTY, CLEAR(A), CLEAR(C), ON(C,B)
Solution: {unstack(C,A), stack(C,B)}

33 15. Top item is rule, so: a. Remove rule from stack; b. Update database using rule; c. Keep track of rule (for solution)
Stack (bottom to top): On(A,C) ∧ On(C,B); stack(A,C); Holding(A) ∧ Clear(C)
Database: ONTABLE(B), ON(C,B), CLEAR(C), HOLDING(A)
Solution: {unstack(C,A), stack(C,B), pickup(A)}

34 16. Compound goal on top of stack matches database, so remove it:
Stack (bottom to top): On(A,C) ∧ On(C,B); stack(A,C)
Database (unchanged): ONTABLE(B), ON(C,B), CLEAR(C), HOLDING(A)
Solution: {unstack(C,A), stack(C,B), pickup(A)}

35 17. Top item is rule, so: a. Remove rule from stack; b. Update database using rule; c. Keep track of rule (for solution)
Stack (bottom to top): On(A,C) ∧ On(C,B)
Database: ONTABLE(B), ON(C,B), ON(A,C), CLEAR(A), HANDEMPTY
Solution: {unstack(C,A), stack(C,B), pickup(A), stack(A,C)}

36 18. Compound goal on top of stack matches database, so remove it:
Stack: (empty)
Database (unchanged): ONTABLE(B), ON(C,B), ON(A,C), CLEAR(A), HANDEMPTY
Solution: {unstack(C,A), stack(C,B), pickup(A), stack(A,C)}
19. Stack is empty, so stop.

37 In solving this problem, we took some shortcuts: we branched in the right direction every time. In practice, searching can be guided by:
1. Heuristic information (e.g., try to achieve “HOLDING(x)” last)
2. Detecting unprofitable paths (e.g., when the newest goal set has become a superset of the original goal set)
3. Considering useful operator side effects (by scanning down the stack)

38 Sometimes even this doesn’t help…
I: ON(C,A) ∧ ONTABLE(A) ∧ ONTABLE(B) ∧ ARMEMPTY
G: ON(A,B) ∧ ON(B,C)
It could try ON(B,C) first, then ON(A,B), but it will find that the first goal has been undone. So the first goal will be added back onto the stack and solved.

39 The final sequence is inefficient:
I → PICKUP(B), STACK(B,C), UNSTACK(B,C), PUTDOWN(B), UNSTACK(C,A), PUTDOWN(C), PICKUP(A), STACK(A,B), UNSTACK(A,B), PUTDOWN(A), PICKUP(B), STACK(B,C), PICKUP(A), STACK(A,B) → G
1. Trying the goals in the other order doesn’t help much.
2. We can remove adjacent operators that undo each other.

40 What we really want is to: 1. Begin work on ON(A,B) by clearing A (i.e., putting C on the table) 2. Achieve ON(B,C) by stacking B on C 3. Finish ON(A,B) by stacking A on B. We couldn’t do this using a stack, but we can if we use a set of goals.

41 Short aside: STRIPS cannot solve all goals
Two memory registers X and Y whose initial contents are A and B respectively.
I: Contents(X, A) ∧ Contents(Y, B)
We have one operation, Assign(u, r, t, s):
P: Contents(r, s) ∧ Contents(u, t)
D: Contents(u, t)
A: Contents(u, s)
Our goal state is:
G: Contents(X, B) ∧ Contents(Y, A)
STRIPS cannot defer its solution of either subgoal long enough not to erase the other register’s contents.
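The destructive interaction can be checked mechanically. The sketch below (helper names are illustrative) attacks the first subgoal the way a goal-stack planner would, and shows that the value A is then gone from both registers, leaving Contents(Y, A) unachievable:

```python
# The register-swap domain as STRIPS sets. Assign(u,r,t,s) requires
# Contents(r,s) and Contents(u,t), deletes Contents(u,t), and adds
# Contents(u,s). Helper names are illustrative.
def assign_op(u, r, t, s):
    return {"pre": {f"Contents({r},{s})", f"Contents({u},{t})"},
            "del": {f"Contents({u},{t})"},
            "add": {f"Contents({u},{s})"}}

state = {"Contents(X,A)", "Contents(Y,B)"}

# Attack the first subgoal Contents(X,B), as a goal-stack planner would:
a = assign_op("X", "Y", "A", "B")
assert a["pre"] <= state
state = (state - a["del"]) | a["add"]

# Contents(X,B) now holds, but no register holds A any more,
# so Contents(Y,A) can never be achieved from this state.
```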

42 Beyond STRIPS  STRIPS representation of action was very influential for subsequent Planning research  STRIPS control strategy (search strategy, using a stack) was not influential in subsequent Planning research 42

43 Difficulty of real-world problems  Assume a problem-solving agent using some search method …  Which actions are relevant?  Exhaustive search vs. backward search  What is a good heuristic function?  Good estimate of the cost of the state?  Problem-dependent vs. -independent  How to decompose the problem?  Most real-world problems are nearly decomposable

44 Planning language  What is a good language?  Expressive enough to describe a wide variety of problems  Restrictive enough to allow efficient algorithms to operate on it  Planning algorithm should be able to take advantage of the logical structure of the problem  STRIPS and ADL 44

45 General language features  Representation of states  Decompose the world into logical conditions and represent a state as a conjunction of positive literals  Propositional literals: Poor ∧ Unknown  FO-literals (grounded and function-free): At(Plane1, Melbourne) ∧ At(Plane2, Sydney)  Closed-world assumption  Representation of goals  Partially specified state, represented as a conjunction of positive ground literals  A goal is satisfied if the state contains all literals in the goal

46 General language features  Representations of actions  Action = PRECOND + EFFECT
Action(Fly(p, from, to),
PRECOND: At(p,from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)
EFFECT: ¬At(p,from) ∧ At(p,to))
= action schema (p, from, to need to be instantiated)
 Action name and parameter list  Precondition (conjunction of function-free literals)  Effect (conjunction of function-free literals: P means P becomes true, ¬P means P becomes false)  Add-list vs. delete-list in Effect

47 Language semantics?  How do actions affect states?  An action is applicable in any state that satisfies the precondition  For an FO action schema, applicability involves a substitution θ for the variables in the PRECOND.
At(P1,JFK) ∧ At(P2,SFO) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO)
satisfies At(p,from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)
with θ = {p/P1, from/JFK, to/SFO}
Thus the action is applicable

48 Language semantics?  The result of executing action a in state s is the state s’  s’ is the same as s except  Any positive literal P in the effect of a is added to s’  Any negative literal ¬P is removed from s’
EFFECT: ¬At(p,from) ∧ At(p,to) gives:
At(P1,SFO) ∧ At(P2,SFO) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO)
 STRIPS assumption (avoids the representational frame problem): every literal NOT in the effect remains unchanged
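Instantiation, applicability, and the result function can be sketched together for the Fly schema. The string-placeholder encoding and the `instantiate` helper are assumptions for illustration, not a standard API:

```python
# A sketch of applicability and the STRIPS result function for the
# Fly(p, from, to) schema. Literals are strings with {p}/{f}/{t}
# placeholders; instantiate is an illustrative helper.
def instantiate(schema, sub):
    """Apply substitution sub to every literal of the schema."""
    return {k: {lit.format(**sub) for lit in lits}
            for k, lits in schema.items()}

fly = {"pre": {"At({p},{f})", "Plane({p})", "Airport({f})", "Airport({t})"},
       "del": {"At({p},{f})"},
       "add": {"At({p},{t})"}}

state = {"At(P1,JFK)", "At(P2,SFO)", "Plane(P1)", "Plane(P2)",
         "Airport(JFK)", "Airport(SFO)"}

theta = {"p": "P1", "f": "JFK", "t": "SFO"}        # θ from the slide
action = instantiate(fly, theta)
assert action["pre"] <= state                       # applicable in state
state = (state - action["del"]) | action["add"]     # result of executing
```

After execution, At(P1,JFK) has been deleted and At(P1,SFO) added, matching the state shown above; everything not mentioned in the effect is unchanged.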

49 Expressiveness and extensions  STRIPS is simplified  Important limit: function-free literals  Allows for propositional representation  Function symbols lead to infinitely many states and actions  Extension: Action Description Language (ADL)
Action(Fly(p : Plane, from : Airport, to : Airport),
PRECOND: At(p,from) ∧ (from ≠ to)
EFFECT: ¬At(p,from) ∧ At(p,to))
Standardization: Planning Domain Definition Language (PDDL)

50 Comparison of STRIPS and ADL
STRIPS: Only positive literals in states (Poor ∧ Unknown). ADL: Positive and negative literals in states (¬Rich ∧ ¬Famous).
STRIPS: Closed-world assumption: unmentioned literals are false. ADL: Open-world assumption: unmentioned literals are unknown.
STRIPS: Effect P ∧ ¬Q means add P and delete Q. ADL: Effect P ∧ ¬Q means add P and ¬Q, delete ¬P and Q.
STRIPS: Only ground literals in goals (Rich ∧ Famous). ADL: Quantified variables in goals: ∃x At(P1, x) ∧ At(P2, x) is the goal of having P1 and P2 in the same place.
STRIPS: Goals are conjunctions (Rich ∧ Famous). ADL: Goals allow conjunction and disjunction (¬Poor ∧ (Famous ∨ Smart)).
STRIPS: Effects are conjunctions. ADL: Conditional effects allowed: when P: E means E is an effect only if P is satisfied.
STRIPS: No support for equality. ADL: Equality predicate (x = y) is built in.
STRIPS: No support for types. ADL: Variables can have types, as in (p : Plane).

51 Example: air cargo transport
Init(At(C1, SFO) ∧ At(C2, JFK) ∧ At(P1, SFO) ∧ At(P2, JFK) ∧ Cargo(C1) ∧ Cargo(C2) ∧ Plane(P1) ∧ Plane(P2) ∧ Airport(JFK) ∧ Airport(SFO))
Goal(At(C1, JFK) ∧ At(C2, SFO))
Action(Load(c,p,a)
PRECOND: At(c,a) ∧ At(p,a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a)
EFFECT: ¬At(c,a) ∧ In(c,p))
Action(Unload(c,p,a)
PRECOND: In(c,p) ∧ At(p,a) ∧ Cargo(c) ∧ Plane(p) ∧ Airport(a)
EFFECT: At(c,a) ∧ ¬In(c,p))
Action(Fly(p,from,to)
PRECOND: At(p,from) ∧ Plane(p) ∧ Airport(from) ∧ Airport(to)
EFFECT: ¬At(p,from) ∧ At(p,to))
A solution: [Load(C1,P1,SFO), Fly(P1,SFO,JFK), Unload(C1,P1,JFK), Load(C2,P2,JFK), Fly(P2,JFK,SFO), Unload(C2,P2,SFO)]
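A plan for this problem can be validated by simulating it forward; note the Unload steps are needed at the end of each leg, since Fly leaves the cargo In the plane rather than At the airport. In this sketch the static Cargo/Plane/Airport preconditions are dropped for brevity and the helper names are illustrative:

```python
# Simulate an air-cargo plan forward, checking each action's precondition.
# Each action is a (pre, del, add) triple; static type facts omitted.
def load(c, p, a):
    return ({f"At({c},{a})", f"At({p},{a})"},
            {f"At({c},{a})"}, {f"In({c},{p})"})

def unload(c, p, a):
    return ({f"In({c},{p})", f"At({p},{a})"},
            {f"In({c},{p})"}, {f"At({c},{a})"})

def fly(p, src, dst):
    return ({f"At({p},{src})"}, {f"At({p},{src})"}, {f"At({p},{dst})"})

state = {"At(C1,SFO)", "At(C2,JFK)", "At(P1,SFO)", "At(P2,JFK)"}
plan = [load("C1", "P1", "SFO"), fly("P1", "SFO", "JFK"),
        unload("C1", "P1", "JFK"), load("C2", "P2", "JFK"),
        fly("P2", "JFK", "SFO"), unload("C2", "P2", "SFO")]
for pre, dele, add in plan:
    assert pre <= state              # action applicable in current state
    state = (state - dele) | add     # STRIPS result
assert {"At(C1,JFK)", "At(C2,SFO)"} <= state   # goal reached
```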

52 Example: Spare tire problem
Init(At(Flat, Axle) ∧ At(Spare, Trunk))
Goal(At(Spare, Axle))
Action(Remove(Spare, Trunk)
PRECOND: At(Spare, Trunk)
EFFECT: ¬At(Spare, Trunk) ∧ At(Spare, Ground))
Action(Remove(Flat, Axle)
PRECOND: At(Flat, Axle)
EFFECT: ¬At(Flat, Axle) ∧ At(Flat, Ground))
Action(PutOn(Spare, Axle)
PRECOND: At(Spare, Ground) ∧ ¬At(Flat, Axle)
EFFECT: At(Spare, Axle) ∧ ¬At(Spare, Ground))
Action(LeaveOvernight
PRECOND:
EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Axle) ∧ ¬At(Spare, Trunk) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Axle))
This example goes beyond STRIPS: negative literal in a pre-condition (ADL description)

53 Example: Blocks world
Init(On(A, Table) ∧ On(B, Table) ∧ On(C, Table) ∧ Block(A) ∧ Block(B) ∧ Block(C) ∧ Clear(A) ∧ Clear(B) ∧ Clear(C))
Goal(On(A,B) ∧ On(B,C))
Action(Move(b,x,y)
PRECOND: On(b,x) ∧ Clear(b) ∧ Clear(y) ∧ Block(b) ∧ (b ≠ x) ∧ (b ≠ y) ∧ (x ≠ y)
EFFECT: On(b,y) ∧ Clear(x) ∧ ¬On(b,x) ∧ ¬Clear(y))
Action(MoveToTable(b,x)
PRECOND: On(b,x) ∧ Clear(b) ∧ Block(b) ∧ (b ≠ x)
EFFECT: On(b,Table) ∧ Clear(x) ∧ ¬On(b,x))
Spurious actions are possible: Move(B,C,C)

54 Planning with state-space search  Both forward and backward search possible  Progression planners  forward state-space search  Consider the effect of all possible actions in a given state  Regression planners  backward state-space search  To achieve a goal, what must have been true in the previous state 54

55 Regression  When we regress a goal through an operator, we are trying to determine what must be true before the operator is performed in order for the goal to be satisfied afterward 55

56 Progression and regression 56

57 Progression algorithm  Formulation as state-space search problem:  Initial state = initial state of the planning problem  Literals not appearing are false  Actions = those whose preconditions are satisfied  Add positive effects, delete negative  Goal test = does the state satisfy the goal  Step cost = each action costs 1  No functions … any graph search that is complete is a complete planning algorithm.  E.g. A*  Inefficient:  (1) irrelevant action problem  (2) good heuristic required for efficient search 57
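The formulation above can be sketched as a breadth-first progression planner on the spare-tire domain from slide 52. This is a minimal illustration under one assumption: each precondition is split into positive and negative sets so that the single ADL-style negative precondition, ¬At(Flat,Axle), can be checked under the closed-world assumption.

```python
from collections import deque

# Spare-tire actions as (positive pre, negative pre, del, add) tuples.
ACTIONS = {
    "Remove(Spare,Trunk)": ({"At(Spare,Trunk)"}, set(),
                            {"At(Spare,Trunk)"}, {"At(Spare,Ground)"}),
    "Remove(Flat,Axle)":   ({"At(Flat,Axle)"}, set(),
                            {"At(Flat,Axle)"}, {"At(Flat,Ground)"}),
    "PutOn(Spare,Axle)":   ({"At(Spare,Ground)"}, {"At(Flat,Axle)"},
                            {"At(Spare,Ground)"}, {"At(Spare,Axle)"}),
    "LeaveOvernight":      (set(), set(),
                            {"At(Spare,Ground)", "At(Spare,Axle)",
                             "At(Spare,Trunk)", "At(Flat,Ground)",
                             "At(Flat,Axle)"}, set()),
}

def progression_plan(init, goal):
    """Breadth-first forward search over states; returns a shortest plan."""
    frontier = deque([(frozenset(init), [])])
    seen = {frozenset(init)}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                       # goal test
            return plan
        for name, (pos, neg, dele, add) in ACTIONS.items():
            if pos <= state and not (neg & state):   # applicable?
                nxt = frozenset((state - dele) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None

plan = progression_plan({"At(Flat,Axle)", "At(Spare,Trunk)"},
                        {"At(Spare,Axle)"})
```

Breadth-first search here plays the role of "any complete graph search"; the shortest plan has three steps and must end with PutOn(Spare,Axle), the only action that achieves the goal literal.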

58 Regression algorithm  How to determine predecessors?  What are the states from which applying a given action leads to the goal?
Goal state = At(C1, B) ∧ At(C2, B) ∧ … ∧ At(C20, B)
Relevant action for first conjunct: Unload(C1,p,B)
Works only if pre-conditions are satisfied.
Previous state = In(C1, p) ∧ At(p, B) ∧ At(C2, B) ∧ … ∧ At(C20, B)
Subgoal At(C1,B) should not be present in this state.
 Actions must not undo desired literals (consistent)  Main advantage: only relevant actions are considered.  Often much lower branching factor than forward search

59 Regression Algorithm  General process for predecessor construction  Given a goal description G  Let A be an action that is relevant and consistent  The predecessor is constructed as follows:  Any positive effects of A that appear in G are deleted.  Each precondition literal of A is added, unless it already appears.  Any standard search algorithm can be used to perform the search.  Termination when predecessor satisfied by initial state.  In the FO case, satisfaction might require a substitution.
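The predecessor construction above is a few lines of set algebra. The sketch below uses a simplified ground instance of Unload (static type preconditions omitted; helper names illustrative) and the cargo goal from the previous slide:

```python
# Regress a goal (set of ground literals) through an action given as
# (pre, add, del) sets. Returns None if the action is irrelevant or
# inconsistent with the goal.
def regress(goal, pre, add, dele):
    if not (add & goal):
        return None                   # not relevant: achieves nothing in G
    if dele & goal:
        return None                   # not consistent: undoes a desired literal
    return (goal - add) | pre         # delete achieved effects, add preconds

goal = {"At(C1,B)", "At(C2,B)"}
# Simplified ground Unload(C1,P1,B): pre, add, del
unload = ({"In(C1,P1)", "At(P1,B)"}, {"At(C1,B)"}, {"In(C1,P1)"})
prev = regress(goal, *unload)
```

As the slide notes, the subgoal At(C1,B) no longer appears in the regressed state; only the action's preconditions and the untouched goal literals remain.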

60 Heuristics for state-space search  Neither progression nor regression is very efficient without a good heuristic.  How many actions are needed to achieve the goal?  Exact solution is NP-hard; find a good estimate  Two approaches to finding an admissible heuristic:  The optimal solution to the relaxed problem.  Remove all preconditions from actions (heuristic roughly equal to number of unsatisfied goals)  The subgoal independence assumption: The cost of solving a conjunction of subgoals is approximated by the sum of the costs of solving the subproblems independently.
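Both estimates reduce to small set computations. In this sketch, state and goal are sets of ground literals, and `level_cost` is a hypothetical per-literal cost table (for instance, level costs read off a planning graph):

```python
# Two heuristic estimates sketched from the text; names are illustrative.
def h_no_preconditions(state, goal):
    # Relaxed problem with all preconditions removed: roughly one action
    # per unsatisfied goal literal.
    return len(goal - state)

def h_subgoal_independence(state, goal, level_cost):
    # Sum the cost of solving each unsatisfied subgoal on its own;
    # can over- or under-estimate when subplans interact or share actions.
    return sum(level_cost[g] for g in goal - state)

s = {"At(C1,SFO)", "At(C2,JFK)"}
g = {"At(C1,JFK)", "At(C2,SFO)"}
h1 = h_no_preconditions(s, g)
h2 = h_subgoal_independence(s, g, {"At(C1,JFK)": 2, "At(C2,SFO)": 2})
```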

61 Partial-order planning  Progression and regression planning are totally ordered plan search forms.  They cannot take advantage of problem decomposition  Decisions must be made on how to sequence actions on all the subproblems  Least commitment strategy:  Delay choice during search 61

62 Shoe example
Goal(RightShoeOn ∧ LeftShoeOn)
Init()
Action(RightShoe, PRECOND: RightSockOn EFFECT: RightShoeOn)
Action(RightSock, PRECOND: EFFECT: RightSockOn)
Action(LeftShoe, PRECOND: LeftSockOn EFFECT: LeftShoeOn)
Action(LeftSock, PRECOND: EFFECT: LeftSockOn)
Planner: combine two action sequences (1) leftsock, leftshoe (2) rightsock, rightshoe

63 Partial-order planning (POP)  Any planning algorithm that can place two actions into a plan without specifying which comes first is a PO planner 63

64 POP as a search problem  States are (mostly unfinished) plans.  The empty plan contains only start and finish actions  Note that we are searching through the space of plans, not the space of world states 64

65 POP as a search problem  Each plan has 4 components: 1.A set of actions (steps of the plan) 2.A set of ordering constraints: A < B (A before B)  Cycles represent contradictions 3.A set of causal links  “A achieves p for B”; “p is an effect of the A action and a precondition of the B action”  The plan may not be extended by adding a new action C that conflicts with the causal link (if the effect of C is ¬p and if C could come after A and before B) 4.A set of open preconditions  Preconditions not achieved by any action in the plan 65

66 Example of final plan  Actions = {Rightsock, Rightshoe, Leftsock, Leftshoe, Start, Finish}  Orderings = {Rightsock < Rightshoe; Leftsock < Leftshoe}  Links = {Rightsock→Rightsockon→Rightshoe, Leftsock→Leftsockon→Leftshoe, Rightshoe→Rightshoeon→Finish, …}  Open preconditions = {}

67 POP as a search problem  A plan is consistent iff there are no cycles in the ordering constraints and no conflicts with the causal links  A consistent plan with no open preconditions is a solution  A partial order plan is executed by repeatedly choosing any of the possible next actions  This flexibility is a benefit in non-cooperative environments 67
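The consistency test (no ordering cycles, no action threatening a causal link) can be sketched directly over the plan components. The encoding below is an assumption for illustration: orderings are (A, B) pairs meaning A < B, links are (A, p, B) triples, effects map each action to its literal set with a '¬' prefix for negative effects, and the shoe-example names follow the previous slide.

```python
from itertools import product

# A sketch of partial-order plan consistency checking.
def transitive_closure(actions, orderings):
    """Smallest transitive relation containing the ordering constraints."""
    before = set(orderings)
    changed = True
    while changed:
        changed = False
        for a, b, c in product(actions, repeat=3):
            if (a, b) in before and (b, c) in before and (a, c) not in before:
                before.add((a, c))
                changed = True
    return before

def consistent(actions, orderings, links, effects):
    before = transitive_closure(actions, orderings)
    if any((a, a) in before for a in actions):
        return False                  # cycle in the ordering constraints
    for a, p, b in links:
        for c in actions:
            if c in (a, b) or "¬" + p not in effects.get(c, set()):
                continue
            # c has effect ¬p: a threat unless forced outside [a, b]
            if (c, a) not in before and (b, c) not in before:
                return False
    return True

# The shoe plan from the earlier slide is consistent:
shoe_acts = ["Start", "RightSock", "RightShoe", "LeftSock", "LeftShoe", "Finish"]
shoe_ords = [("Start", "RightSock"), ("RightSock", "RightShoe"),
             ("RightShoe", "Finish"), ("Start", "LeftSock"),
             ("LeftSock", "LeftShoe"), ("LeftShoe", "Finish")]
shoe_links = [("RightSock", "RightSockOn", "RightShoe"),
              ("LeftSock", "LeftSockOn", "LeftShoe")]
shoe_effects = {"RightSock": {"RightSockOn"}, "LeftSock": {"LeftSockOn"}}
```

Adding an unordered action with effect ¬RightSockOn, or a cyclic pair of ordering constraints, makes `consistent` return False.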

68 Solving POP  Assume propositional planning problems:  The initial plan contains Start and Finish, the ordering constraint Start < Finish, no causal links, all the preconditions in Finish are open.  Successor function :  picks one open precondition p on an action B and  generates a successor plan for every possible consistent way of choosing action A that achieves p.  Test goal 68

69 Enforcing consistency  When generating a successor plan:  The causal link A→p→B and the ordering constraint A < B are added to the plan  If A is new, also add Start < A, A < B, and A < Finish to the plan  Resolve conflicts between the new causal link and all existing actions, e.g., C (put in ordering constraints that make C occur outside the “protection interval”)  Resolve conflicts between action A (if new) and all existing causal links

70 Process summary  Operators on partial plans  Add link from existing plan to open precondition  Add a step to fulfill an open condition  Order one step w.r.t. another to remove possible conflicts  Gradually move from incomplete/vague plans to complete/correct plans  Backtrack if an open condition is unachievable or if a conflict is irresolvable 70

71 Example: Spare tire problem
Init(At(Flat, Axle) ∧ At(Spare, Trunk))
Goal(At(Spare, Axle))
Action(Remove(Spare, Trunk)
PRECOND: At(Spare, Trunk)
EFFECT: ¬At(Spare, Trunk) ∧ At(Spare, Ground))
Action(Remove(Flat, Axle)
PRECOND: At(Flat, Axle)
EFFECT: ¬At(Flat, Axle) ∧ At(Flat, Ground))
Action(PutOn(Spare, Axle)
PRECOND: At(Spare, Ground) ∧ ¬At(Flat, Axle)
EFFECT: At(Spare, Axle) ∧ ¬At(Spare, Ground))
Action(LeaveOvernight
PRECOND:
EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Axle) ∧ ¬At(Spare, Trunk) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Axle))

72 Solving the problem  Initial plan: Start with EFFECTS and Finish with PRECOND 72

73 Solving the problem  Initial plan: Start with EFFECTS and Finish with PRECOND  Pick an open precondition: At(Spare, Axle)  Only PutOn(Spare, Axle) is applicable  Add causal link:  Add constraint : PutOn(Spare, Axle) < Finish 73

74 Solving the problem  Pick an open precondition: At(Spare, Ground)  Only Remove(Spare, Trunk) is applicable  Add causal link:  Add constraint : Remove(Spare, Trunk) < PutOn(Spare,Axle) 74

75 Solving the problem  Pick an open precondition: ¬At(Flat, Axle)  LeaveOverNight is applicable  conflict: LeaveOverNight also has the effect ¬ At(Spare,Ground)  To resolve, add constraint : LeaveOverNight < Remove(Spare, Trunk) 75

76 Solving the problem  Pick an open precondition: At(Spare, Trunk)  Only Start is applicable  Add causal link:  Conflict: the causal link is threatened by the effect ¬At(Spare, Trunk) of LeaveOverNight  No re-ordering solution possible, since we can’t put LeaveOvernight before Start  Backtrack

77 Solving the problem  Remove LeaveOverNight and its causal links  Repeat step with Remove(Flat, Axle) satisfying ¬At(Flat, Axle)  Choose precondition of Remove(Spare, Trunk), satisfied by Start, then precondition of Remove(Flat, Axle), satisfied by Start --- finished

78 Some details …  What happens when a first-order representation that includes variables is used?  Complicates the process of detecting and resolving conflicts  Can be resolved by introducing inequality constraints (“there will only be a conflict if z = B”)  CSP’s most-constrained-variable heuristic can be used by planning algorithms to select a PRECOND (select the open precondition that can be satisfied in the fewest number of ways)

79 Planning graphs  Used to achieve better heuristic estimates  A solution can also be directly extracted using GRAPHPLAN  Consists of a sequence of levels that correspond to time steps in the plan  Level 0 is the initial state  Each level consists of a set of literals and a set of actions  Literals = all those that could be true at that time step, depending upon the actions executed at the preceding time step  Actions = all those actions that could have their preconditions satisfied at that time step, depending on which of the literals actually hold 79

80 Planning graphs  “Could”?  Records only a restricted subset of possible negative interactions among actions  They work only for propositional problems  Example:
Init(Have(Cake))
Goal(Have(Cake) ∧ Eaten(Cake))
Action(Eat(Cake), PRECOND: Have(Cake) EFFECT: ¬Have(Cake) ∧ Eaten(Cake))
Action(Bake(Cake), PRECOND: ¬Have(Cake) EFFECT: Have(Cake))

81 Cake example  Start at level S0 and determine action level A0 and next level S1.  A0 contains all actions whose preconditions are satisfied in the previous level  Connect preconditions and effects of actions S0 --> S1  Inaction is represented by persistence actions (small squares)  Level A0 contains the actions that could occur  Conflicts between actions are represented by mutex links, in gray

82 Cake example  Level S1 contains all literals that could result from picking any subset of actions in A0  Conflicts between literals that cannot occur together (as a consequence of the selection action) are represented by mutex links  S1 defines multiple states and the mutex links are the constraints that define this set of states  Continue until two consecutive levels are identical: leveled off  E.g., contain the same number of literals 82
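Ignoring mutex links, the level-expansion loop for the cake example can be sketched as follows. This is a relaxed illustration: the '¬' string-prefix encoding of negated literals and the helper name are assumptions, and persistence actions appear only implicitly, as carrying every literal forward.

```python
# Relaxed planning-graph expansion for the cake domain (no mutex links).
ACTIONS = {
    "Eat(Cake)":  ({"Have(Cake)"}, {"¬Have(Cake)", "Eaten(Cake)"}),
    "Bake(Cake)": ({"¬Have(Cake)"}, {"Have(Cake)"}),
}

def expand(level):
    """One literal level -> (applicable actions, next literal level)."""
    acts = [a for a, (pre, _) in ACTIONS.items() if pre <= level]
    nxt = set(level)                 # persistence actions carry literals on
    for a in acts:
        nxt |= ACTIONS[a][1]         # add each applicable action's effects
    return acts, nxt

S0 = {"Have(Cake)", "¬Eaten(Cake)"}  # initial level (with CWA literal)
A0, S1 = expand(S0)                  # only Eat(Cake) is applicable here
A1, S2 = expand(S1)                  # now Bake(Cake) is applicable too
```

S2 equals S1, so the graph has leveled off after one expansion, matching the description above.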

83 Cake example  A mutex relation holds between two actions when:  Inconsistent effects: one action negates the effect of another (Eat(Cake) and the persistence of Have(Cake) in A0 have inconsistent effects, disagreeing on Have(Cake))  Interference: one of the effects of one action is the negation of a precondition of the other (Eat(Cake) negates the precondition of the persistence of Have(Cake))  Competing needs: one of the preconditions of one action is mutually exclusive with a precondition of the other (Bake(Cake) and Eat(Cake) compete on the value of the Have(Cake) precondition)  A mutex relation holds between two literals when (inconsistent support):  one is the negation of the other, OR  each possible action pair that could achieve the literals is mutex (e.g., Have(Cake) and Eaten(Cake) in S1)
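The three action-mutex tests are direct literal comparisons. In this sketch (an illustrative encoding, not GRAPHPLAN's internal one), each action is a (preconditions, effects) pair of literal strings with '¬' marking negation, and persistence actions are encoded as no-ops:

```python
# The three action-mutex tests over (preconditions, effects) pairs.
def neg(lit):
    return lit[1:] if lit.startswith("¬") else "¬" + lit

def actions_mutex(a1, a2):
    pre1, eff1 = a1
    pre2, eff2 = a2
    if any(neg(e) in eff2 for e in eff1):
        return True      # inconsistent effects
    if any(neg(e) in pre2 for e in eff1) or any(neg(e) in pre1 for e in eff2):
        return True      # interference
    if any(neg(p) in pre2 for p in pre1):
        return True      # competing needs
    return False

eat = ({"Have(Cake)"}, {"¬Have(Cake)", "Eaten(Cake)"})
bake = ({"¬Have(Cake)"}, {"Have(Cake)"})
persist_have = ({"Have(Cake)"}, {"Have(Cake)"})             # no-op action
persist_not_eaten = ({"¬Eaten(Cake)"}, {"¬Eaten(Cake)"})    # no-op action
```

Eat(Cake) is mutex with the persistence of Have(Cake) (their effects disagree on Have(Cake)), and with Bake(Cake); the two persistence actions shown are not mutex with each other.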

84 PG and heuristic estimation  PG’s provide information about the problem  A literal that does not appear in the final level of the graph cannot be achieved by any plan  Useful for backward search (cost = infinity)  Level of appearance can be used as cost estimate of achieving any goal literals = level cost  Small problem: several actions can occur at one level, but heuristic counts levels, not actions  Restrict to one action at any given time step using serial PG (add mutex links between every pair of actions, except persistence actions) 84

85 PG and heuristic estimation  Cost of a conjunction of goals?  Max-level (admissible, but inaccurate),  level-sum (inadmissible, but works well in practice), and  set-level (the level at which all conjuncts appear without any pair being mutually exclusive) heuristics
PG is a relaxed problem (if a literal does not appear, it can’t be achieved at that level, but if it does appear, maybe it can be achieved at that level).

86 The GRAPHPLAN Algorithm  Extract a solution directly from the PG  Two main steps, which alternate within a loop:  Check whether all goal literals are present in the current level with no mutex links between any pair of them; if so, try and extract a solution  Otherwise, expand graph by adding actions and state literals for the next level  Process continues until either solution is found or it is found that no solution exists 86

87 The GRAPHPLAN Algorithm
function GRAPHPLAN(problem) returns solution or failure
  graph ← INITIAL-PLANNING-GRAPH(problem)
  goals ← GOALS[problem]
  loop do
    if goals all non-mutex in last level of graph then do
      solution ← EXTRACT-SOLUTION(graph, goals, LENGTH(graph))
      if solution ≠ failure then return solution
      else if NO-SOLUTION-POSSIBLE(graph) then return failure
    graph ← EXPAND-GRAPH(graph, problem)

88 Example: Spare tire problem (repeated for reference)
Init(At(Flat, Axle) ∧ At(Spare, Trunk))
Goal(At(Spare, Axle))
Action(Remove(Spare, Trunk)
PRECOND: At(Spare, Trunk)
EFFECT: ¬At(Spare, Trunk) ∧ At(Spare, Ground))
Action(Remove(Flat, Axle)
PRECOND: At(Flat, Axle)
EFFECT: ¬At(Flat, Axle) ∧ At(Flat, Ground))
Action(PutOn(Spare, Axle)
PRECOND: At(Spare, Ground) ∧ ¬At(Flat, Axle)
EFFECT: At(Spare, Axle) ∧ ¬At(Spare, Ground))
Action(LeaveOvernight
PRECOND:
EFFECT: ¬At(Spare, Ground) ∧ ¬At(Spare, Axle) ∧ ¬At(Spare, Trunk) ∧ ¬At(Flat, Ground) ∧ ¬At(Flat, Axle))

89 GRAPHPLAN example  Initially the plan consists of 5 literals: 2 from the initial state and 3 CWA literals (S0)  Add actions whose preconditions are satisfied by EXPAND-GRAPH (A0)  Also add persistence actions and mutex relations  Add the effects at level S1  Repeat until goal is in level Si

90 GRAPHPLAN example  EXPAND-GRAPH also looks for mutex relations  Inconsistent effects  E.g., Remove(Spare, Trunk) and LeaveOverNight due to At(Spare, Ground) and ¬At(Spare, Ground)  Interference  E.g., Remove(Flat, Axle) and LeaveOverNight: At(Flat, Axle) as PRECOND and ¬At(Flat, Axle) as EFFECT  Competing needs  E.g., PutOn(Spare, Axle) and Remove(Flat, Axle) due to At(Flat, Axle) and ¬At(Flat, Axle)  Inconsistent support  E.g., in S2, At(Spare, Axle) and At(Flat, Axle)

91 GRAPHPLAN example  In S 2, the goal literals exist and are not mutex with any other  A solution might exist, so EXTRACT-SOLUTION tries to find it  EXTRACT-SOLUTION can use a Boolean CSP formulation (variables are the actions at each level, values are in or out of the plan) or a backward search process:  Initial state = last level of the PG and the goals of the planning problem  Actions = select any set of non-conflicting actions that cover the goals in the state  Goal = reach level S 0 such that all goals are satisfied  Cost = 1 for each action 91

92 GRAPHPLAN example  Termination? YES  PGs grow monotonically:  Literals increase monotonically (persistence actions keep them going…)  Actions increase monotonically (since their [literal] preconditions keep appearing)  Mutexes decrease monotonically (proof in Russell & Norvig)  Because of these properties and because there is a finite number of actions and literals, every PG will eventually level off! 92

93 Planning with propositional logic  Planning can be done by proving a theorem in situation calculus.  Here: test the satisfiability of a logical sentence:  The sentence contains propositions for every action occurrence.  A model will assign true to the actions that are part of the correct plan and false to the others  An assignment that corresponds to an incorrect plan will not be a model, because of inconsistency with the assertion that the goal is true.  If the planning problem is unsolvable, the sentence will be unsatisfiable. 93

94 SATPLAN algorithm
function SATPLAN(problem, T max ) returns solution or failure
  inputs: problem, a planning problem
          T max, an upper limit to the plan length
  for T = 0 to T max do
    cnf, mapping ← TRANSLATE-TO-SAT(problem, T)
    assignment ← SAT-SOLVER(cnf)
    if assignment is not null then
      return EXTRACT-SOLUTION(assignment, mapping)
  return failure
94
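SAT-SOLVER can be any satisfiability procedure. For intuition, here is a toy brute-force stand-in (the clause representation and names are our own), workable only for the tiny CNFs a short SATPLAN horizon produces; real systems use DPLL/CDCL solvers.

```python
from itertools import product

# Toy stand-in for SAT-SOLVER: brute-force enumeration over all assignments.
def brute_force_sat(variables, clauses):
    # clauses: list of clauses, each a list of (variable, polarity) literals
    for values in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(any(model[v] == pol for v, pol in c) for c in clauses):
            return model
    return None

# Tiny illustration: the goal At(P1,JFK)^1 holds, and it can only be made
# true by flying P1 from SFO at time 0 (a crude stand-in for an axiom).
variables = ["Fly(P1,SFO,JFK)^0", "At(P1,JFK)^1"]
clauses = [
    [("At(P1,JFK)^1", True)],                                  # goal
    [("At(P1,JFK)^1", False), ("Fly(P1,SFO,JFK)^0", True)],    # At^1 implies Fly^0
]
model = brute_force_sat(variables, clauses)
print(model)  # {'Fly(P1,SFO,JFK)^0': True, 'At(P1,JFK)^1': True}
```

Enumeration is exponential in the number of variables, which is exactly why the size of the SATPLAN encoding (one proposition per fluent and per action, per time step) matters in practice.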

95 cnf, mapping ← TRANSLATE-TO-SAT(problem, T)  Distinct propositions for assertions about each time step  Superscripts denote the time step: At(P1,SFO) 0 ∧ At(P2,JFK) 0  No CWA, thus specify which propositions are not true: ¬At(P1,JFK) 0 ∧ ¬At(P2,SFO) 0  Unknown propositions are left unspecified  The goal is associated with a particular time step  But which one? 95

96 cnf, mapping ← TRANSLATE-TO-SAT(problem, T)  How to determine the time step where the goal will be reached?  Start at T=0  Assert At(P1,JFK) 0 ∧ At(P2,SFO) 0  Failure… Try T=1  Assert At(P1,JFK) 1 ∧ At(P2,SFO) 1  …  Repeat until some minimal plan length succeeds  Termination is ensured by T max 96

97 cnf, mapping ← TRANSLATE-TO-SAT(problem, T)  How to encode actions into PL?  Propositional versions of successor-state axioms: At(P1,JFK) 1 ⇔ (At(P1,JFK) 0 ∧ ¬(Fly(P1,JFK,SFO) 0 ∧ At(P1,JFK) 0 )) ∨ (Fly(P1,SFO,JFK) 0 ∧ At(P1,SFO) 0 )  Such an axiom is required for each plane, airport and time step  If more airports add other ways to travel, then additional disjuncts are required  Once all these axioms are in place, the satisfiability algorithm can start to find a plan. 97
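The axiom can be sanity-checked by evaluating it as a Boolean function; the function and argument names below are our own shorthand for the time-0 fluents and actions.

```python
# Propositional successor-state axiom for At(P1,JFK)^1, as a Boolean function.
def at_p1_jfk_1(at_jfk0, at_sfo0, fly_jfk_sfo0, fly_sfo_jfk0):
    stayed = at_jfk0 and not (fly_jfk_sfo0 and at_jfk0)   # was there and did not leave
    arrived = fly_sfo_jfk0 and at_sfo0                    # flew in from SFO
    return stayed or arrived

# P1 at SFO at time 0 and flies to JFK: the axiom makes At(P1,JFK)^1 true
print(at_p1_jfk_1(False, True, False, True))   # True
# P1 at JFK at time 0 and flies away: false at time 1
print(at_p1_jfk_1(True, False, True, False))   # False
```

This is the "true afterwards iff made true, or already true and not made false" pattern; every fluent gets one such axiom per time step.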

98 assignment ← SAT-SOLVER(cnf)  Multiple models can be found  They are NOT all satisfactory: (for T=1) Fly(P1,SFO,JFK) 0 ∧ Fly(P1,JFK,SFO) 0 ∧ Fly(P2,JFK,SFO) 0  The second action is infeasible (P1 is not at JFK at time 0), yet the assignment IS a model of the sentence  Avoiding illegal actions: precondition axioms Fly(P1,SFO,JFK) 0 ⇒ At(P1,SFO) 0  Exactly one model now satisfies all the axioms where the goal is achieved at T=1. 98

99 assignment ← SAT-SOLVER(cnf)  A plane can still fly to two destinations at once  Such models are NOT satisfactory: (for T=1) Fly(P1,SFO,JFK) 0 ∧ Fly(P2,JFK,SFO) 0 ∧ Fly(P2,JFK,LAX) 0  P2 flies to two airports simultaneously, yet the assignment is a model of the sentence  Avoid spurious solutions: action-exclusion axioms ¬(Fly(P2,JFK,SFO) 0 ∧ Fly(P2,JFK,LAX) 0 ) prevent simultaneous actions  Lost flexibility, since the plan becomes totally ordered: no actions are allowed to occur at the same time  Better: restrict exclusion to conflicting actions (i.e., two actions cannot occur simultaneously if one negates a precondition or effect of the other) 99

100 Planning in the Real World (Chapter 12, “Artificial Intelligence: A Modern Approach”)

101 Outline  Time, schedules and resources  Hierarchical task network planning  Non-deterministic domains  Conditional planning  Execution monitoring and replanning  Continuous planning  Multi-agent planning 101

102 Time, schedules and resources  Until now:  what actions to do  Real-world:  + actions occur at certain moments in time.  + actions have a beginning and an end.  + actions take a certain amount of time.  Job-shop scheduling:  Complete a set of jobs, each of which consists of a sequence of actions,  Where each action has a given duration and might require resources.  Determine a schedule that minimizes the total time required to complete all jobs (respecting resource constraints). 102

103 Car construction example
Init(Chassis(C1) ∧ Chassis(C2) ∧ Engine(E1,C1,30) ∧ Engine(E2,C2,60) ∧ Wheels(W1,C1,30) ∧ Wheels(W2,C2,15))
Goal(Done(C1) ∧ Done(C2))
Action(AddEngine(e,c,d)
  PRECOND: Engine(e,c,d) ∧ Chassis(c) ∧ ¬EngineIn(c)
  EFFECT: EngineIn(c) ∧ Duration(d))
Action(AddWheels(w,c)
  PRECOND: Wheels(w,c,d) ∧ Chassis(c)
  EFFECT: WheelsOn(c) ∧ Duration(d))
Action(Inspect(c)
  PRECOND: EngineIn(c) ∧ WheelsOn(c) ∧ Chassis(c)
  EFFECT: Done(c) ∧ Duration(10))
103

104 Solution found by POP (figure: the schedule, with the critical path highlighted; the actions off the critical path have a slack of 15) 104

105 Planning vs. scheduling  How does the problem differ from a standard planning problem?  When does an action start and when does it end?  So in addition to ordering (planning), duration is also considered Duration(d)  Critical path method is used to determine start and end times:  Path = linear sequence from start to end  Critical path = path with longest total duration  Determines the duration of the entire plan  Critical path should be executed without delay 105

106 ES and LS  Earliest possible (ES) and latest possible (LS) start times  LS − ES = slack of an action  Computing these for all actions determines the schedule for the entire problem:
ES(Start) = 0
ES(B) = max A<B [ES(A) + Duration(A)]
LS(Finish) = ES(Finish)
LS(A) = min A<B [LS(B)] − Duration(A)
 Complexity is O(Nb) (given a partial order), where N is the number of actions and b is the maximum branching factor into or out of an action 106
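These recurrences can be run directly on the car example; the sketch below (with our own action shorthand, e.g. AddEngine1 for AddEngine(E1,C1)) recovers the 85-minute makespan and the slack of 15 on C1's chain shown on the POP-solution slide.

```python
# Critical-path computation (ES/LS/slack) for the car construction example.
durations = {
    "AddEngine1": 30, "AddEngine2": 60,
    "AddWheels1": 30, "AddWheels2": 15,
    "Inspect1": 10, "Inspect2": 10,
}
# A < B ordering pairs (partial order): engine, then wheels, then inspection
orderings = [
    ("AddEngine1", "AddWheels1"), ("AddWheels1", "Inspect1"),
    ("AddEngine2", "AddWheels2"), ("AddWheels2", "Inspect2"),
]

def schedule(durations, orderings):
    preds = {a: [] for a in durations}
    succs = {a: [] for a in durations}
    for a, b in orderings:
        preds[b].append(a)
        succs[a].append(b)
    # ES(B) = max over A < B of ES(A) + Duration(A), in topological order
    es, order = {}, []
    remaining = {a: len(preds[a]) for a in durations}
    frontier = [a for a in durations if remaining[a] == 0]
    while frontier:
        a = frontier.pop()
        es[a] = max((es[p] + durations[p] for p in preds[a]), default=0)
        order.append(a)
        for b in succs[a]:
            remaining[b] -= 1
            if remaining[b] == 0:
                frontier.append(b)
    finish = max(es[a] + durations[a] for a in durations)
    # LS(A) = min over A < B of LS(B), minus Duration(A)
    ls = {}
    for a in reversed(order):
        ls[a] = min((ls[b] for b in succs[a]), default=finish) - durations[a]
    slack = {a: ls[a] - es[a] for a in durations}
    return es, ls, slack, finish

es, ls, slack, finish = schedule(durations, orderings)
print(finish, slack["Inspect1"])  # 85 15
```

The zero-slack actions (AddEngine2, AddWheels2, Inspect2) form the critical path; any delay on them delays the whole plan.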

107 Scheduling with resources  Resource constraints = materials or objects required to perform a task  Reusable resources  A resource that is occupied during an action but becomes available when the action finishes  Requires an extension of the action syntax: Resource: R(k)  k units of the resource are required by the action  A prerequisite before the action can be performed  The k units cannot be used by others while the action runs 107

108 Car example with resources
Init(Chassis(C1) ∧ Chassis(C2) ∧ Engine(E1,C1,30) ∧ Engine(E2,C2,60) ∧ Wheels(W1,C1,30) ∧ Wheels(W2,C2,15) ∧ EngineHoists(1) ∧ WheelStations(1) ∧ Inspectors(2))
Goal(Done(C1) ∧ Done(C2))
Action(AddEngine(e,c,d)
  PRECOND: Engine(e,c,d) ∧ Chassis(c) ∧ ¬EngineIn(c)
  EFFECT: EngineIn(c) ∧ Duration(d)
  RESOURCE: EngineHoists(1))
Action(AddWheels(w,c)
  PRECOND: Wheels(w,c,d) ∧ Chassis(c)
  EFFECT: WheelsOn(c) ∧ Duration(d)
  RESOURCE: WheelStations(1))
Action(Inspect(c)
  PRECOND: EngineIn(c) ∧ WheelsOn(c) ∧ Chassis(c)
  EFFECT: Done(c) ∧ Duration(10)
  RESOURCE: Inspectors(1))
(aggregation: Inspectors(2) groups the two inspectors into one quantity)
108

109 Car example with resources 109

110 Scheduling with resources  Aggregation = group individual objects into quantities when the objects are indistinguishable with respect to their purpose  Reduces complexity  Resource constraints make scheduling problems more complicated  Additional interactions among actions  Heuristic: minimum slack algorithm  Select an action with all predecessors scheduled and with the least slack for the earliest possible start 110

111 Hierarchical task network planning  Reduce complexity  hierarchical decomposition  At each level of the hierarchy a computational task is reduced to a small number of activities at the next lower level  The computational cost of arranging these activities is low  Hierarchical task network (HTN) planning uses a refinement of actions through decomposition  e.g., building a house = getting a permit + hiring a contractor + doing the construction + paying the contractor  Refined until only primitive actions remain  Pure and hybrid HTN planning 111

112 Representation decomposition  General descriptions are stored in a plan library  Each method = Decompose(a, d): a = action, d = PO plan  See buildhouse example (below)  The Start action supplies all preconditions of actions not supplied by other actions = external preconditions  The Finish action has all effects of actions not present in other actions = external effects  Primary effects (used to achieve the goal) vs. secondary effects 112

113 Buildhouse example External precond External effects 113

114 Buildhouse example
Action(BuyLand, PRECOND: Money, EFFECT: Land ∧ ¬Money)
Action(GetLoan, PRECOND: GoodCredit, EFFECT: Money ∧ Mortgage)
Action(BuildHouse, PRECOND: Land, EFFECT: House)
Action(GetPermit, PRECOND: Land, EFFECT: Permit)
Action(HireBuilder, EFFECT: Contract)
Action(Construction, PRECOND: Permit ∧ Contract, EFFECT: HouseBuilt ∧ ¬Permit)
Action(PayBuilder, PRECOND: Money ∧ HouseBuilt, EFFECT: ¬Money ∧ House ∧ ¬Contract)
Decompose(BuildHouse, Plan:
  STEPS: {S1: GetPermit, S2: HireBuilder, S3: Construction, S4: PayBuilder}
  ORDERINGS: {Start < S1 < S3 < S4 < Finish, Start < S2 < S3}
  LINKS: …)
114

115 Properties of decomposition  Should be a correct implementation of action a  Correct if plan d is a complete and consistent PO plan for the problem of achieving the effects of a given the preconditions of a  A decomposition is not necessarily unique  Performs information hiding:  The STRIPS action description of the higher-level action hides some preconditions and effects  Ignores all internal effects of the decomposition  Does not specify the intervals inside the activity during which preconditions and effects must hold  Information hiding is essential to HTN planning 115

116 Recapitulation of POP (1)  Assume propositional planning problems:  The initial plan contains Start and Finish, the ordering constraint Start < Finish, no causal links, all the preconditions in Finish are open.  Successor function:  picks one open precondition p on an action B and  generates a successor plan for every possible consistent way of choosing action A that achieves p.  Test goal 116

117 Recapitulation of POP (2)  When generating a successor plan:  The causal link A --p--> B and the ordering constraint A < B are added to the plan  If A is new, also add Start < A and A < Finish to the plan  Resolve conflicts between the new causal link and all existing actions  Resolve conflicts between action A (if new) and all existing causal links 117

118 Adapting POP to HTN planning  Remember POP?  Modify the successor function: apply decomposition to the current plan  NEW successor function:  Select a non-primitive action a’ in P  For any Decompose(a, d) method in the library such that a and a’ unify with substitution θ  Replace a’ with d’ = SUBST(θ, d) 118

119 POP+HTN example a’ 119

120 POP+HTN example a’ d 120

121 How to hook up d’ in place of a’?  Remove action a’ from P and replace it with d’  For each step s in d’, select an action that will play the role of s (either a new s or an existing s’ from P)  Possibility of subtask sharing  Connect the ordering constraints for a’ to the steps in d’  Put all constraints so that constraints of the form B < a’ are maintained  Watch out for too-strict orderings!  Connect the causal links  If B --p--> a’ is a causal link in P, replace it by a set of causal links from B to all steps in d’ with precondition p that were supplied by the start step  Similarly for a’ --p--> C 121

122 What about HTN?  Additional modifications to POP are necessary  BAD news: pure HTN planning is undecidable due to recursive decomposition actions  Walk = take one step, then Walk…  Resolve the problem by  Ruling out recursion  Bounding the length of relevant solutions  Hybridizing HTN with POP  Yet HTN can be efficient (see motivations in the book) 122

123 The Gift of the Magi 123

124 Non-deterministic domains  So far: fully observable, static and deterministic domains.  Agent can plan first and then execute plan with eyes closed  Uncertain environment: incomplete (partially observable and/or nondeterministic) and incorrect (differences between world and model) information  Use percepts  Adapt plan when necessary  Degree of uncertainty defined by indeterminacy  Bounded: actions can have unpredictable effects, yet can be listed in action description axioms  Unbounded: preconditions and effects unknown or too large to enumerate 124

125 Handling indeterminacy  Sensorless planning (conformant planning)  Find plan that achieves goal in all possible circumstances (regardless of initial state and action effects)  Conditional planning (contingency planning)  Construct conditional plan with different branches for possible contingencies  Execution monitoring and replanning  While constructing plan judge whether plan requires revision  Continuous planning  Planning active for a life time: adapt to changed circumstances and reformulate goals if necessary 125

126 Sensorless planning 126

127 Abstract example  Initial state and goal state (shown pictorially on the slide)  Sensorless planning (conformant planning)  Open any can of paint and apply it to both chair and table  Conditional planning (contingency planning)  Sense the color of table and chair; if they are the same then finish; else sense the paint label; if color(label) = color(furniture) then apply that color to the other piece; else apply a color to both  Execution monitoring and replanning  Same as conditional, and can fix errors (missed spots)  Continuous planning  Can revise the goal, e.g., when we want to first eat before painting the table and the chair 127

128 Conditional planning  Deal with uncertainty by checking the environment to see what is really happening  Used in fully observable and nondeterministic environments:  The outcome of an action is unknown  Conditional steps will check the state of the environment  How to construct a conditional plan? 128

129 Example, the vacuum-world (fully observable, but here deterministic) 129

130 Conditional planning  Actions: Left, Right, Suck  Propositions to define states: AtL, AtR, CleanL, CleanR  How to include indeterminism?  Actions can have more than one effect  E.g., moving left sometimes fails: Action(Left, PRECOND: AtR, EFFECT: AtL) becomes Action(Left, PRECOND: AtR, EFFECT: AtL ∨ AtR)  Actions can have conditional effects (the vacuum cleaner sometimes dumps dirt when moved): Action(Left, PRECOND: AtR, EFFECT: AtL ∨ (AtL ∧ when CleanL: ¬CleanL))  Both disjunctive and conditional (“when CleanL” refers to before the action is taken, “¬CleanL” refers to after the action) 130

131 Conditional planning  Conditional plans require conditional steps:  if test then plan_A else plan_B, e.g., if AtL ∧ CleanL then Right else Suck  Plans become trees  Games against nature:  Find conditional plans that work regardless of which action outcomes actually occur  Assume vacuum-world, initial state = AtR ∧ CleanL ∧ CleanR  Would be easy, except that we have a Double Murphy vacuum cleaner: possibility of depositing dirt when moving to the clean other square, and possibility of depositing dirt when the action is Suck in a clean square 131

132 Game tree (figure: state nodes and chance nodes) [Left, if AtL ∧ CleanL ∧ CleanR then [ ] else Suck] 132

133 Solution of games against Nature  A solution is a subtree that  Has a goal node at every leaf  Specifies one action at each of its state nodes  Includes every outcome branch at each of the chance nodes  In the previous example: [Left, if AtL ∧ CleanL ∧ CleanR then [ ] else Suck]  For exact solutions: use the minimax algorithm with 2 modifications:  Max and Min nodes become OR and AND nodes  The algorithm returns a conditional plan instead of a single move  At an OR node, the plan is the action selected  At an AND node, the plan is a nested series of if-then-else steps (or “cases”) specifying subplans for each outcome (with tests being complete state descriptions) 133

134 And-Or-search algorithm (recursive, depth-first, on graph)
function AND-OR-GRAPH-SEARCH(problem) returns a conditional plan or failure
  return OR-SEARCH(INITIAL-STATE[problem], problem, [ ])

function OR-SEARCH(state, problem, path) returns a conditional plan or failure
  if GOAL-TEST[problem](state) then return the empty plan
  if state is on path then return failure
  for each action, state_set in SUCCESSORS[problem](state) do
    plan ← AND-SEARCH(state_set, problem, [state | path])
    if plan ≠ failure then return [action | plan]
  return failure

function AND-SEARCH(state_set, problem, path) returns a conditional plan or failure
  for each s i in state_set do
    plan i ← OR-SEARCH(s i, problem, path)
    if plan i = failure then return failure
  return [if s 1 then plan 1 else if s 2 then plan 2 else … if s n-1 then plan n-1 else plan n ]

“Discards” repeated states in the graph, like the “Loop” state in the vacuum cleaner game tree
134
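The pseudocode above translates almost line for line into Python. The sketch below runs it on a small "double Murphy" vacuum world; the state encoding (loc, dirtL, dirtR), the action order, and the exact Murphy rules are our modeling choices.

```python
# AND-OR graph search on a "double Murphy" vacuum world: moving may dump
# dirt in the destination, and Suck in a clean square may deposit dirt.
def goal(s):
    _loc, dirt_l, dirt_r = s
    return not dirt_l and not dirt_r

def results(s, a):
    loc, dl, dr = s
    if a == "Left":
        out = {("L", dl, dr)}
        if not dl:                       # Murphy: may dump dirt on arrival
            out.add(("L", True, dr))
        return out
    if a == "Right":
        out = {("R", dl, dr)}
        if not dr:
            out.add(("R", dl, True))
        return out
    if loc == "L":                       # Suck: reliable on dirt, risky on clean
        return {("L", False, dr)} if dl else {("L", True, dr), ("L", False, dr)}
    return {("R", dl, False)} if dr else {("R", dl, True), ("R", dl, False)}

def or_search(state, path):
    if goal(state):
        return []                        # the empty plan
    if state in path:                    # repeated state on this path: fail
        return "failure"
    for action in ("Suck", "Left", "Right"):
        plan = and_search(results(state, action), [state] + path)
        if plan != "failure":
            return [action, plan]
    return "failure"

def and_search(states, path):
    subplans = {}                        # one subplan per possible outcome
    for s in states:
        plan = or_search(s, path)
        if plan == "failure":
            return "failure"
        subplans[s] = plan
    return subplans

plan = or_search(("R", True, True), [])  # AtR, both squares dirty
print(plan[0])  # Suck
```

The returned plan is a nested structure: an action followed by a dictionary mapping each possible outcome state to its subplan, which mirrors the if-then-else cases of the pseudocode.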

135 And-Or-search algorithm  How does it deal with cycles?  When a state already on the path appears, return failure  If there is a non-cyclic solution, it is reachable from the earlier incarnation of the state  Ensures algorithm termination in every finite state space, since every path must reach a goal, a dead end, or a repeated state  But the algorithm does not check whether some state is already on some other path from the root 135

136 And-Or-search algorithm  Sometimes only a cyclic solution exists  E.g., triple Murphy: sometimes the move is not performed  [Left, if CleanL then [ ] else Suck] is not a solution  Our And-Or-search algorithm would fail  Modification: use labels to repeat parts of the plan (but now there may be infinite loops)  [L1: Left, if AtR then L1 else if CleanL then [ ] else Suck]  Search graph for the “triple Murphy” world, where Left doesn’t always move the vacuum cleaner 136

137 Conditional Planning and partially observable environments  Fully observable: conditional tests can ask any question and get an answer  Partially observable?  The agent has limited information about the environment  Modeled by a state-set = belief states  E.g., assume a vacuum agent which cannot sense the presence or absence of dirt in squares other than the one it’s on  + alternate double Murphy: dirt can be left behind when the agent leaves a clean square  Solution in a fully observable world: keep moving left and right, sucking dirt whenever it appears, until both squares are clean and I’m in the left square  But “both squares are clean” can’t be tested now… 137

138 Partially Observable: alternate double Murphy  This relates back to our possible-worlds formalization of knowledge; e.g., at the top, the agent might be in two possible worlds, because all it knows is AtR and CleanR 138

139 Belief states  Three choices for representation  Sets of full state descriptions: {(AtR ∧ CleanR ∧ CleanL), (AtR ∧ CleanR ∧ ¬CleanL)}  Logical sentences that capture the set of possible worlds in the belief state (Open World Assumption, unknown propositions are left unspecified): AtR ∧ CleanR  Knowledge propositions describing the agent’s knowledge (Closed World Assumption, unknown knowledge propositions are assumed false): K(AtR) ∧ K(CleanR) 139

140 Belief states  Choices 2 and 3 are roughly equivalent (let’s continue with 3)  Symbols can appear in three ways: positive, negative or unknown: 3^n possible belief states for n proposition symbols  YET, the set of belief states is the powerset (set of all subsets) of the physical states: 2^n physical states, so 2^(2^n) belief states, which is much larger than 3^n  Hence representation 3 is restricted:  Any scheme capable of representing every possible belief state will require O(2^n) bits to represent each one in the worst case  The current scheme only requires O(n), trading expressiveness for compactness 140
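The gap between 3^n and 2^(2^n) is easy to check numerically, e.g., for n = 3 proposition symbols:

```python
# Counting check for the belief-state argument, with n = 3 symbols.
n = 3
three_valued = 3 ** n                  # each symbol: positive, negative, or unknown
physical_states = 2 ** n               # truth assignments over n symbols
belief_states = 2 ** physical_states   # every subset of physical states
print(three_valued, physical_states, belief_states)  # 27 8 256
```

Already at n = 3 the three-valued scheme can name only 27 of the 256 belief states, and the gap grows doubly exponentially with n.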

141 Sensing in Conditional Planning  How does it work?  Automatic sensing At every time step the agent gets all available percepts  Active sensing Percepts are obtained through the execution of specific sensory actions. checkDirt and checkLocation  Given the representation and the sensing, action descriptions for checkDirt and checkLocation can now be formulated (see Russell and Norvig) 141

142 Monitoring and replanning  Execution monitoring: check whether everything is going as planned  Unbounded indeterminacy: some unanticipated circumstances will arise  A necessity in realistic environments  Kinds of monitoring:  Action monitoring: verify whether the next action will work  Plan monitoring: verify the entire remaining plan 142

143 Monitoring and replanning  When something unexpected happens: replan  To avoid too much time on planning try to repair the old plan  Can be applied in both fully and partially observable environments, and to a variety of planning representations 143

144 Replanning-agent
function REPLANNING-AGENT(percept) returns an action
  static: KB, a knowledge base (+ action descriptions)
          plan, a plan, initially [ ]
          whole_plan, a plan, initially [ ]
          goal, a goal
  TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
  current ← STATE-DESCRIPTION(KB, t)
  if plan = [ ] then
    whole_plan ← plan ← PLANNER(current, goal, KB)
  if PRECONDITIONS(FIRST(plan)) not currently true in KB then
    candidates ← SORT(whole_plan, ordered by distance to current)
    find state s in candidates such that
      failure ≠ repair ← PLANNER(current, s, KB)
    continuation ← the tail of whole_plan starting at s
    whole_plan ← plan ← APPEND(repair, continuation)
  return POP(plan)
144

145 Repair example Moving along whole-plan from S to E, agent discovers it is in state O; creates repair plan to get back to some state P on the original whole-plan; the new plan is the concatenation of repair with continuation (resumption of original whole-plan) 145

146 Repair example: painting
Init(Color(Chair,Blue) ∧ Color(Table,Green) ∧ ContainsColor(BC,Blue) ∧ PaintCan(BC) ∧ ContainsColor(RC,Red) ∧ PaintCan(RC))
Goal(Color(Chair,x) ∧ Color(Table,x))
Action(Paint(object, color)
  PRECOND: HavePaint(color)
  EFFECT: Color(object, color))
Action(Open(can)
  PRECOND: PaintCan(can) ∧ ContainsColor(can, color)
  EFFECT: HavePaint(color))
Original plan: [Start; Open(BC); Paint(Table,Blue); Finish]
146

147 Repair example: painting  Suppose the agent executes the plan and then perceives that the colors of table and chair are different  Figure out the point in whole_plan to aim for: the current state is identical to the precondition before Paint  Repair action sequence to get there: Repair = [ ] and plan = [Paint; Finish]  Continue performing this new plan  Will loop until table and chair are perceived as the same  Action monitoring can lead to less intelligent behavior  Assume red is selected and there is not enough paint to apply to both chair and table; failure is declared only after painting the chair  Improved by doing plan monitoring 147

148 Plan monitoring  Check the preconditions for success of the entire plan  (Except those that are achieved by another step in the plan)  Execution of doomed plan is cut off earlier  Limitation of replanning agent:  It cannot formulate new goals or accept new goals in addition to the current one 148

149 Continuous planning  Agent persists indefinitely in an environment  Phases of goal formulation, planning and acting  Execution monitoring + planner as one continuous process  Example: Blocks world  Assume a fully observable environment  Assume partially ordered plan 149

150 Blocks world example  Initial state (a)  Action(Move(x,y), PRECOND: Clear(x) ∧ Clear(y) ∧ On(x,z), EFFECT: On(x,y) ∧ Clear(z) ∧ ¬On(x,z) ∧ ¬Clear(y))  The agent first needs to formulate a goal: On(C,D) ∧ On(D,B)  The plan is created incrementally; return NoOp after each increment (as the plan is being developed) and check percepts again 150

151 Blocks world example  Assume that percepts don’t change and this plan is constructed  Ordering constraint between Move(D,B) and Move(C,D)  “Start” is always the label of the current state during planning  Before the agent can execute the plan, nature intervenes: D is moved onto B 151

152 Blocks world example  Start now contains On(D,B)  The agent perceives: Clear(B) and On(D,G) are no longer true  Update the model of the current state (Start)  The causal links from Start to Move(D,B) (Clear(B) and On(D,G)) are no longer valid  Remove those causal relations; two PRECONDs of Move(D,B) are now open  Replace the action and its causal links to Finish by connecting Start to Finish 152

153 Blocks world example  Extending: whenever a causal link can be supplied by a previous step  All redundant steps (Move(D,B) and its causal links) are removed from the plan  Execute the new plan, perform action Move(C,D)  This removes the step from the plan  Extending causal link 153

154 Blocks world example  Execute the new plan, perform action Move(C,D)  Assume the agent is clumsy and drops C on A  No plan left, but still an open PRECOND  Determine a new plan for the open condition  Again Move(C,D)  The plan would look similar to the previous slide, but with new bindings for the preconditions, since On(C,A) 154

155 Blocks world example  Similar to Partial Order Planning (POP)  On each iteration, find a plan flaw and fix it  POP deals with open preconditions and causal conflicts  Possible Continuous Planning flaws: missing goal, open precondition, causal conflict, unsupported link, redundant action, unexecuted action, unnecessary historical goal 155

156 Multi-agent planning  So far we’ve only discussed single-agent environments  Other agents can simply be added to the model of the world:  Poor performance, since agents are not indifferent to other agents’ intentions  In general two types of multi-agent environments:  Cooperative  Competitive 156

157 Cooperation: Joint goals and plans  Multi-planning problem: assume a doubles tennis example where agents want to return the ball
Agents(A,B)
Init(At(A,[Left,Baseline]) ∧ At(B,[Right,Net]) ∧ Approaching(Ball,[Right,Baseline]) ∧ Partner(A,B) ∧ Partner(B,A))
Goal(Returned(Ball) ∧ At(agent,[x,Net]))
Action(Hit(agent, Ball)
  PRECOND: Approaching(Ball,[x,y]) ∧ At(agent,[x,y]) ∧ Partner(agent, partner) ∧ ¬At(partner,[x,y])
  EFFECT: Returned(Ball))
Action(Go(agent,[x,y])
  PRECOND: At(agent,[a,b])
  EFFECT: At(agent,[x,y]) ∧ ¬At(agent,[a,b]))
157

158 Cooperation: Joint goals and plans  A solution is a joint plan consisting of actions for both agents  Example: A: [Go(A,[Right,Baseline]), Hit(A,Ball)] B: [NoOp(B), NoOp(B)] or A: [Go(A,[Left,Net]), NoOp(A)] B: [Go(B,[Right,Baseline]), Hit(B,Ball)]  Coordination is required so both agents settle on the same joint plan 158

159 Multi-body planning  Planning problem faced by a single centralized agent that can dictate actions to each of several physical entities  Hence not truly multi-agent  Important: synchronization of actions  Assume for simplicity that every action takes one time step, and that at each point in the joint plan the actions are performed simultaneously  Planning can be performed using POP applied to the set of all possible joint actions  Size of this set: e.g., 10 actions and 5 agents give 10^5 joint actions  Difficult to specify so many actions, inefficient to plan with them 159

160 Multi-body planning  Alternative to the set of all joint actions: add extra concurrency lines to the action description (actions that must or must not be executed concurrently)  Action with prohibited concurrency:
Action(Hit(A, Ball)
  CONCURRENT: ¬Hit(B,Ball)
  PRECOND: Approaching(Ball,[x,y]) ∧ At(A,[x,y])
  EFFECT: Returned(Ball))
 Action with required concurrency (carrying an object by two agents):
Action(Carry(A, cooler, here, there)
  CONCURRENT: Carry(B, cooler, here, there)
  PRECOND: …)
 Planner similar to POP with some small changes in possible ordering relations (e.g., requiring concurrent actions in the partial order) 160

161 Coordination mechanisms  To ensure agreement on joint plan: use conventions  Convention = a constraint on the selection of joint plans (beyond the constraint that the joint plan must work if the agents adopt it) E.g., stick to your court or one player stays at the net  Conventions that are widely adopted = social laws, e.g., language, driving on right side of street  Can be domain-specific or independent  Could arise through evolutionary process (flocking behavior) 161

162 Flocking example  Three rules of “boids”:  Separation: steer away from neighbors when you get too close  Cohesion: steer toward the average position of neighbors  Alignment: steer toward the average orientation (heading) of neighbors  The flock exhibits the emergent behavior of flying as a pseudo-rigid body 162
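The three rules can be sketched as steering-vector computations; all function names, the separation radius, and the 2-D point encoding below are illustrative choices, not from the slide.

```python
# Steering-vector sketch of the three boid rules for 2-D point agents.
def average(vectors):
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)

def cohesion(pos, neighbor_positions):
    # steer toward the average position of neighbors
    cx, cy = average(neighbor_positions)
    return (cx - pos[0], cy - pos[1])

def alignment(vel, neighbor_velocities):
    # steer toward the average velocity (heading) of neighbors
    ax, ay = average(neighbor_velocities)
    return (ax - vel[0], ay - vel[1])

def separation(pos, neighbor_positions, radius=1.0):
    # steer away from neighbors closer than `radius`
    sx = sy = 0.0
    for nx, ny in neighbor_positions:
        dx, dy = pos[0] - nx, pos[1] - ny
        if (dx * dx + dy * dy) ** 0.5 < radius:
            sx += dx
            sy += dy
    return (sx, sy)

print(cohesion((0, 0), [(2, 0), (0, 2)]))  # (1.0, 1.0)
```

A full boid update would combine the three vectors with weights and clamp the resulting speed; the flock-level behavior emerges without any central planner, which is the point of the example.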

163 Boids Animation http://www.red3d.com/cwr/boids/ 163

164 Coordination mechanisms  In the absence of conventions: Communication E.g., “Mine!” or “Yours!” in tennis example  The burden of arriving at a successful joint plan can be placed on  Agent designer (agents are reactive, no explicit models of other agents)  Agent (agents are deliberative, model of other agents required) 164

165 Competitive environments  Agents can have conflicting utilities, e.g., zero-sum games like chess  The agent must:  Recognize that there are other agents  Compute some of the other agents’ plans  Compute how the other agents’ plans interact with its own plan  Decide on the best action in view of these interactions  A model of the other agent is required  YET, no commitment to a joint action plan  Game-theoretic notions are useful 165

