Section 10 Mid-term Review II. Topics Brew the coffee! Three operators: 1. load(x) precond:coffee(x), loaded(none) effects:loaded(x), ¬loaded(none) 2.


1 Section 10 Mid-term Review II

2 Topics

3 Brew the coffee!
Three operators:
1. load(x)
   precond: coffee(x), loaded(none)
   effects: loaded(x), ¬loaded(none)
2. brew(x)
   precond: loaded(x), ¬loaded(none), ¬loaded(waste)
   effects: ¬loaded(x), loaded(waste), pot(x)
3. unload(x)
   precond: loaded(x), ¬loaded(none)
   effects: ¬loaded(x), loaded(none)
Constants: two types of coffee (caf, decaf), plus waste and none
Initial state: coffee(caf), coffee(decaf), loaded(none)
Goal state: pot(caf), pot(decaf)
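The three operators can be exercised directly. The following is a minimal sketch, assuming a closed-world state (an atom missing from the state counts as false, so ¬loaded(waste) holds initially). The `Action` tuple, the helpers `apply_action` and `run`, and the five-step plan are illustrative and worked out by hand; they are not taken from the slides.

```python
from typing import NamedTuple

class Action(NamedTuple):
    name: str
    pos_pre: frozenset   # atoms that must be true
    neg_pre: frozenset   # atoms that must be false (closed-world assumption)
    add: frozenset       # atoms made true
    delete: frozenset    # atoms made false

def load(x):
    return Action(f"load({x})",
                  frozenset({f"coffee({x})", "loaded(none)"}), frozenset(),
                  frozenset({f"loaded({x})"}), frozenset({"loaded(none)"}))

def brew(x):
    return Action(f"brew({x})",
                  frozenset({f"loaded({x})"}),
                  frozenset({"loaded(none)", "loaded(waste)"}),
                  frozenset({"loaded(waste)", f"pot({x})"}),
                  frozenset({f"loaded({x})"}))

def unload(x):
    return Action(f"unload({x})",
                  frozenset({f"loaded({x})"}), frozenset({"loaded(none)"}),
                  frozenset({"loaded(none)"}), frozenset({f"loaded({x})"}))

def apply_action(state, action):
    """Return the successor state, or raise if a precondition fails."""
    if not action.pos_pre <= state or action.neg_pre & state:
        raise ValueError(f"{action.name} is not applicable")
    return (state - action.delete) | action.add

def run(plan, state):
    for action in plan:
        state = apply_action(state, action)
    return state

INITIAL = frozenset({"coffee(caf)", "coffee(decaf)", "loaded(none)"})
GOAL = frozenset({"pot(caf)", "pot(decaf)"})
# Hand-written candidate plan (an assumption, not from the slides).
PLAN = [load("caf"), brew("caf"), unload("waste"), load("decaf"), brew("decaf")]

print(GOAL <= run(PLAN, INITIAL))
```

In this encoding, brew leaves loaded(waste) in the machine, so the waste has to be unloaded before the second coffee can be loaded.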

4 Graphplan! (Problem 1)
Graphplan works only for propositional planning problems!
Core elements:
– Expand-Graph
– Keep track of mutex actions and propositions
– Extract-Solution

5 Propositionalize the PDDL
Eliminate variables by replacing them with constant symbols
Example of propositionalized fluents:
   loadedCaf: loaded(caf)
Example of propositionalized actions:
   brewCaf: brew(caf)
   precond: loaded(caf), ¬loaded(none), ¬loaded(waste)
   effects: ¬loaded(caf), loaded(waste), pot(caf)
Propositionalized initial state: coffeeCaf, coffeeDecaf, loadedNone
Propositionalized goal state: potCaf, potDecaf
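Propositionalizing is mechanical substitution, which can be sketched in a few lines. This is a minimal illustration, assuming each literal is stored as a hypothetical `(is_positive, predicate, argument)` triple with `"x"` as the only schema variable; the helper names are made up for this example.

```python
# Schema for brew(x); "x" is the variable to be replaced by a constant.
BREW = {
    "precond": [(True, "loaded", "x"), (False, "loaded", "none"),
                (False, "loaded", "waste")],
    "effects": [(False, "loaded", "x"), (True, "loaded", "waste"),
                (True, "pot", "x")],
}

def ground_name(symbol, arg):
    """Build a propositional symbol, e.g. ('loaded', 'caf') -> 'loadedCaf'."""
    return symbol + arg.capitalize()

def ground_literal(literal, constant):
    positive, predicate, arg = literal
    name = ground_name(predicate, constant if arg == "x" else arg)
    return name if positive else "¬" + name

def ground_action(op_name, schema, constant):
    return {
        "name": ground_name(op_name, constant),
        "precond": [ground_literal(lit, constant) for lit in schema["precond"]],
        "effects": [ground_literal(lit, constant) for lit in schema["effects"]],
    }

brew_caf = ground_action("brew", BREW, "caf")
print(brew_caf["name"], brew_caf["precond"], brew_caf["effects"])
```

Calling `ground_action` once per coffee type yields brewCaf and brewDecaf; the load and unload schemas would be grounded the same way.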

6 Expand the Graph
[Figure: planning graph with levels P 0, A 1, P 1]
P 0: coffeeCaf, coffeeDecaf, loadedNone
A 1: loadCaf, loadDecaf, plus persistence actions for coffeeCaf, coffeeDecaf, loadedNone
P 1: coffeeCaf, loadedCaf, coffeeDecaf, loadedDecaf, loadedNone, ¬loadedNone

7 Keep track of the Mutex
Mutex actions: two actions are mutex if
– They are not independent:
   – Action A deletes Action B’s precondition
   – Action A deletes Action B’s positive effect
– Any pair of their preconditions is mutex
Mutex propositions: two propositions are mutex if
– All pairs of their producers are mutex

8 Mutex Actions and Propositions
Mutex actions in A 1 (loadedNone here names the persistence action that carries loadedNone forward):
– (loadCaf, loadDecaf)
– (loadCaf, loadedNone)
– (loadDecaf, loadedNone)
Mutex propositions in P 1:
– (loadedCaf, loadedDecaf)
– (loadedCaf, loadedNone)
– (loadedDecaf, loadedNone)
– (loadedNone, ¬loadedNone)
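The mutex pairs for the first level can be reproduced mechanically. The sketch below assumes each action is a dictionary of precondition and effect literals, with negative effects written as `"¬p"` and persistence actions named `noop(p)`; these names are illustrative. It covers only the "not independent" rule, since no propositions are mutex at P 0 and the precondition rule therefore adds nothing at this level.

```python
from itertools import combinations

def neg(literal):
    """Return the complementary literal."""
    return literal[1:] if literal.startswith("¬") else "¬" + literal

def noop(p):
    """Persistence action that carries proposition p to the next level."""
    return {"pre": {p}, "eff": {p}}

A1 = {
    "loadCaf": {"pre": {"coffeeCaf", "loadedNone"},
                "eff": {"loadedCaf", "¬loadedNone"}},
    "loadDecaf": {"pre": {"coffeeDecaf", "loadedNone"},
                  "eff": {"loadedDecaf", "¬loadedNone"}},
    "noop(coffeeCaf)": noop("coffeeCaf"),
    "noop(coffeeDecaf)": noop("coffeeDecaf"),
    "noop(loadedNone)": noop("loadedNone"),
}

def interferes(a, b):
    """True if an effect of a negates a precondition or an effect of b."""
    return any(neg(e) in b["pre"] | b["eff"] for e in a["eff"])

def action_mutexes(actions):
    return {frozenset((m, n)) for m, n in combinations(actions, 2)
            if interferes(actions[m], actions[n])
            or interferes(actions[n], actions[m])}

def proposition_mutexes(actions, amutex):
    producers = {}
    for name, action in actions.items():
        for effect in action["eff"]:
            producers.setdefault(effect, set()).add(name)
    # Two propositions are mutex if every pair of their producers is mutex.
    return {frozenset((p, q)) for p, q in combinations(producers, 2)
            if all(frozenset((m, n)) in amutex
                   for m in producers[p] for n in producers[q])}

amutex = action_mutexes(A1)
pmutex = proposition_mutexes(A1, amutex)
print(len(amutex), len(pmutex))
```

A proposition pair that shares a producer, such as loadedCaf and ¬loadedNone (both produced by loadCaf), is never mutex under this rule, because an action is not mutex with itself.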

9 Continue Expand the Graph
[Figure: planning graph extended by one more level]
P 0: coffeeCaf, coffeeDecaf, loadedNone
A 1: loadCaf, loadDecaf, plus persistence actions for coffeeCaf, coffeeDecaf, loadedNone
P 1: coffeeCaf, loadedCaf, coffeeDecaf, loadedDecaf, loadedNone, ¬loadedNone
A 2: loadCaf, loadDecaf, brewCaf, brewDecaf, unloadCaf, unloadDecaf, plus persistence actions for coffeeCaf, coffeeDecaf, loadedNone, loadedCaf, loadedDecaf
P 2: potCaf, potDecaf, coffeeCaf, loadedCaf, coffeeDecaf, loadedDecaf, loadedWaste, loadedNone, ¬loadedNone

10 Extract Solution
Graphplan starts to extract a solution iff
– All goal state fluents appear in a proposition level
– No pair of goal state fluents is mutex
Extract the solution
– Graphplan gives you a valid plan, but not necessarily an optimal one (with the minimum number of actions)
– Multiple actions can take place in one action level!

11 Partial-Order Planning (Problem 2)
Causal links
   Action A:  precond: …          effects: p(x), …
   Action B:  precond: p(y), …    effects: …
   A --p--> B!
Threats
   Action C:  precond: …          effects: ¬p(z), …
   C is a threat to the A --p--> B causal link!

12 Causal Links and Threats
Causal Links Example
   load(x) --loaded(x)--> brew(x)
Threats Example
   unload(x) could be a threat to the causal link above!

13 Demotion and Promotion
A --p(x)--> B is a causal link, C is a threat to this causal link
   Demotion: C before A before B
   Promotion: A before B before C
load(x) --loaded(x)--> brew(x) is a causal link, unload(x) is a threat to this causal link
   Demotion: unload(x 1 ), load(x 2 ), brew(x 3 )
      possible variable bindings: x 1 = waste, x 2 = x 3 = decaf
   Promotion: load(x 1 ), brew(x 2 ), unload(x 3 )
      possible variable bindings: x 1 = x 2 = decaf, x 3 = waste
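Demotion and promotion amount to a simple ordering test, sketched below. The example assumes ground steps and uses unload(decaf) as the threatening step, which is a choice made for illustration (the slide's bindings use waste for the unload step). Only orderings are checked here, not preconditions, and the names `LINK`, `DELETES`, and `link_is_safe` are hypothetical.

```python
from itertools import permutations

# Causal link as (producer, protected condition, consumer).
LINK = ("load(decaf)", "loaded(decaf)", "brew(decaf)")
# Delete effects of the candidate threat (illustrative ground step).
DELETES = {"unload(decaf)": {"loaded(decaf)"}}

def is_threat(step, link):
    """A step threatens a link if it deletes the protected condition."""
    return link[1] in DELETES.get(step, set())

def link_is_safe(order, link, threat):
    """True if the threat is demoted (before the producer) or promoted
    (after the consumer) in the given total order."""
    producer, _, consumer = link
    pos = {step: i for i, step in enumerate(order)}
    return pos[threat] < pos[producer] or pos[threat] > pos[consumer]

steps = ("load(decaf)", "brew(decaf)", "unload(decaf)")
safe_orders = [order for order in permutations(steps)
               if order.index(LINK[0]) < order.index(LINK[2])
               and link_is_safe(order, LINK, "unload(decaf)")]
for order in safe_orders:
    print(" < ".join(order))
```

Of the three orderings that keep load before brew, only the one that places unload between them is rejected.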

14 HTN (Problem 3)
Serve_two_things(t)
   task: serve_coffee_and_cake(t)
   precond: table(t)
   subtasks: serve(coffee,t), serve(cake,t)
Serve_coffee(x, t)
   task: serve(x,t)
   precond: coffee(x), table(t)
   subtasks: make-coffee(x), move(x, t)
Serve_cake(x, t)
   task: serve(x,t)
   precond: cake(x), table(t)
   subtasks: make-cake(x), move(x, t)

15 HTN (cont’d)
Make-Caf-Coffee(x, b, m)
   task: make-coffee(x)
   precond: bean(b), caf-bean(b), coffee-maker(m), coffee(x)
   subtasks: load(b, m), brew(b, m, x)
Make-Decaf-Coffee(x, b, m)
   task: make-coffee(x)
   precond: bean(b), decaf-bean(b), coffee-maker(m), coffee(x)
   subtasks: load(b, m), brew(b, m, x)
Load(b, m) [Primitive task!]
   precond: bean(b), coffee-maker(m), unloaded(m)
   effects: loaded(b, m)
Brew(b, m, x) [Primitive task!]
   precond: loaded(b, m), bean(b), coffee-maker(m)
   effects: coffee(x), in(x, m)

16 [Figure: HTN decomposition tree]
serve_coffee_and_cake(t 0 )
   Serve_two_things(t 0 )   [precond: table(t 0 )]
      serve(coffee)
         Serve_coffee(coffee, t 0 )   [precond: coffee(coffee), table(t 0 )]
            make-coffee(coffee)
               Make-Caf-Coffee(coffee, caf-bean, machine)
               Make-Decaf-Coffee(coffee, decaf-bean, machine)
            move(coffee, t 0 )
      serve(cake)
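The coffee branch of this decomposition can be enumerated with a short recursive sketch. It assumes ground task names, ignores preconditions entirely, and treats any task without a method (including move, which the slides do not define) as primitive; the `METHODS` table and `decompositions` function are illustrative names.

```python
# Ground methods for the coffee branch: task -> [(method name, subtasks)].
METHODS = {
    "serve(coffee, t0)": [
        ("Serve_coffee", ["make-coffee(coffee)", "move(coffee, t0)"]),
    ],
    "make-coffee(coffee)": [
        ("Make-Caf-Coffee",
         ["load(caf-bean, machine)", "brew(caf-bean, machine, coffee)"]),
        ("Make-Decaf-Coffee",
         ["load(decaf-bean, machine)", "brew(decaf-bean, machine, coffee)"]),
    ],
}

def decompositions(tasks):
    """Yield every sequence of primitive tasks reachable from the task list."""
    if not tasks:
        yield []
        return
    first, rest = tasks[0], tasks[1:]
    if first not in METHODS:
        # Assumption: a task with no method is primitive.
        for tail in decompositions(rest):
            yield [first] + tail
    else:
        # Try each applicable method in turn (preconditions are not checked).
        for _, subtasks in METHODS[first]:
            yield from decompositions(subtasks + rest)

plans = list(decompositions(["serve(coffee, t0)"]))
for plan in plans:
    print(plan)
```

The two methods for make-coffee are what produce two alternative primitive plans, one using the caf bean and one using the decaf bean.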

17 MDP (Problem 4) You are making a three-year investment plan. After some research, you find two companies you are interested in investing in: Boston Medicine and San Francisco Chips. Currently the stock price is $10 per share for Boston Medicine and $12 per share for San Francisco Chips. At the beginning of each year, you decide which company to invest in, and once you make the decision, you buy 1000 shares of that company. At the end of each year, you earn or lose money depending on whether the stock price of the company you invested in goes up or down.

18 MDP (Problem 4)
In particular, the stock prices change according to the following transition matrices:

For Boston Medicine:
                       End of Year Price: $5    End of Year Price: $10    End of Year Price: $15
Current Price: $5              40%                      40%                       20%
Current Price: $10             25%                      50%                       25%
Current Price: $15             30%                      40%                       30%

For San Francisco Chips:
                       End of Year Price: $10   End of Year Price: $12    End of Year Price: $14
Current Price: $10             20%                      60%                       20%
Current Price: $12             20%                      70%                       10%
Current Price: $14             15%                      60%                       25%

19 MDP (Problem 4)
States?
–
Actions?
– BM, SFC
Rewards?
– (currPriceBM - prevPriceBM) * 1000
– (currPriceSFC - prevPriceSFC) * 1000

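20 MDP (Problem 4)
Transitions? Each entry is P(next BM price | current BM price) * P(next SFC price | current SFC price). Sample rows:
From (BM $5, SFC $10):    0.4*0.2   0.4*0.6   0.4*0.2   0.4*0.2   0.4*0.6   0.4*0.2   0.2*0.2   0.2*0.6   0.2*0.2
From (BM $5, SFC $12):    0.4*0.2   0.4*0.7   0.4*0.1   0.4*0.2   0.4*0.7   0.4*0.1   0.2*0.2   0.2*0.7   0.2*0.1
From (BM $15, SFC $14):   0.3*0.15  0.3*0.6   0.3*0.25  0.4*0.15  0.4*0.6   0.4*0.25  0.3*0.15  0.3*0.6   0.3*0.25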
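The product form of the transitions and the reward definition can be combined into a short finite-horizon sketch. Several things here are assumptions rather than statements from the slides: the state is taken to be the pair (BM price, SFC price), both prices are assumed to evolve independently every year regardless of which stock is bought, the middle entry of BM's $5 row is taken as 0.4 so that the row sums to 1, and backward induction is one possible way to solve the three-year problem.

```python
from itertools import product

# Rows: current price -> {end-of-year price: probability}.
# Assumption: BM's $5 row is 0.4 / 0.4 / 0.2 so that it sums to 1.
BM = {5: {5: 0.4, 10: 0.4, 15: 0.2},
      10: {5: 0.25, 10: 0.5, 15: 0.25},
      15: {5: 0.3, 10: 0.4, 15: 0.3}}
SFC = {10: {10: 0.2, 12: 0.6, 14: 0.2},
       12: {10: 0.2, 12: 0.7, 14: 0.1},
       14: {10: 0.15, 12: 0.6, 14: 0.25}}
ACTIONS = ("BM", "SFC")
SHARES = 1000

def joint_row(state):
    """P(next state | state), assuming the two prices move independently."""
    b, s = state
    return {(b2, s2): BM[b][b2] * SFC[s][s2]
            for b2, s2 in product(BM[b], SFC[s])}

def reward(action, state, next_state):
    """Price change of the chosen stock times the number of shares."""
    i = 0 if action == "BM" else 1
    return (next_state[i] - state[i]) * SHARES

def expected_reward(action, state):
    return sum(p * reward(action, state, nxt)
               for nxt, p in joint_row(state).items())

def backward_induction(years):
    states = list(product(BM, SFC))
    value = {s: 0.0 for s in states}   # nothing is earned after the last year
    policy = []                        # policy[0] is the first year's rule
    for _ in range(years):
        q = {s: {a: sum(p * (reward(a, s, n) + value[n])
                        for n, p in joint_row(s).items())
                 for a in ACTIONS}
             for s in states}
        policy.insert(0, {s: max(q[s], key=q[s].get) for s in states})
        value = {s: max(q[s].values()) for s in states}
    return value, policy

start = (10, 12)
value, policy = backward_induction(3)
print(policy[0][start], value[start])
```

With one year remaining and prices at ($10, $12), the expected one-year reward under these numbers is about $0 for BM and about -$200 for SFC, so the last-year rule picks BM in that state.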

21 Logic-Based vs. Decision-Theory
Decision theory:
– Utilities (rewards)
– Uncertainties (transition probabilities)
– View the world as states
– Policy defines: given a state, which action to take
Logic based (propositional, PDDL):
– Goal state we want to reach
– Actions with preconditions and deterministic effects
– Factored state representation
– In HTNs, hierarchical representation of tasks

22 Which approach would you use?
What approach would you use to model each of the following planning problems? If both options seem reasonable, explain the advantages and limitations of each:
– Planning how your team should work on a class project
– Programming a robot that participates in RoboCup
– Deciding where to eat on campus every day

23 Other Questions
Assume that we wanted to model what to eat in the dining room every day using an MDP. We defined the states as the available options, and we defined rewards based on our food preferences, taking into account other considerations such as not wanting to eat the same food two days in a row.
– How would you go about defining the transition function?
– If we use an optimal algorithm like value iteration to solve our MDP, are we guaranteed to have the optimal policy?

