1
Assignment 1 Solutions
2
Problem 1
States: the positions of detective D1, detective D2, and the criminal C, each a cell of the 3x3 grid below.
Actions: the detectives' moves; a single MDP controls both detectives.
[Figure: 3x3 grid of cells (0)-(8); D1 in cell (0), the criminal C in cell (2), D2 in cell (3).]
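A minimal sketch of the resulting state and action spaces, assuming the state is the tuple (d1, d2, c) of cell indices and each detective chooses one of five moves (the move names are illustrative, not prescribed by the assignment):

```python
from itertools import product

CELLS = range(9)                              # 3x3 grid, cells numbered 0..8 row by row
MOVES = ["stay", "north", "south", "east", "west"]

# A state is (d1, d2, c); a joint action is one move per detective.
STATES = list(product(CELLS, CELLS, CELLS))   # 9 * 9 * 9 = 729 states
ACTIONS = list(product(MOVES, MOVES))         # 5 * 5 = 25 joint actions

print(len(STATES), len(ACTIONS))              # 729 25
```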
3
Problem 1 contd.
Transitions: explained by example.
–For the example action, the resulting state transitions are:
–0.9 for one successor state (0.8 for the "stay where you are" outcome, 0.05 for north, 0.05 for east)
–0.05 for another successor state (0.05 for south)
–0.05 for another successor state (0.05 for west)
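A sketch of the per-detective move noise implied by the numbers above, assuming the chosen move succeeds with probability 0.8, each other outcome occurs with probability 0.05, and a move off the grid leaves the detective in place (this noise model is reconstructed from the 0.8/0.05 figures, not stated explicitly on the slide):

```python
def step(cell, move):
    """Deterministic effect of one move on the 3x3 grid (row-major cells 0..8)."""
    row, col = divmod(cell, 3)
    if move == "north":
        row = max(row - 1, 0)
    elif move == "south":
        row = min(row + 1, 2)
    elif move == "west":
        col = max(col - 1, 0)
    elif move == "east":
        col = min(col + 1, 2)
    return row * 3 + col

def move_distribution(cell, intended):
    """Distribution over a detective's next cell: 0.8 intended, 0.05 each other outcome."""
    dist = {}
    for m in ["stay", "north", "south", "east", "west"]:
        nxt = step(cell, m)
        dist[nxt] = dist.get(nxt, 0.0) + (0.8 if m == intended else 0.05)
    return dist

# A detective in the top-right corner (cell 2) who chooses "stay" stays with
# probability 0.9, since the blocked north and east moves collapse onto staying.
print(move_distribution(2, "stay"))   # approx. {2: 0.9, 5: 0.05, 1: 0.05}
```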
4
Problem 1 contd.
Goal states: states where at least one detective has the same position as the criminal.
–Ex: (1,2,1), (5,1,1), etc.
The reward function will vary from person to person, but one possible reward function is:
–R(goal state) = 100
–R(goal state, *) = 0
–R(!(goal state), *) = -2
–Example: R([1,2,1]) = 100; R([1,2,1], *) = 0; R([1,2,3], *) = -2
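A sketch of the goal test and this reward function, assuming the state is (d1, d2, c) and treating R(s) with no action argument as the reward for arriving in s:

```python
def is_goal(state):
    d1, d2, c = state
    return d1 == c or d2 == c                  # at least one detective is on the criminal

def reward(state, action=None):
    if is_goal(state):
        return 100 if action is None else 0    # R(goal) = 100, R(goal, *) = 0
    return -2                                  # R(!(goal), *) = -2

print(is_goal((1, 2, 1)), reward((1, 2, 1)), reward((1, 2, 1), ("stay", "stay")))  # True 100 0
print(reward((1, 2, 3), ("stay", "stay")))                                         # -2
```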
5
Problem 2
Implement value iteration.
Provide policies given only the start state.
–For example, for (a) the start state is the configuration shown below. The best action from that state needs to be provided for T = 1; with the above reward function, value iteration gives this action.
–At T = 2, the best action is provided for each reachable state; for a goal state any action is fine.
–At T = 3, likewise: a goal state needs no particular action, and the best action is provided for the remaining states.
[Figure: 3x3 grid of cells (0)-(8); D1 and D2 both in cell (0), the criminal C in cell (2).]
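A minimal finite-horizon value iteration sketch for Problem 2. It assumes a transition(s, a) function that returns a {next_state: probability} dict (with goal states made absorbing) and the reward(s, a) above; it returns, for each number of remaining steps t, the best action from every state:

```python
def value_iteration(states, actions, transition, reward, horizon):
    V = {s: 0.0 for s in states}          # value with 0 steps remaining
    policy = {}                           # policy[t][s] = best action with t steps remaining
    for t in range(1, horizon + 1):
        newV, pi = {}, {}
        for s in states:
            best_a, best_q = None, float("-inf")
            for a in actions:
                q = reward(s, a) + sum(p * V[s2] for s2, p in transition(s, a).items())
                if q > best_q:
                    best_a, best_q = a, q
            newV[s], pi[s] = best_q, best_a
        V, policy[t] = newV, pi
    return V, policy

# Usage sketch: V, policy = value_iteration(STATES, ACTIONS, transition, reward, horizon=3)
# policy[1][s0] is then the reported best action from the start state s0 at T = 1, and so on.
```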
6
Problem 3
Calculate all paths (for the criminal) of length 5.
Find the average number of moves the detectives need to catch the criminal over the paths enumerated above.
–In the above MDP, the average was 2.4.
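A sketch of the path enumeration for Problem 3, assuming the criminal starts in cell 2 and may stay or move to a grid-adjacent cell each step (the start cell and move set are assumptions, not given on the slide):

```python
def neighbors(cell):
    """Cells reachable in one criminal move: the cell itself plus adjacent cells."""
    row, col = divmod(cell, 3)
    out = [cell]
    if row > 0: out.append(cell - 3)    # north
    if row < 2: out.append(cell + 3)    # south
    if col > 0: out.append(cell - 1)    # west
    if col < 2: out.append(cell + 1)    # east
    return out

def criminal_paths(start=2, length=5):
    """All cell sequences of `length` moves starting from `start`."""
    paths = [[start]]
    for _ in range(length):
        paths = [p + [nxt] for p in paths for nxt in neighbors(p[-1])]
    return paths

paths = criminal_paths()
print(len(paths))   # number of criminal paths of length 5 from cell 2

# For the average: run the detectives' policy against each enumerated path,
# record how many moves they need to catch the criminal, and take the mean
# (2.4 for the MDP above).
```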
7
Problem 4
It is not possible to define the reward (to accommodate the rule at T = 4) with the above state space.
The state space needs to be modified to include time.
–Without the additional state feature for time, the problem does not have the Markov property.
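A sketch of the suggested state-space change, adding the time step as an explicit state feature so the reward can depend on it (the horizon value 4 below is taken from the T = 4 rule and is otherwise an assumption):

```python
from itertools import product

CELLS, HORIZON = range(9), 4
# A state is now (d1, d2, c, t); the time feature restores the Markov property
# for the time-dependent rule, since R((d1, d2, c, t), a) can depend on t.
STATES_WITH_TIME = list(product(CELLS, CELLS, CELLS, range(HORIZON + 1)))
print(len(STATES_WITH_TIME))   # 9*9*9 * 5 = 3645 states
```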