CS344: Introduction to Artificial Intelligence
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
Lecture 22: Forward Probability and Robot Plan
Robotic Blocks World
START: on(B, table), on(A, table), on(C, A), hand empty, clear(C), clear(B)
GOAL: on(C, table), on(B, C), on(A, B), hand empty, clear(A)
[Figure: robot hand above the START and GOAL block configurations]
Mapping the problem to a probabilistic framework
- Exhaustively enumerate the states
- Enumerate the operators
- Define the transition probabilities $P(O_k, s_j \mid s_i)$: the probability of going from state $s_i$ to state $s_j$ with output $O_k$, which can be a robotic action
State Transition
START → unstack(C), putdown(C) → pickup(B), stack(B, A) → pickup(C), stack(C, B) → GOAL
[Figure: robot hand and block configurations after each pair of operators]
States for the Blocks World problem
Total: 22 states
Hand empty (13 states)
- no column: 1 state
- 2-block column: 6 states
- 3-block column: 6 states
Hand holding a block (9 states)
- block A in hand: no column (1 state), 2-block column (2 states)
- block B in hand: no column (1 state), 2-block column (2 states)
- block C in hand: no column (1 state), 2-block column (2 states)
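The count of 22 can be checked by brute force. The sketch below is only an illustration: the state encoding (a held block plus a set of stacks, each written bottom first) and the names `table_layouts` and `enumerate_states` are assumptions made here, not part of the lecture.

```python
from itertools import permutations, product

BLOCKS = ("A", "B", "C")

def table_layouts(blocks):
    """Return every way to arrange `blocks` into stacks on the table.

    A layout is a frozenset of tuples; each tuple is one stack, bottom first.
    (Hypothetical encoding chosen for this sketch.)
    """
    blocks = tuple(blocks)
    if not blocks:
        return {frozenset()}
    layouts = set()
    for order in permutations(blocks):
        # Each gap between neighbours either starts a new stack (cut) or not.
        for cuts in product((False, True), repeat=len(order) - 1):
            stacks, current = [], [order[0]]
            for block, cut in zip(order[1:], cuts):
                if cut:
                    stacks.append(tuple(current))
                    current = [block]
                else:
                    current.append(block)
            stacks.append(tuple(current))
            # The frozenset removes duplicates that differ only in stack order.
            layouts.add(frozenset(stacks))
    return layouts

def enumerate_states(blocks=BLOCKS):
    """Return (held_block_or_None, layout) pairs for every state."""
    states = [(None, layout) for layout in table_layouts(blocks)]
    for held in blocks:
        rest = [b for b in blocks if b != held]
        states += [(held, layout) for layout in table_layouts(rest)]
    return states

states = enumerate_states()
hand_empty = [s for s in states if s[0] is None]
print(len(hand_empty), len(states))
```

With three blocks this yields 13 hand-empty states and 9 holding states, matching the breakdown on the slide.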
State space and operators
State space = {s1, s2, …, s22}
Operators (15 in total)
- pick up: pick up A (PA), pick up B (PB), pick up C (PC)
- put down: put down A (DA), put down B (DB), put down C (DC)
- stack(x, y): 6 operators, i.e. TAB, TBA, TCB, TBC, TCA, TAC
- unstack(x): UA, UB, UC
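As a quick check on the operator count, the abbreviations can be generated mechanically. This is a minimal sketch; the list names are chosen here for illustration, while the abbreviations themselves (P, D, T, U followed by block letters) follow the slide.

```python
from itertools import permutations

BLOCKS = ("A", "B", "C")

pickup = [f"P{x}" for x in BLOCKS]                         # PA, PB, PC
putdown = [f"D{x}" for x in BLOCKS]                        # DA, DB, DC
stack = [f"T{x}{y}" for x, y in permutations(BLOCKS, 2)]   # TAB, TAC, TBA, ...
unstack = [f"U{x}" for x in BLOCKS]                        # UA, UB, UC

OPERATORS = pickup + putdown + stack + unstack
print(len(OPERATORS), OPERATORS)
```

The six stack operators are exactly the ordered pairs of distinct blocks, which is why `permutations(BLOCKS, 2)` is used rather than combinations.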
Probabilistic Automaton
This gives a probabilistic automaton in which a probability value is specified between every pair of states for each operator. We need to learn a total of $\binom{22}{2}$ (state pairs) × 15 (operators) different probability values, e.g., $P(\mathrm{PA}, s_2 \mid s_1) = 0.3$, $P(\mathrm{DC}, s_5 \mid s_2)$, …
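To make the size of the learning problem concrete, the sketch below evaluates the count quoted on the slide and shows one possible way to store the learned values. The dictionary layout, the helper name `prob`, and the choice to treat unlisted entries as 0.0 are assumptions for illustration only.

```python
from math import comb

NUM_STATES = 22
NUM_OPERATORS = 15

# Count quoted on the slide: state pairs times operators.
num_values = comb(NUM_STATES, 2) * NUM_OPERATORS
print(num_values)

# Hypothetical sparse table keyed by (operator, next_state, current_state).
# Only the one value given on the slide is filled in.
transition_prob = {("PA", "s2", "s1"): 0.3}

def prob(op, s_next, s_cur, table=transition_prob):
    """Look up P(op, s_next | s_cur); unlisted entries default to 0.0 (assumption)."""
    return table.get((op, s_next, s_cur), 0.0)

print(prob("PA", "s2", "s1"))
```

A sparse dictionary is a natural fit here because most operator and state combinations are physically impossible and would carry zero probability.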
Formula for Operator Sequence Probability
The Forward Algorithm calculates the probability of an operator sequence, e.g. seq = UC DC PB TBA PC TCB. Marginalizing over the 6th state (the probability of seq with 6th state $= s_i$):
$$P(\text{seq}) = P(\text{UC DC PB TBA PC TCB}) = \sum_{i=0}^{21} P(\text{UC DC PB TBA PC TCB},\ s_6 = s_i)$$
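The joint probabilities inside the sum need not be computed by enumerating every state path. A standard way to obtain them is the forward recursion, sketched below. The symbols $\alpha_t(j)$ (probability of the first $t$ operators with the automaton ending in $s_j$), $O_t$ (the $t$-th operator of seq), and a known start state $s_{\text{start}}$ are notation assumed here rather than taken from the slide; sums run over all 22 states.

```latex
\begin{align*}
\alpha_0(j) &= \begin{cases} 1 & \text{if } s_j = s_{\text{start}} \\ 0 & \text{otherwise} \end{cases} \\
\alpha_t(j) &= \sum_{i} \alpha_{t-1}(i)\, P(O_t, s_j \mid s_i), \qquad t = 1, \dots, 6 \\
P(\text{seq}) &= \sum_{j} \alpha_6(j)
\end{align*}
```

Each step costs one pass over all state pairs, so the work grows linearly with the length of the operator sequence instead of exponentially.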
Back to HMM
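A Simple HMM
[Figure: two-state HMM with states q and r; each arc is labelled with an output symbol and its probability]
q → q   a: 0.4, b: 0.2
q → r   a: 0.3, b: 0.1
r → q   a: 0.2, b: 0.1
r → r   a: 0.2, b: 0.5
(The assignment of labels to arcs is the one consistent with the forward probabilities tabulated for "bbba". The a: 0.4 value on q → q is not among the labels legible in the figure and is inferred from that table.)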
The forward probabilities of "bbba"

Time tick   1      2      3      4       5
Input       ε      b      bb     bbb     bbba
q           1.0    0.2    0.05   0.017   0.0148
r           0.0    0.1    0.07   0.04    0.0131
P(w,t)             0.3    0.12   0.057   0.0279
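The table can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: q is taken as the start state, and the arc probabilities in `ARCS` are the values consistent with the tabulated numbers (in particular, a: 0.4 on q → q is inferred from the table, not read off the figure). The names `ARCS` and `forward` are chosen here for illustration.

```python
STATES = ("q", "r")

# P(symbol, next_state | current_state), keyed as (current, symbol, next).
# Assumption: values chosen to be consistent with the forward table above.
ARCS = {
    ("q", "a", "q"): 0.4, ("q", "b", "q"): 0.2,
    ("q", "a", "r"): 0.3, ("q", "b", "r"): 0.1,
    ("r", "a", "q"): 0.2, ("r", "b", "q"): 0.1,
    ("r", "a", "r"): 0.2, ("r", "b", "r"): 0.5,
}

def forward(word, start="q"):
    """Return one {state: forward probability} row per time tick."""
    alpha = {s: 1.0 if s == start else 0.0 for s in STATES}
    rows = [alpha]
    for symbol in word:
        # New value for each state: sum over all predecessor states.
        alpha = {
            nxt: sum(alpha[cur] * ARCS[(cur, symbol, nxt)] for cur in STATES)
            for nxt in STATES
        }
        rows.append(alpha)
    return rows

for tick, row in enumerate(forward("bbba"), start=1):
    total = sum(row.values())  # P(w, t): probability of the prefix seen so far
    print(tick, round(row["q"], 4), round(row["r"], 4), round(total, 4))
```

The last row sums to the probability of the whole string "bbba", which is the bottom-right entry of the table.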