
1 CS 188: Artificial Intelligence Fall 2009 Lecture 16: Bayes’ Nets III – Inference 10/20/2009 Dan Klein – UC Berkeley

2 Announcements
- Midterm THURSDAY EVENING
  - 6-9pm in 155 Dwinelle
  - One page (2 sides) of notes & a basic calculator OK
- Review session tonight, 6pm, 145 Dwinelle
- No section this week
- GSI office hours on the midterm prep page
- Assignments:
  - P2 in glookup (let us know if there are problems)
  - W3, P4 not until after the midterm

3 Project 2 Mini-Contest!
- Honorable mentions:
  - Arpad Kovacs, Alan Young (8 wins, average 2171)
  - Eui Shin (7 wins, average 2053)
  - Katie Bedrosian, Edward Lin (8 wins, average 2002)
- 2nd runner up (5 wins, average 2445): Eric Chang, Peter Currier
  - Depth-3 alpha-beta with a complex evaluation function that tried hard to eat ghosts. On games where they won, their score was very good.
- 1st runner up (8 wins, average score 2471): Nils Reimers
  - Expectimax with adjustable depth and an extremely conservative evaluation function (losing had -inf utility and winning had +inf), so the agent was pretty good at consistently winning.
- Winner (7 wins, average score 3044): James Ide, Xingyan Liu
  - Proper treatment of the ghosts' chance nodes. Elaborate evaluation function to detect whether Pacman was trapped by ghosts and whether to get a power pellet or run. [DEMO]

4 Inference
- Inference: calculating some useful quantity from a joint probability distribution
- Examples:
  - Posterior probability: P(Q | E1 = e1, ..., Ek = ek)
  - Most likely explanation: argmax_q P(Q = q | E1 = e1, ...)
- [Figure: the burglary Bayes' net with nodes B, E, A, J, M]

5 Inference by Enumeration
- Given unlimited time, inference in BNs is easy
- Recipe:
  - State the marginal probabilities you need
  - Figure out ALL the atomic probabilities you need
  - Calculate and combine them
- Example: [worked on the burglary net with nodes B, E, A, J, M]

6 Example: Enumeration
- In this simple method, we only need the BN to synthesize the joint entries
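
To make the recipe concrete, here is a minimal Python sketch (not from the original slides; the dict-of-tuples encoding and variable names are assumptions of mine) that synthesizes every joint entry for the traffic domain introduced on slide 12 below and reads off P(L):

```python
# Inference by enumeration on the traffic domain (CPT values from slide 12).
P_R = {'+r': 0.1, '-r': 0.9}
P_T_given_R = {('+r', '+t'): 0.8, ('+r', '-t'): 0.2,
               ('-r', '+t'): 0.1, ('-r', '-t'): 0.9}
P_L_given_T = {('+t', '+l'): 0.3, ('+t', '-l'): 0.7,
               ('-t', '+l'): 0.1, ('-t', '-l'): 0.9}

# Step 1: synthesize every atomic probability P(r, t, l) from the local CPTs.
joint = {}
for r in ('+r', '-r'):
    for t in ('+t', '-t'):
        for l in ('+l', '-l'):
            joint[(r, t, l)] = P_R[r] * P_T_given_R[(r, t)] * P_L_given_T[(t, l)]

# Step 2: combine the atomic probabilities you need -- here, the marginal P(L).
P_L = {}
for (r, t, l), p in joint.items():
    P_L[l] = P_L.get(l, 0.0) + p

print({l: round(p, 3) for l, p in P_L.items()})   # {'+l': 0.134, '-l': 0.866}
```

The cost is visible here: the joint has 2^3 entries for three variables and would have 2^n in general, which is exactly what variable elimination (below) avoids building.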

7 Inference by Enumeration?

8 Variable Elimination
- Why is inference by enumeration so slow?
  - You join up the whole joint distribution before you sum out the hidden variables
  - You end up repeating a lot of work!
- Idea: interleave joining and marginalizing!
  - Called "Variable Elimination"
  - Still NP-hard, but usually much faster than inference by enumeration
- We'll need some new notation to define VE

9 Factor Zoo I
- Joint distribution: P(X, Y)
  - Entries P(x, y) for all x, y
  - Sums to 1
- Selected joint: P(x, Y)
  - A slice of the joint distribution
  - Entries P(x, y) for fixed x, all y
  - Sums to P(x)

  P(T, W):              Selected P(cold, W):
  T     W      P        T     W      P
  hot   sun    0.4      cold  sun    0.2
  hot   rain   0.1      cold  rain   0.3
  cold  sun    0.2
  cold  rain   0.3
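
As a small aside (not part of the lecture), the joint and selected joint above can be held as plain Python dicts; selecting T = cold is just filtering keys:

```python
# Joint distribution P(T, W) from the table above; sums to 1.
P_TW = {('hot', 'sun'): 0.4, ('hot', 'rain'): 0.1,
        ('cold', 'sun'): 0.2, ('cold', 'rain'): 0.3}

# Selected joint P(cold, W): keep only the entries with T fixed to 'cold'.
P_cold_W = {tw: p for tw, p in P_TW.items() if tw[0] == 'cold'}

print(round(sum(P_TW.values()), 10))      # 1.0  (a joint sums to 1)
print(P_cold_W)                           # {('cold', 'sun'): 0.2, ('cold', 'rain'): 0.3}
print(round(sum(P_cold_W.values()), 10))  # 0.5  (sums to P(cold))
```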

10 Factor Zoo II
- Family of conditionals: P(X | Y)
  - Multiple conditionals
  - Entries P(x | y) for all x, y
  - Sums to |Y|
- Single conditional: P(Y | x)
  - Entries P(y | x) for fixed x, all y
  - Sums to 1

  P(W | T):             Single conditional P(W | cold):
  T     W      P        T     W      P
  hot   sun    0.8      cold  sun    0.4
  hot   rain   0.2      cold  rain   0.6
  cold  sun    0.4
  cold  rain   0.6

11 Factor Zoo III
- Specified family: P(y | X)
  - Entries P(y | x) for fixed y, but for all x
  - Sums to ... who knows!
- In general, when we write P(Y1 ... YN | X1 ... XM)
  - It is a "factor," a multi-dimensional array
  - Its values are all P(y1 ... yN | x1 ... xM)
  - Any assigned X or Y is a dimension missing (selected) from the array

  Specified family P(rain | T):
  T     W      P
  hot   rain   0.2
  cold  rain   0.6
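
A throwaway numeric check of the three "sums to" claims on these two slides, using the same W-given-T numbers (my own sketch, not course code):

```python
# Family of conditionals P(W | T): each row (each value of T) sums to 1,
# so the whole table sums to the number of T values, here 2.
P_W_given_T = {('hot', 'sun'): 0.8, ('hot', 'rain'): 0.2,
               ('cold', 'sun'): 0.4, ('cold', 'rain'): 0.6}
print(sum(P_W_given_T.values()))   # 2.0

# Single conditional P(W | T=cold): sums to 1.
print(0.4 + 0.6)                   # 1.0

# Specified family P(rain | T): one entry per value of T; sums to... who knows.
print(0.2 + 0.6)                   # 0.8
```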

12 Example: Traffic Domain
- Random Variables
  - R: Raining
  - T: Traffic
  - L: Late for class!
- Network structure: R -> T -> L
- First query: P(L)

  P(R):        P(T | R):        P(L | T):
  +r  0.1      +r +t  0.8       +t +l  0.3
  -r  0.9      +r -t  0.2       +t -l  0.7
               -r +t  0.1       -t +l  0.1
               -r -t  0.9       -t -l  0.9

13 Variable Elimination Outline
- Track objects called factors
- Initial factors are local CPTs (one per node)
- Any known values are selected
  - E.g. if we know L = +l, the initial factors are selected as shown below
- VE: Alternately join factors and eliminate variables

  Initial factors (no evidence):
  P(R):        P(T | R):        P(L | T):
  +r  0.1      +r +t  0.8       +t +l  0.3
  -r  0.9      +r -t  0.2       +t -l  0.7
               -r +t  0.1       -t +l  0.1
               -r -t  0.9       -t -l  0.9

  With evidence L = +l, the last factor is selected:
  P(R):        P(T | R):        P(+l | T):
  +r  0.1      +r +t  0.8       +t +l  0.3
  -r  0.9      +r -t  0.2       -t +l  0.1
               -r +t  0.1
               -r -t  0.9

14 Operation 1: Join Factors
- First basic operation: joining factors
- Combining factors:
  - Just like a database join
  - Get all factors over the joining variable
  - Build a new factor over the union of the variables involved
- Example: Join on R
- Computation for each entry: pointwise products, e.g. P(+r, +t) = P(+r) * P(+t | +r)

  P(R):        P(T | R):        Join on R -> P(R, T):
  +r  0.1      +r +t  0.8       +r +t  0.08
  -r  0.9      +r -t  0.2       +r -t  0.02
               -r +t  0.1       -r +t  0.09
               -r -t  0.9       -r -t  0.81
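
One possible implementation of this join as a pointwise product, using the same dict-of-tuples factor encoding as the enumeration sketch above (an illustration, not the course's implementation):

```python
# Join P(R) and P(T | R) on the shared variable R.
P_R = {'+r': 0.1, '-r': 0.9}
P_T_given_R = {('+r', '+t'): 0.8, ('+r', '-t'): 0.2,
               ('-r', '+t'): 0.1, ('-r', '-t'): 0.9}

# New factor over the union of variables {R, T}: each entry is a pointwise
# product, e.g. P(+r, +t) = P(+r) * P(+t | +r).
P_RT = {(r, t): P_R[r] * p for (r, t), p in P_T_given_R.items()}

print({k: round(v, 2) for k, v in P_RT.items()})
# {('+r', '+t'): 0.08, ('+r', '-t'): 0.02, ('-r', '+t'): 0.09, ('-r', '-t'): 0.81}
```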

15 Operation 1: Join Factors
- In general, we join on a variable
  - Take all factors mentioning that variable
  - Join them all together with pointwise products
  - Result is P(all LHS vars | all non-LHS vars)
  - Leave other factors alone
- Example: Join on T, combining P(T | R) and P(L | T) into P(T, L | R) and leaving P(R) alone

  P(R):        P(T | R):        P(L | T):        Join on T -> P(T, L | R):
  +r  0.1      +r +t  0.8       +t +l  0.3       +r +t +l  0.24
  -r  0.9      +r -t  0.2       +t -l  0.7       +r +t -l  0.56
               -r +t  0.1       -t +l  0.1       +r -t +l  0.02
               -r -t  0.9       -t -l  0.9       +r -t -l  0.18
                                                 -r +t +l  0.03
                                                 -r +t -l  0.07
                                                 -r -t +l  0.09
                                                 -r -t -l  0.81

16 Example: Multiple Joins
- Join on R: P(R) and P(T | R) combine into P(R, T); P(L | T) is left alone

  Before:
  P(R):        P(T | R):        P(L | T):
  +r  0.1      +r +t  0.8       +t +l  0.3
  -r  0.9      +r -t  0.2       +t -l  0.7
               -r +t  0.1       -t +l  0.1
               -r -t  0.9       -t -l  0.9

  After joining on R:
  P(R, T):        P(L | T):
  +r +t  0.08     +t +l  0.3
  +r -t  0.02     +t -l  0.7
  -r +t  0.09     -t +l  0.1
  -r -t  0.81     -t -l  0.9

17 Example: Multiple Joins
- Join on T: P(R, T) and P(L | T) combine into P(R, T, L)

  Before:                          After joining on T:
  P(R, T):        P(L | T):        P(R, T, L):
  +r +t  0.08     +t +l  0.3       +r +t +l  0.024
  +r -t  0.02     +t -l  0.7       +r +t -l  0.056
  -r +t  0.09     -t +l  0.1       +r -t +l  0.002
  -r -t  0.81     -t -l  0.9       +r -t -l  0.018
                                   -r +t +l  0.027
                                   -r +t -l  0.063
                                   -r -t +l  0.081
                                   -r -t -l  0.729

18 Operation 2: Eliminate
- Second basic operation: marginalization
  - Take a factor and sum out a variable
  - Shrinks a factor to a smaller one
  - A projection operation
- Example: sum R out of P(R, T) to get P(T)

  P(R, T):         Sum out R -> P(T):
  +r +t  0.08      +t  0.17
  +r -t  0.02      -t  0.83
  -r +t  0.09
  -r -t  0.81
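
The matching sketch for summing a variable out, under the same assumed dict encoding (again an illustration, not course code):

```python
# Sum R out of the factor P(R, T): project each (r, t) entry onto t alone.
P_RT = {('+r', '+t'): 0.08, ('+r', '-t'): 0.02,
        ('-r', '+t'): 0.09, ('-r', '-t'): 0.81}

P_T = {}
for (r, t), p in P_RT.items():
    P_T[t] = P_T.get(t, 0.0) + p   # add together the entries that agree on T

print({t: round(p, 2) for t, p in P_T.items()})   # {'+t': 0.17, '-t': 0.83}
```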

19 Multiple Elimination
- Sum out R, then sum out T

  P(R, T, L):          Sum out R -> P(T, L):      Sum out T -> P(L):
  +r +t +l  0.024      +t +l  0.051               +l  0.134
  +r +t -l  0.056      +t -l  0.119               -l  0.866
  +r -t +l  0.002      -t +l  0.083
  +r -t -l  0.018      -t -l  0.747
  -r +t +l  0.027
  -r +t -l  0.063
  -r -t +l  0.081
  -r -t -l  0.729

20 P(L): Marginalizing Early!
- Join on R, then sum out R right away

  Initial factors:
  P(R):        P(T | R):        P(L | T):
  +r  0.1      +r +t  0.8       +t +l  0.3
  -r  0.9      +r -t  0.2       +t -l  0.7
               -r +t  0.1       -t +l  0.1
               -r -t  0.9       -t -l  0.9

  Join on R:
  P(R, T):        P(L | T):
  +r +t  0.08     +t +l  0.3
  +r -t  0.02     +t -l  0.7
  -r +t  0.09     -t +l  0.1
  -r -t  0.81     -t -l  0.9

  Sum out R:
  P(T):        P(L | T):
  +t  0.17     +t +l  0.3
  -t  0.83     +t -l  0.7
               -t +l  0.1
               -t -l  0.9

21 Marginalizing Early (aka VE*)
- Join on T, then sum out T
- * VE is variable elimination

  P(T):        P(L | T):        Join on T -> P(T, L):      Sum out T -> P(L):
  +t  0.17     +t +l  0.3       +t +l  0.051               +l  0.134
  -t  0.83     +t -l  0.7       +t -l  0.119               -l  0.866
               -t +l  0.1       -t +l  0.083
               -t -l  0.9       -t -l  0.747
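
Putting the two operations together in this marginalize-early order (join on R, sum out R, join on T, sum out T); the sketch below reproduces the P(L) numbers above and never builds a factor over more than two variables:

```python
P_R = {'+r': 0.1, '-r': 0.9}
P_T_given_R = {('+r', '+t'): 0.8, ('+r', '-t'): 0.2,
               ('-r', '+t'): 0.1, ('-r', '-t'): 0.9}
P_L_given_T = {('+t', '+l'): 0.3, ('+t', '-l'): 0.7,
               ('-t', '+l'): 0.1, ('-t', '-l'): 0.9}

# Join on R and immediately sum R out: the 0.17 / 0.83 factor over T.
P_T = {}
for (r, t), p in P_T_given_R.items():
    P_T[t] = P_T.get(t, 0.0) + P_R[r] * p

# Join on T and immediately sum T out: the answer P(L).
P_L = {}
for (t, l), p in P_L_given_T.items():
    P_L[l] = P_L.get(l, 0.0) + P_T[t] * p

print({l: round(p, 3) for l, p in P_L.items()})   # {'+l': 0.134, '-l': 0.866}
```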

22 Evidence
- If evidence, start with factors that select that evidence
- No evidence uses these initial factors:

  P(R):        P(T | R):        P(L | T):
  +r  0.1      +r +t  0.8       +t +l  0.3
  -r  0.9      +r -t  0.2       +t -l  0.7
               -r +t  0.1       -t +l  0.1
               -r -t  0.9       -t -l  0.9

- Computing P(L | +r), the initial factors become:

  P(+r):        P(T | +r):        P(L | T):
  +r  0.1       +r +t  0.8        +t +l  0.3
                +r -t  0.2        +t -l  0.7
                                  -t +l  0.1
                                  -t -l  0.9

- We eliminate all vars other than query + evidence
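
Selecting evidence is just dropping the factor rows that disagree with it; a small sketch with a hypothetical `select` helper of my own (not a course API), restricting the initial factors to R = +r:

```python
P_R = {('+r',): 0.1, ('-r',): 0.9}
P_T_given_R = {('+r', '+t'): 0.8, ('+r', '-t'): 0.2,
               ('-r', '+t'): 0.1, ('-r', '-t'): 0.9}

def select(factor, position, value):
    """Keep only the entries whose assignment has `value` at `position`."""
    return {k: p for k, p in factor.items() if k[position] == value}

print(select(P_R, 0, '+r'))          # {('+r',): 0.1}
print(select(P_T_given_R, 0, '+r'))  # {('+r', '+t'): 0.8, ('+r', '-t'): 0.2}
```

The P(L | T) factor does not mention R, so it is left untouched.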

23 Evidence II
- Result will be a selected joint of query and evidence
  - E.g. for P(L | +r), we'd end up with:

  P(+r, L):          Normalize -> P(L | +r):
  +r +l  0.026       +l  0.26
  +r -l  0.074       -l  0.74

- To get our answer, just normalize this!
- That's it!
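
The final normalization is a one-line division over the 0.026 / 0.074 entries from this slide:

```python
# Selected joint P(+r, L); normalize it to get the answer P(L | +r).
P_r_L = {'+l': 0.026, '-l': 0.074}
total = sum(P_r_L.values())                                 # this is P(+r) = 0.1
print({l: round(p / total, 2) for l, p in P_r_L.items()})   # {'+l': 0.26, '-l': 0.74}
```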

24 General Variable Elimination
- Query: P(Q | E1 = e1, ..., Ek = ek)
- Start with initial factors:
  - Local CPTs (but instantiated by evidence)
- While there are still hidden variables (not Q or evidence):
  - Pick a hidden variable H
  - Join all factors mentioning H
  - Eliminate (sum out) H
- Join all remaining factors and normalize (a runnable sketch of this loop follows below)
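
Here is a compact sketch of this whole loop (dict-of-tuples factors; the helper names `join`, `sum_out`, and `variable_elimination` are mine, not the project's code), run on the traffic query P(L | +r):

```python
from itertools import product

def join(f1, f2):
    """Pointwise product of two factors over the union of their variables.
    A factor is (variables, table), where table maps value tuples to numbers."""
    vars1, t1 = f1
    vars2, t2 = f2
    out_vars = vars1 + tuple(v for v in vars2 if v not in vars1)
    domains = {}
    for vs, table in ((vars1, t1), (vars2, t2)):
        for key in table:
            for v, val in zip(vs, key):
                domains.setdefault(v, set()).add(val)
    out = {}
    for assignment in product(*(sorted(domains[v]) for v in out_vars)):
        a = dict(zip(out_vars, assignment))
        k1 = tuple(a[v] for v in vars1)
        k2 = tuple(a[v] for v in vars2)
        if k1 in t1 and k2 in t2:            # rows removed by evidence just drop out
            out[assignment] = t1[k1] * t2[k2]
    return (out_vars, out)

def sum_out(var, factor):
    """Eliminate (sum out) one variable from a factor."""
    vars_, table = factor
    i = vars_.index(var)
    out_vars = vars_[:i] + vars_[i + 1:]
    out = {}
    for key, p in table.items():
        new_key = key[:i] + key[i + 1:]
        out[new_key] = out.get(new_key, 0.0) + p
    return (out_vars, out)

def variable_elimination(factors, hidden):
    """For each hidden variable: join all factors mentioning it, sum it out.
    Then join whatever remains and normalize."""
    factors = list(factors)
    for h in hidden:
        mentioning = [f for f in factors if h in f[0]]
        others = [f for f in factors if h not in f[0]]
        joined = mentioning[0]
        for f in mentioning[1:]:
            joined = join(joined, f)
        factors = others + [sum_out(h, joined)]
    result = factors[0]
    for f in factors[1:]:
        result = join(result, f)
    total = sum(result[1].values())
    return (result[0], {k: round(v / total, 4) for k, v in result[1].items()})

# Local CPTs for the traffic domain, instantiated by the evidence R = +r.
f_R  = (('R',), {('+r',): 0.1})
f_TR = (('R', 'T'), {('+r', '+t'): 0.8, ('+r', '-t'): 0.2})
f_LT = (('T', 'L'), {('+t', '+l'): 0.3, ('+t', '-l'): 0.7,
                     ('-t', '+l'): 0.1, ('-t', '-l'): 0.9})

print(variable_elimination([f_R, f_TR, f_LT], hidden=['R', 'T']))
# (('L',), {('+l',): 0.26, ('-l',): 0.74})
```

The elimination order is what controls the cost: eliminating R and then T keeps every intermediate factor at most two variables wide, whereas a bad order can create factors nearly as large as the full joint.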

25 Variable Elimination = Bayes' Rule
- Start with P(B) and P(A | B); select the evidence +a, join on B, then normalize

  P(B):        P(A | B):        Select +a:        Join on B -> P(+a, B):      Normalize -> P(B | +a):
  +b  0.1      +b +a  0.8       +b  0.8           +a +b  0.08                 +b  8/17
  -b  0.9      +b -a  0.2       -b  0.1           +a -b  0.09                 -b  9/17
               -b +a  0.1
               -b -a  0.9
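
The same three steps as bare arithmetic, checking the numbers on this slide (a throwaway snippet):

```python
# Start / Select: P(B) and the selected evidence column P(+a | B).
P_B = {'+b': 0.1, '-b': 0.9}
P_a_given_B = {'+b': 0.8, '-b': 0.1}

# Join on B: P(+a, B) = P(B) * P(+a | B)  ->  roughly {'+b': 0.08, '-b': 0.09}.
P_aB = {b: P_B[b] * P_a_given_B[b] for b in P_B}

# Normalize by P(+a) = 0.17: exactly Bayes' rule, giving 8/17 and 9/17.
total = sum(P_aB.values())
print({b: round(p / total, 4) for b, p in P_aB.items()})   # {'+b': 0.4706, '-b': 0.5294}
```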

26 Example
- Choose A: join all factors mentioning A, then sum A out

27 Example
- Choose E: join all factors mentioning E, then sum E out
- Finish with B: join the remaining factors
- Normalize

28 Variable Elimination
- What you need to know:
  - Should be able to run it on small examples, understand the factor creation / reduction flow
- Better than enumeration: saves time by marginalizing variables as soon as possible rather than at the end
- We will see special cases of VE later
  - On tree-structured graphs, variable elimination runs in polynomial time, like tree-structured CSPs
  - You'll have to implement a tree-structured special case to track invisible ghosts (Project 4)

