
1 QUIZ!!
 T/F: Forward sampling is consistent. True
 T/F: Rejection sampling is faster, but inconsistent. False
 T/F: Rejection sampling requires less memory than forward sampling. True
 T/F: In likelihood weighted sampling you make use of every single sample. True
 T/F: Forward sampling samples from the joint distribution. True
 T/F: L.W.S. only speeds up inference for r.v.’s downstream of evidence. True
 T/F: Too few samples could result in division by zero. True
xkcd

2 CSE511a: Artificial Intelligence, Fall 2013
Lecture 18: Decision Diagrams
04/03/2013
Robert Pless, via Kilian Q. Weinberger
Several slides adapted from Dan Klein, UC Berkeley

3 Announcements  Class Monday April 8 cancelled in lieu of a talk by Sanmay Das: “Bayesian Reinforcement Learning With Censored Observations and Applications in Market Modeling and Design,” Friday, April 5, 11-12, Lopata 101.

4 Sampling Example  There are 2 cups.  The first contains 1 penny and 1 quarter  The second contains 2 quarters  Say I pick a cup uniformly at random, then pick a coin randomly from that cup. It's a quarter (yes!). What is the probability that the other coin in that cup is also a quarter?
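
The puzzle can be checked by simulation. Below is a minimal Monte Carlo sketch (the cup contents are as stated on the slide; the function names are ours):

```python
import random

# Two cups: cup 1 = {penny, quarter}, cup 2 = {quarter, quarter}.
CUPS = [['penny', 'quarter'], ['quarter', 'quarter']]

def draw():
    """Pick a cup uniformly at random, then a coin uniformly from it."""
    cup = random.choice(CUPS)[:]        # copy so shuffle doesn't mutate CUPS
    random.shuffle(cup)
    return cup[0], cup[1]               # (coin drawn, coin left in the cup)

def estimate(n=100_000):
    hits = trials = 0
    for _ in range(n):
        drawn, other = draw()
        if drawn == 'quarter':          # condition on having drawn a quarter
            trials += 1
            hits += other == 'quarter'
    return hits / trials                # converges to 2/3

print(estimate())
```

Note that this is exactly rejection sampling: every trial where the drawn coin is a penny is thrown away.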

5 Recap: Likelihood Weighting  Sampling distribution if z is sampled and e is fixed evidence  Now, samples have weights  Together, the weighted sampling distribution is consistent (figure: the Cloudy/Sprinkler/Rain/WetGrass network)

6 Likelihood Weighting

P(C):        +c 0.5 | -c 0.5
P(S | C):    +c: +s 0.1, -s 0.9 | -c: +s 0.5, -s 0.5
P(R | C):    +c: +r 0.8, -r 0.2 | -c: +r 0.2, -r 0.8
P(W | S,R):  +s,+r: +w 0.99, -w 0.01
             +s,-r: +w 0.90, -w 0.10
             -s,+r: +w 0.90, -w 0.10
             -s,-r: +w 0.01, -w 0.99

Samples: +c, +s, +r, +w ...
(figure: the Cloudy, Sprinkler, Rain, WetGrass network)

7 Likelihood Weighting Example

 Inference:
 Sum over weights that match query value
 Divide by total sample weight
 What is P(C | +w, +s)?

Cloudy | Rainy | Sprinkler | WetGrass | Weight
  0    |  1    |    1      |   1      | 0.495
  0    |  0    |    1      |   1      | 0.45
  0    |  0    |    1      |   1      | 0.45
  0    |  0    |    1      |   1      | 0.45
  1    |  0    |    1      |   1      | 0.09
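
A minimal likelihood-weighting sketch for this query, using the CPTs from the previous slide (function names are ours):

```python
import random

# CPTs of the Cloudy/Sprinkler/Rain/WetGrass network from the slides.
P_c = 0.5
P_s = {True: 0.1, False: 0.5}            # P(+s | C)
P_r = {True: 0.8, False: 0.2}            # P(+r | C)
P_w = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.01}  # P(+w | S, R)

def lw_sample():
    """One likelihood-weighted sample with evidence S=+s, W=+w."""
    w = 1.0
    c = random.random() < P_c            # sample Cloudy from its prior
    w *= P_s[c]                          # S is evidence: weight by P(+s | c)
    r = random.random() < P_r[c]         # sample Rain given Cloudy
    w *= P_w[(True, r)]                  # W is evidence: weight by P(+w | +s, r)
    return c, w

def estimate_P_c_given_evidence(n=100_000):
    num = den = 0.0
    for _ in range(n):
        c, w = lw_sample()
        den += w                         # total sample weight
        if c:
            num += w                     # weights that match the query value +c
    return num / den                     # exact answer: 0.0486 / 0.2781, about 0.175

print(estimate_P_c_given_evidence())
```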

8 Likelihood Weighting  Likelihood weighting is good  Takes evidence into account as we generate the sample  At right: W’s value will get picked based on the evidence values of S, R  More samples will reflect the state of the world suggested by the evidence  Likelihood weighting doesn’t solve all our problems  Evidence influences the choice of downstream variables, but not upstream ones (C isn’t more likely to get a value matching the evidence)  We would like to consider evidence when we sample every variable (figure: the C, S, R, W network)

9 Markov Chain Monte Carlo*  Idea: instead of sampling from scratch, create samples that are each like the last one.  Procedure: resample one variable at a time, conditioned on all the rest, but keep evidence fixed. E.g., for P(b|c):  Properties: Now samples are not independent (in fact they’re nearly identical), but sample averages are still consistent estimators!  What’s the point: both upstream and downstream variables condition on evidence. (figure: a chain of samples, each differing from the last in one variable, with evidence +c held fixed)

10 Gibbs Sampling
1. Set all evidence E1,...,Em to e1,...,em
2. Do forward sampling to obtain x1,...,xn
3. Repeat:
   a. Pick any variable Xi uniformly at random.
   b. Resample xi’ from P(Xi | MarkovBlanket(Xi))
   c. Set all other xj’ = xj
   d. The new sample is x1’,...,xn’
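
The loop above, specialized to the Cloudy/Sprinkler/Rain/WetGrass network with evidence S=+s and W=+w (a sketch with our own function names; each resampling step conditions only on the variable's Markov blanket):

```python
import random

# CPTs of the network from the earlier slide; evidence is S=+s, W=+w.
P_c0 = 0.5
P_r = {True: 0.8, False: 0.2}            # P(+r | C)
P_s = {True: 0.1, False: 0.5}            # P(+s | C)
P_w = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.01}  # P(+w | S, R)

def resample_c(r):
    # P(C | +s, r) is proportional to P(C) P(+s | C) P(r | C)  (Markov blanket of C)
    num = {c: P_c0 * P_s[c] * (P_r[c] if r else 1 - P_r[c]) for c in (True, False)}
    return random.random() < num[True] / (num[True] + num[False])

def resample_r(c):
    # P(R | c, +s, +w) is proportional to P(R | c) P(+w | +s, R)  (Markov blanket of R)
    num = {r: (P_r[c] if r else 1 - P_r[c]) * P_w[(True, r)] for r in (True, False)}
    return random.random() < num[True] / (num[True] + num[False])

def gibbs_estimate(n=50_000, burn=1_000):
    c, r = True, True                     # arbitrary initial state
    count = total = 0
    for t in range(n + burn):
        if random.random() < 0.5:         # pick a non-evidence variable at random
            c = resample_c(r)
        else:
            r = resample_r(c)
        if t >= burn:
            total += 1
            count += c
    return count / total                  # estimate of P(+c | +s, +w)

print(gibbs_estimate())
```

The chain's stationary distribution is the posterior, so the long-run fraction of +c states approaches the same ~0.175 that likelihood weighting estimates.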

11 Markov Blanket

Markov blanket of X:
1. All parents of X
2. All children of X
3. All parents of children of X (except X itself)

X is conditionally independent of all other variables in the BN, given all variables in its Markov blanket.

12 MCMC algorithm

13 The Markov Chain

14 Summary  Sampling can be your salvation  Dominant approach to inference in BNs  Pros/Cons:  Forward Sampling  Rejection Sampling  Likelihood Weighted Sampling  Gibbs Sampling/MCMC

15 Google’s PageRank
http://en.wikipedia.org/wiki/Markov_chain
Page, Lawrence; Brin, Sergey; Motwani, Rajeev and Winograd, Terry (1999). The PageRank citation ranking: Bringing order to the Web.
See also: J. Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.

16 Google’s PageRank  Graph of the Internet (pages A-J and links)

17 Google’s PageRank  Start at a random page, take a random walk. Where do we end up?

18 Google’s PageRank  Add 15% probability of moving to a random page. Now where do we end up?

19 Google’s PageRank  PageRank(P) = probability that a long random walk ends at node P
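
That definition can be simulated directly. The link graph below is a made-up stand-in (the actual edges in the slide's figure aren't recoverable from the transcript):

```python
import random

# Hypothetical toy link graph: page -> list of pages it links to.
links = {
    'A': ['B', 'C'], 'B': ['C'], 'C': ['A'],
    'D': ['C'],      'E': ['C', 'D'],
}
pages = list(links)

def random_walk_pagerank(steps=200_000, damping=0.85):
    """Estimate PageRank as visit frequencies of a long random walk."""
    counts = {p: 0 for p in pages}
    page = random.choice(pages)
    for _ in range(steps):
        if random.random() < damping and links[page]:
            page = random.choice(links[page])   # follow a random outgoing link
        else:
            page = random.choice(pages)         # 15%: jump to a random page
        counts[page] += 1
    return {p: c / steps for p, c in counts.items()}

ranks = random_walk_pagerank()
print(ranks)
```

Here 'C', which is linked to by almost everyone, should come out with the largest visit frequency.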

21 Learning

22 Where do the CPTs come from?  How do you build a Bayes Net to begin with?  Two scenarios:  Complete data  Incomplete data (touched on only briefly; you are not responsible for this)

23 Learning from complete data

 Observe data from real world:
 Estimate probabilities by counting:
 (Just like rejection sampling, except we observe the real world.)
 (Indicator function: I(a,b) = 1 if and only if a = b.)

Cloudy | Rainy | Sprinkler | WetGrass
  0        0        1          1
  1        1        0          1
  1        0        0          0
  0        0        1          1
  0        0        1          1
  0        0        1          1
  1        1        0          1
  0        0        1          1
  1        0        0          0
  1        1        0          1
  0        0        1          1
  1        1        0          1
  0        0        1          1
  1        1        0          1
  1        1        0          1
  0        1        0          1
  0        0        1          0
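
The counting estimates for this table can be sketched as follows (the rows are the observations on the slide; the helper names are ours):

```python
# Rows of (Cloudy, Rainy, Sprinkler, WetGrass) observations from the slide.
data = [
    (0,0,1,1),(1,1,0,1),(1,0,0,0),(0,0,1,1),(0,0,1,1),(0,0,1,1),
    (1,1,0,1),(0,0,1,1),(1,0,0,0),(1,1,0,1),(0,0,1,1),(1,1,0,1),
    (0,0,1,1),(1,1,0,1),(1,1,0,1),(0,1,0,1),(0,0,1,0),
]

def p_hat(rows, idx, value):
    """Maximum-likelihood estimate of P(X_idx = value) by counting."""
    return sum(r[idx] == value for r in rows) / len(rows)

def p_hat_cond(rows, idx, value, cond_idx, cond_value):
    """Estimate P(X_idx = value | X_cond_idx = cond_value) by filtered counting."""
    matching = [r for r in rows if r[cond_idx] == cond_value]
    return sum(r[idx] == value for r in matching) / len(matching)

print(p_hat(data, 0, 1))             # P(+c): 8 of 17 rows are cloudy
print(p_hat_cond(data, 1, 1, 0, 1))  # P(+r | +c): 6 of the 8 cloudy rows are rainy
```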

24 Learning from incomplete data

 Observe real-world data
 Some variables are unobserved
 Trick: Fill them in
 EM algorithm
 E = Expectation
 M = Maximization

Cloudy | Rainy | Sprinkler | WetGrass
  1        ?        ?          0
  0        ?        ?          1
  1        ?        ?          1
  1        ?        ?          0
  1        ?        ?          1
  1        ?        ?          1
  0        ?        ?          1
  1        ?        ?          0
  1        ?        ?          0
  0        ?        ?          1
  0        ?        ?          0
  0        ?        ?          1
  1        ?        ?          0
  1        ?        ?          1
  1        ?        ?          1
  1        ?        ?          0
  1        ?        ?          1
  0        ?        ?          1
  1        ?        ?          0
  1        ?        ?          0

25 EM-Algorithm  Initialize CPTs (e.g. randomly, uniformly)  Repeat until convergence  E-Step: Compute for all i and x (inference)  M-Step: Update CPT tables  (Just like likelihood-weighted sampling, but on real-world data.)
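
The slides leave the update equations to the lecture, so here is a toy one-parameter instance we made up to show the E/M alternation: Cloudy is hidden, WetGrass is observed, the sensor model P(+w | C) is assumed known, and EM learns theta = P(+c):

```python
# Assumed (hypothetical) sensor model P(+w | Cloudy).
P_w_given_c = {True: 0.9, False: 0.2}

def em(observations, theta=0.5, iters=50):
    """EM for theta = P(+c) from observed WetGrass values (1 = +w, 0 = -w)."""
    for _ in range(iters):
        # E-step: posterior responsibility P(+c | w_i) under current theta
        gammas = []
        for w in observations:
            lt = theta * (P_w_given_c[True] if w else 1 - P_w_given_c[True])
            lf = (1 - theta) * (P_w_given_c[False] if w else 1 - P_w_given_c[False])
            gammas.append(lt / (lt + lf))
        # M-step: re-estimate theta as the average "soft count" of +c
        theta = sum(gammas) / len(gammas)
    return theta

data = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]   # hypothetical observed WetGrass values
print(em(data))
```

For this model the fixed point is the theta at which 0.9·theta + 0.2·(1 - theta) equals the empirical frequency of +w (0.7 here), i.e. theta = 5/7, and each iteration provably increases the data log-likelihood.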

26 EM-Algorithm  Properties:  No learning rate  Monotonic convergence  Each iteration improves log-likelihood of data  Outcome depends on initialization  Can be slow for large Bayes Nets...

27 Decision Networks

28 Decision Networks  MEU: choose the action which maximizes the expected utility given the evidence  Can directly operationalize this with decision networks  Bayes nets with nodes for utility and actions  Lets us calculate the expected utility for each action  New node types:  Chance nodes (just like BNs)  Actions (rectangles, cannot have parents, act as observed evidence)  Utility node (diamond, depends on action and chance nodes) (figure: Weather, Forecast, Umbrella network with utility node U)

29 Decision Networks  Action selection:  Instantiate all evidence  Set action node(s) each possible way  Calculate posterior for all parents of utility node, given the evidence  Calculate expected utility for each action  Choose maximizing action (figure: Weather, Forecast, Umbrella network with utility node U)

30 Example: Decision Networks

P(W):  sun 0.7 | rain 0.3

U(A, W):
  leave, sun: 100
  leave, rain: 0
  take, sun: 20
  take, rain: 70

EU(Umbrella = leave) = 0.7·100 + 0.3·0 = 70
EU(Umbrella = take) = 0.7·20 + 0.3·70 = 35
Optimal decision = leave
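
The expected-utility computation on this slide, as a short sketch (function names are ours):

```python
# Prior over weather and the utility table from the slide.
P_W = {'sun': 0.7, 'rain': 0.3}
U = {('leave', 'sun'): 100, ('leave', 'rain'): 0,
     ('take', 'sun'): 20,  ('take', 'rain'): 70}

def expected_utility(action, P_W):
    """EU(action) = sum over weather of P(w) * U(action, w)."""
    return sum(P_W[w] * U[(action, w)] for w in P_W)

def meu(P_W):
    """Return (best action, its expected utility)."""
    eus = {a: expected_utility(a, P_W) for a in ('leave', 'take')}
    best = max(eus, key=eus.get)
    return best, eus[best]

print(meu(P_W))
```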

31 Decisions as Outcome Trees  Almost exactly like expectimax / MDPs (figure: outcome tree rooted at {}, branching on take/leave, then on Weather sun/rain, with leaves U(t,s), U(t,r), U(l,s), U(l,r))

32 Evidence in Decision Networks

 Find P(W | F=bad)
 Select for evidence
 First we join P(W) and P(bad | W)
 Then we normalize

P(W):          sun 0.7  | rain 0.3
P(F | sun):    good 0.8 | bad 0.2
P(F | rain):   good 0.1 | bad 0.9

P(F=bad | W):  sun 0.2  | rain 0.9
P(W, F=bad):   sun 0.14 | rain 0.27
P(W | F=bad):  sun 0.34 | rain 0.66
(figure: Weather, Forecast, Umbrella network with utility node U)
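
The join-then-normalize step in code (variable names are ours):

```python
# Prior and forecast model from the slide.
P_W = {'sun': 0.7, 'rain': 0.3}
P_bad_given_W = {'sun': 0.2, 'rain': 0.9}

# Join: P(W, F=bad) = P(W) * P(F=bad | W)   -> roughly sun 0.14, rain 0.27
joint = {w: P_W[w] * P_bad_given_W[w] for w in P_W}

# Normalize: divide by the total mass P(F=bad) = 0.41
z = sum(joint.values())
posterior = {w: p / z for w, p in joint.items()}

print(posterior)   # sun about 0.34, rain about 0.66
```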

33 Example: Decision Networks (Forecast = bad)

U(A, W):
  leave, sun: 100
  leave, rain: 0
  take, sun: 20
  take, rain: 70

P(W | F=bad): sun 0.34 | rain 0.66

EU(Umbrella = leave) = 0.34·100 + 0.66·0 = 34
EU(Umbrella = take) = 0.34·20 + 0.66·70 = 53
Optimal decision = take

34 Value of Information

 Idea: compute value of acquiring evidence
 Can be done directly from decision network
 Example: buying oil drilling rights
 Two blocks A and B, exactly one has oil, worth k
 You can drill in one location
 Prior probabilities 0.5 each, & mutually exclusive
 Drilling in either A or B has MEU = k/2
 Question: what’s the value of information?
 Value of knowing which of A or B has oil
 Value is expected gain in MEU from new info
 Survey may say “oil in a” or “oil in b,” prob 0.5 each
 If we know OilLoc, MEU is k (either way)
 Gain in MEU from knowing OilLoc?
 VPI(OilLoc) = k/2
 Fair price of information: k/2

U(DrillLoc, OilLoc):
  a, a: k    a, b: 0
  b, a: 0    b, b: k
P(OilLoc): a 1/2 | b 1/2
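
The oil example as code (k is a placeholder value; the slide leaves it symbolic):

```python
k = 1000                                  # hypothetical worth of the oil block
P_oil = {'a': 0.5, 'b': 0.5}

def utility(drill, oil):
    """U(DrillLoc, OilLoc): k if we drill where the oil is, else 0."""
    return k if drill == oil else 0

# Act now: pick the best drill site under the prior -> k/2 either way.
meu_now = max(sum(P_oil[o] * utility(d, o) for o in P_oil) for d in ('a', 'b'))

# Learn OilLoc first, then drill in the right place -> k for sure.
meu_informed = sum(P_oil[o] * max(utility(d, o) for d in ('a', 'b')) for o in P_oil)

vpi = meu_informed - meu_now              # = k/2, the fair price of the survey
print(meu_now, meu_informed, vpi)
```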

35 Value of Information

 Assume we have evidence E=e. Value if we act now:
   MEU(e) = max_a Σ_s P(s | e) U(s, a)
 Assume we see that E’=e’. Value if we act then:
   MEU(e, e’) = max_a Σ_s P(s | e, e’) U(s, a)
 BUT E’ is a random variable whose value is unknown, so we don’t know what e’ will be
 Expected value if E’ is revealed and then we act:
   MEU(e, E’) = Σ_{e’} P(e’ | e) MEU(e, e’)
 Value of information: how much MEU goes up by revealing E’ first:
   VPI(E’ | e) = MEU(e, E’) - MEU(e)
VPI = “value of perfect information”

36 VPI Example: Weather

U(A, W):
  leave, sun: 100
  leave, rain: 0
  take, sun: 20
  take, rain: 70

Forecast distribution P(F): good 0.59 | bad 0.41

MEU with no evidence = 70 (leave)
MEU if forecast is bad ≈ 53 (take)
MEU if forecast is good ≈ 95 (leave)
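
Putting the VPI definition and the numbers on this slide together (a sketch; variable names are ours):

```python
# Prior, forecast model, and utilities from the slides.
P_W = {'sun': 0.7, 'rain': 0.3}
P_F_given_W = {('good', 'sun'): 0.8, ('good', 'rain'): 0.1,
               ('bad', 'sun'): 0.2,  ('bad', 'rain'): 0.9}
U = {('leave', 'sun'): 100, ('leave', 'rain'): 0,
     ('take', 'sun'): 20,  ('take', 'rain'): 70}
actions = ('leave', 'take')

# MEU with no evidence: best action under the prior.
meu_now = max(sum(P_W[w] * U[(a, w)] for w in P_W) for a in actions)

# Expected MEU after seeing the forecast:
# sum over f of max_a sum_w P(w, f) U(a, w)  (joint form avoids normalizing)
meu_forecast = sum(
    max(sum(P_W[w] * P_F_given_W[(f, w)] * U[(a, w)] for w in P_W)
        for a in actions)
    for f in ('good', 'bad'))

vpi = meu_forecast - meu_now   # how much the forecast is worth
print(meu_now, meu_forecast, vpi)
```

With these numbers the forecast raises the expected MEU from 70 to 77.7, so VPI(Forecast) = 7.7: you should pay at most 7.7 utility for the forecast.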

37 VPI Properties  Nonnegative  Nonadditive (consider, e.g., obtaining Ej twice)  Order-independent

38 Quick VPI Questions  The soup of the day is either clam chowder or split pea, but you wouldn’t order either one. What’s the value of knowing which it is?  There are two kinds of plastic forks at a picnic. It must be that one is slightly better. What’s the value of knowing which?  You have $10 to bet double-or-nothing and there is a 75% chance that the Rams will beat the 49ers. What’s the value of knowing the outcome in advance?  You must bet on the Rams, either way. What’s the value now?

39 That’s all  Next Lecture: Hidden Markov Models!  (You gonna love it!!)  Reminder: next lecture is a week from today (on Wednesday). Go to the talk by Sanmay Das, Lopata 101, Friday 11-12.

