Lecture 29: Conditional Independence, Bayesian Networks Intro
Ch 6.3, 6.3.1, 6.5, 6.5.1, 6.5.2
Announcement
Assignment 4 will be out on Wed., due Wed. April 8.
Final is on April 17. Similar format to the midterm (a mix of short conceptual questions from the posted list and problem-solving questions).
Remember that you have to pass the final in order to pass the course.
Lecture Overview
Recap of lecture 28
More on Conditional Independence
Bayesian Networks Introduction
Chain Rule
Allows representing a Joint Probability Distribution (JPD) as the product of conditional probability distributions:
P(X 1, …, X n ) = P(X 1 ) P(X 2 | X 1 ) … P(X n | X 1, …, X n-1 ) = ∏ n i=1 P(X i | X 1, …, X i-1 )
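The chain rule can be checked numerically. This is a minimal sketch with an arbitrary (hypothetical) joint over three Boolean variables, confirming that any JPD satisfies P(x,y,z) = P(x) P(y|x) P(z|x,y):

```python
from itertools import product

# Toy check of the chain rule on three Boolean variables X, Y, Z.
# The joint below is arbitrary (made-up weights), normalized to sum to 1.
raw = {v: i + 1 for i, v in enumerate(product([True, False], repeat=3))}
jpd = {v: w / sum(raw.values()) for v, w in raw.items()}

def marg(x=None, y=None):
    """Marginal P(x) or P(x, y), summing out the unspecified variables."""
    return sum(p for (vx, vy, vz), p in jpd.items()
               if (x is None or vx == x) and (y is None or vy == y))

for (x, y, z), p_xyz in jpd.items():
    # P(x) * P(y|x) * P(z|x,y), each conditional computed from the joint
    chain = marg(x) * (marg(x, y) / marg(x)) * (p_xyz / marg(x, y))
    assert abs(chain - p_xyz) < 1e-12   # chain rule holds for every assignment
```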
Marginal Independence
Intuitively: if X ╨ Y, then learning that Y=y does not change your belief in X, and this is true for all values y that Y could take.
For example, the weather is marginally independent of the result of a coin toss.
Exploiting marginal independence
Recall the product rule: p(X=x ∧ Y=y) = p(X=x | Y=y) × p(Y=y)
If X and Y are marginally independent, p(X=x | Y=y) = p(X=x)
Thus we have p(X=x ∧ Y=y) = p(X=x) × p(Y=y)
In distribution form: p(X,Y) = p(X) × p(Y)
Exploiting marginal independence
If n Boolean variables are all marginally independent, their JPD factors as p(X 1, …,X n ) = p(X 1 ) × … × p(X n ): specifying it takes n numbers instead of the 2 n − 1 required for the full JPD.
Exponentially fewer than the JPD!
Conditional Independence
Intuitively: if X and Y are conditionally independent given Z, then learning that Y=y does not change your belief in X when we already know Z=z, and this is true for all values y that Y could take and all values z that Z could take.
Example of Conditional Independence
Whether light l 1 is lit (Lit-l 1 ) and the position of switch s 2 (Up-s 2 ) are not marginally independent: the position of the switch determines whether there is power in the wire w 0 connected to the light.
However, Lit-l 1 is conditionally independent of Up-s 2 given whether there is power in wire w 0 (Power-w 0 ).
Once we know Power-w 0, learning a value for Up-s 2 does not change our beliefs about Lit-l 1.
Lecture Overview
Recap of lecture 28
More on Conditional Independence
Bayesian Networks Introduction
Another example of conditionally but not marginally independent variables
ExamGrade and AssignmentGrade are not marginally independent: students who do well on one typically do well on the other, and vice versa.
But conditional on UnderstoodMaterial, they are independent.
UnderstoodMaterial is a common cause of ExamGrade and AssignmentGrade: knowing UnderstoodMaterial shields ExamGrade from any information we could get from AssignmentGrade (and vice versa).
Example: marginally but not conditionally independent
Two variables can be marginally but not conditionally independent:
“Smoking At Sensor” S: a resident smokes a cigarette next to the fire sensor
“Fire” F: there is a fire somewhere in the building
“Alarm” A: the fire alarm rings
S and F are marginally independent: learning S=true or S=false does not change your belief in F.
But they are not conditionally independent given Alarm: e.g., if the alarm rings and you learn S=true, your belief in F decreases.
Conditional vs. Marginal Independence
Two variables can be:
Conditionally but not marginally independent: ExamGrade and AssignmentGrade are dependent, but independent given UnderstoodMaterial; Lit-l 1 and Up-s 2 are dependent, but independent given Power-w 0
Marginally but not conditionally independent: SmokingAtSensor and Fire are independent, but dependent given Alarm
Both marginally and conditionally independent: CanucksWinStanleyCup and Lit-l 1, both with and without Power-w 0 given
Neither marginally nor conditionally independent: Temperature and Cloudiness, both with and without Wind given
Exploiting Conditional Independence
Example 1: Boolean variables A, B, C.
C is conditionally independent of A given B
We can then rewrite P(C | A,B) as P(C | B)
Exploiting Conditional Independence
Example 2: Boolean variables A, B, C, D.
D is conditionally independent of A given C
D is conditionally independent of B given C
We can then rewrite P(D | A,B,C) as P(D | B,C), and can further rewrite P(D | B,C) as P(D | C)
Exploiting Conditional Independence
The chain rule gives P(A,B,C,D) = P(A) P(B | A) P(C | A,B) P(D | A,B,C).
If, for instance, D is conditionally independent of A and B given C, we can rewrite the above as P(A) P(B | A) P(C | A,B) P(D | C).
Under independence we gain compactness: the chain rule allows us to write the JPD as a product of conditional distributions, and conditional independence allows us to write them compactly.
Bayesian (or Belief) Networks
Bayesian networks and their extensions are Representation & Reasoning systems explicitly defined to exploit independence in probabilistic reasoning.
Lecture Overview
Recap of lecture 28
More on Conditional Independence
Bayesian Networks Introduction
Bayesian Network Motivation
We want a representation and reasoning system that is based on conditional (and marginal) independence:
Compact yet expressive representation
Efficient reasoning procedures
Bayesian (Belief) Networks are such a representation.
Named after Thomas Bayes (ca. 1702–1761); the term was coined in 1985 by Judea Pearl (1936–).
Their invention changed the primary focus of AI from logic to probability!
In 2012 Pearl received the very prestigious ACM Turing Award for his contributions to Artificial Intelligence.
Bayesian Networks: Intuition
A graphical representation for a joint probability distribution:
Nodes are random variables
Directed edges between nodes reflect dependence
Some informal examples: the UnderstoodMaterial / AssignmentGrade / ExamGrade, SmokingAtSensor / Fire / Alarm, and Power-w 0 / Lit-l 1 / Up-s 2 networks from the earlier slides.
Belief (or Bayesian) networks
Def. A Belief network consists of:
a directed, acyclic graph (DAG) where each node is associated with a random variable X i
a domain for each variable X i
a set of conditional probability distributions, one for each node X i given its parents Pa(X i ) in the graph: P(X i | Pa(X i ))
The parents Pa(X i ) of a variable X i are those variables X i directly depends on.
A Bayesian network is a compact representation of the JPD for a set of variables (X 1, …,X n ):
P(X 1, …,X n ) = ∏ n i=1 P(X i | Pa(X i ))
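The definition can be sketched directly in code. This is a minimal, hypothetical two-node network (names and numbers invented for illustration), where the joint probability of a full assignment is computed as the product of each node's CPT entry given its parents:

```python
# Each variable maps to (list of parents, CPT); the CPT gives P(X=True | parents),
# keyed by the tuple of parent values. All names and numbers are made up.
network = {
    "A": ([], {(): 0.2}),                        # P(A=True)
    "B": (["A"], {(True,): 0.9, (False,): 0.3}), # P(B=True | A)
}

def joint_prob(assignment, net):
    """P(assignment) = product over i of P(X_i = x_i | Pa(X_i))."""
    prob = 1.0
    for var, (parents, cpt) in net.items():
        p_true = cpt[tuple(assignment[p] for p in parents)]
        prob *= p_true if assignment[var] else 1 - p_true
    return prob

# The implied joint distribution sums to 1 over all assignments.
total = sum(joint_prob({"A": a, "B": b}, network)
            for a in (True, False) for b in (True, False))
assert abs(total - 1.0) < 1e-12
```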
Bayesian Networks: Definition
Def. A Belief network consists of:
a directed, acyclic graph (DAG) where each node is associated with a random variable X i
a domain for each variable X i
a set of conditional probability distributions for each node X i given its parents Pa(X i ) in the graph: P(X i | Pa(X i ))
Discrete Bayesian networks: the domain of each variable is finite, and each conditional probability distribution is a conditional probability table.
We will assume this discrete case, but everything we say about independence (marginal & conditional) carries over to the continuous case.
How to build a Bayesian network
1. Define a total order over the random variables: (X 1, …,X n )
2. Apply the chain rule: P(X 1, …,X n ) = ∏ n i=1 P(X i | X 1, …,X i-1 )
3. For each X i, select the smallest set of predecessors Pa(X i ) (predecessors of X i in the total order defined over the variables) such that P(X i | X 1, …,X i-1 ) = P(X i | Pa(X i )), i.e., X i is conditionally independent of all its other predecessors given Pa(X i )
4. Then we can rewrite P(X 1, …,X n ) = ∏ n i=1 P(X i | Pa(X i ))
This is a compact representation of the initial JPD: a factorization of the JPD based on existing conditional independencies among the variables.
How to build a Bayesian network (cont’d)
5. Construct the Bayesian Net (BN):
Nodes are the random variables
Draw a directed arc from each variable in Pa(X i ) to X i
Define a conditional probability table (CPT) for each variable X i : P(X i | Pa(X i ))
Example for BN construction: Fire Diagnosis
You want to diagnose whether there is a fire in a building:
You can receive reports (possibly noisy) about whether everyone is leaving the building
If everyone is leaving, this may have been caused by a fire alarm
If there is a fire alarm, it may have been caused by a fire or by tampering
If there is a fire, there may be smoke
Start by choosing the random variables for this domain; here all are Boolean:
Tampering (T) is true when the alarm has been tampered with
Fire (F) is true when there is a fire
Alarm (A) is true when there is an alarm
Smoke (S) is true when there is smoke
Leaving (L) is true if there are lots of people leaving the building
Report (R) is true if the sensor reports that lots of people are leaving the building
Next, apply the procedure described earlier.
Example for BN construction: Fire Diagnosis
1. Define a total ordering of variables. Let’s choose an order that follows the causal sequence of events: Fire (F), Tampering (T), Alarm (A), Smoke (S), Leaving (L), Report (R)
2. Apply the chain rule: P(F,T,A,S,L,R) = P(F) P(T | F) P(A | F,T) P(S | F,T,A) P(L | F,T,A,S) P(R | F,T,A,S,L)
We will do steps 3, 4 and 5 together, for each element P(X i | X 1, …,X i-1 ) of the factorization:
3. For each variable X i, choose the parents Parents(X i ) by evaluating conditional independencies, so that P(X i | X 1, …,X i-1 ) = P(X i | Parents(X i ))
4. Rewrite P(X 1, …,X n ) = ∏ n i=1 P(X i | Parents(X i ))
5. Construct the Bayesian network
Fire Diagnosis Example
P(F) P(T | F) P(A | F,T) P(S | F,T,A) P(L | F,T,A,S) P(R | F,T,A,S,L)
Fire (F) is the first variable in the ordering, X 1. It does not have parents.
Fire Diagnosis Example
P(F) P(T | F) P(A | F,T) P(S | F,T,A) P(L | F,T,A,S) P(R | F,T,A,S,L)
Tampering (T) is independent of Fire (learning that one is true would not change your beliefs about the probability of the other), so P(T | F) simplifies to P(T).
Fire Diagnosis Example
P(F) P(T) P(A | F,T) P(S | F,T,A) P(L | F,T,A,S) P(R | F,T,A,S,L)
Alarm (A) depends on both Fire and Tampering: it could be caused by either or both.
Fire Diagnosis Example
P(F) P(T) P(A | F,T) P(S | F,T,A) P(L | F,T,A,S) P(R | F,T,A,S,L)
Smoke (S) is caused by Fire, and so is independent of Tampering and Alarm given whether there is a Fire: P(S | F,T,A) simplifies to P(S | F).
Fire Diagnosis Example
P(F) P(T) P(A | F,T) P(S | F) P(L | F,T,A,S) P(R | F,T,A,S,L)
Leaving (L) is caused by Alarm, and thus is independent of the other variables given Alarm: P(L | F,T,A,S) simplifies to P(L | A).
Fire Diagnosis Example
P(F) P(T) P(A | F,T) P(S | F) P(L | A) P(R | F,T,A,S,L)
Report (R) is caused by Leaving, and thus is independent of the other variables given Leaving: P(R | F,T,A,S,L) simplifies to P(R | L).
Fire Diagnosis Example
P(F) P(T) P(A | F,T) P(S | F) P(L | A) P(R | L)
The result is the Bayesian network above, and its corresponding, very compact factorization of the original JPD:
P(F,T,A,S,L,R) = P(F) P(T) P(A | F,T) P(S | F) P(L | A) P(R | L)
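The final factorization can be checked numerically. The sketch below uses made-up CPT numbers (not from the slides): it computes the joint from the six factors and confirms it sums to 1. Note the compactness: the factorization needs 1+1+4+2+2+2 = 12 numbers, versus 2^6 − 1 = 63 for the full JPD.

```python
from itertools import product

# Hypothetical CPTs for P(F,T,A,S,L,R) = P(F) P(T) P(A|F,T) P(S|F) P(L|A) P(R|L).
pF, pT = 0.01, 0.02
pA = {(True, True): 0.5, (True, False): 0.99,
      (False, True): 0.85, (False, False): 0.0001}   # P(A=True | F, T)
pS = {True: 0.9, False: 0.01}                        # P(S=True | F)
pL = {True: 0.88, False: 0.001}                      # P(L=True | A)
pR = {True: 0.75, False: 0.01}                       # P(R=True | L)

def bern(p, v):
    """P(V = v) for a Boolean variable with P(V=True) = p."""
    return p if v else 1 - p

def joint(f, t, a, s, l, r):
    return (bern(pF, f) * bern(pT, t) * bern(pA[(f, t)], a) *
            bern(pS[f], s) * bern(pL[a], l) * bern(pR[l], r))

# The product of CPTs defines a proper distribution over all 2**6 assignments.
total = sum(joint(*vals) for vals in product([True, False], repeat=6))
assert abs(total - 1.0) < 1e-9
```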
Learning Goals For Probability so far
Define and give examples of random variables, their domains and probability distributions
Calculate the probability of a proposition f given µ(w) for the set of possible worlds
Define a joint probability distribution (JPD)
Given a JPD: marginalize over specific variables, and compute distributions over any subset of the variables
Prove the formula to compute conditional probability P(h|e)
Use inference by enumeration to compute joint posterior probability distributions over any subset of variables given evidence
Derive and use Bayes Rule
Derive the Chain Rule
Define and use marginal independence
Define and use conditional independence
Build a Bayesian Network for a given domain