
1 Bayesian Networks
Based on David Heckerman's Tutorial (Microsoft Research) and Nir Friedman's course (Hebrew University). Automated Planning and Decision Making, 2007.

2 Representing State
In classical planning: at each point we know the exact state of the world, and for each action we know the precise effects.
In many single-step decision problems: there is much uncertainty about the current state and the effects of actions.
In decision-theoretic planning problems: uncertainty about the state, and uncertainty about the effects of actions.

3 So far
In MDPs/POMDPs, states had no structure.
In real life, a state represents the values of multiple variables, so the number of states is exponential in the number of variables.

4 What we need
We need a compact representation of our uncertainty about the state of the world and the effects of actions, one that we can manipulate efficiently.
Solution: Bayesian networks (BNs).
BNs are also the basis for modern expert systems.

5 Probability Distributions
Let X1,…,Xn be random variables and let P be a joint distribution over X1,…,Xn.
If the variables are binary, we need O(2^n) parameters to describe P.
Can we do better? Key idea: exploit properties of independence.
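To make the exponential blow-up concrete, here is a tiny Python sketch (my own illustration, not from the slides) of how many free parameters a full joint table over n binary variables needs:

```python
# A full joint table over n binary variables has 2**n entries, of which
# 2**n - 1 are free parameters (the last one is fixed by summing to 1).
for n in (5, 10, 20, 30):
    print(f"n = {n:2d}: {2**n - 1} parameters")
```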

6 Independent Random Variables
Two variables X and Y are independent if P(X = x | Y = y) = P(X = x) for all values x, y.
That is, learning the value of Y does not change the prediction of X.
If X and Y are independent, then P(X,Y) = P(X|Y)P(Y) = P(X)P(Y).
In general, if X1,…,Xn are mutually independent, then P(X1,…,Xn) = P(X1)...P(Xn), which requires only O(n) parameters.
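As a sketch of the O(n) representation (the marginal values and function names below are my own, purely illustrative), the joint over independent binary variables can be stored as n marginals and multiplied out on demand:

```python
from itertools import product

# For n mutually independent binary variables, the joint is the product of
# n marginals, so only O(n) numbers need to be stored.
marginals = [0.1, 0.7, 0.5, 0.3]   # hypothetical P(Xi = 1) for each variable

def joint(assignment):
    """P(X1=x1, ..., Xn=xn) as a product of marginals."""
    p = 1.0
    for x, q in zip(assignment, marginals):
        p *= q if x == 1 else (1.0 - q)
    return p

# Sanity check: the 2**n entries of the implied joint table still sum to 1.
print(sum(joint(a) for a in product([0, 1], repeat=len(marginals))))
```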

7 Conditional Independence
Unfortunately, most random variables of interest are not independent of each other.
A more suitable notion is conditional independence.
Two variables X and Y are conditionally independent given Z if P(X = x | Y = y, Z = z) = P(X = x | Z = z) for all values x, y, z.
That is, once we know the value of Z, learning the value of Y does not change the prediction of X.
Notation: Ind(X ; Y | Z).
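A quick numeric illustration (not from the slides; all probabilities are invented) of checking Ind(X ; Y | Z) on a toy joint distribution built to satisfy it:

```python
# The joint is constructed as P(x,y,z) = P(z) P(x|z) P(y|z), which makes
# X and Y conditionally independent given Z by construction.
p_z = {0: 0.6, 1: 0.4}
p_x_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # indexed [z][x]
p_y_given_z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.5, 1: 0.5}}   # indexed [z][y]

joint = {(x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
         for x in (0, 1) for y in (0, 1) for z in (0, 1)}

def p_x_given_yz(x, y, z):
    """P(X = x | Y = y, Z = z) computed from the joint."""
    return joint[(x, y, z)] / (joint[(0, y, z)] + joint[(1, y, z)])

def p_x_given_z_only(x, z):
    """P(X = x | Z = z) computed from the joint."""
    num = sum(joint[(x, y, z)] for y in (0, 1))
    den = sum(joint[(xv, y, z)] for xv in (0, 1) for y in (0, 1))
    return num / den

# Conditioning on Y changes nothing once Z is known.
for z in (0, 1):
    for y in (0, 1):
        print(z, y, round(p_x_given_yz(1, y, z), 3), round(p_x_given_z_only(1, z), 3))
```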

8 Example: Family trees (Pedigree)
A node represents an individual's genotype.
Modeling assumption: ancestors can affect descendants' genotypes only by passing genetic material through intermediate generations.
(Figure: a pedigree with individuals Homer, Marge, Bart, Lisa, and Maggie.)

9 Bayesian Network
A directed acyclic graph (DAG), annotated with probability distributions.
(Figure: a car-diagnosis network with nodes Battery, Fuel, Fuel Gauge, Engine Turns Over, and Start, annotated with p(b), p(f), p(t|b), p(s|f,t).)
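As a rough sketch of how such an annotated DAG can be written down in code (the dict encoding and most numbers are my own assumptions; only the structure follows the figure, and the Fuel Gauge node is omitted):

```python
# Each node lists its parents and a conditional probability table (CPT)
# giving P(node = True | parent assignment). Numbers for Battery, Fuel,
# and TurnOver are invented; Start's CPT matches the values given later
# on the "Local distributions" slide (Fuel=True means "not empty").
network = {
    "Battery":  {"parents": [],                   "cpt": {(): 0.98}},
    "Fuel":     {"parents": [],                   "cpt": {(): 0.95}},
    "TurnOver": {"parents": ["Battery"],          "cpt": {(True,): 0.97, (False,): 0.0}},
    "Start":    {"parents": ["TurnOver", "Fuel"], "cpt": {(True, True): 0.99,
                                                          (True, False): 0.0,
                                                          (False, True): 0.0,
                                                          (False, False): 0.0}},
}
print(network["Start"]["parents"])
```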

10 BN Structure: Definition
Missing arcs encode independencies: the joint distribution factorizes into the local distributions attached to the nodes, e.g. p(b, f, t, s) = p(b) p(f) p(t|b) p(s|f,t) for the Battery, Fuel, Engine-Turns-Over, and Start variables shown. (*)
(Figure: the same car network, with the Fuel Gauge node and the local distributions p(b), p(f), p(t|b), p(s|f,t) attached.)

11 Independence in a Bayes net
Example: the factorization (*) entails Ind(Battery ; Fuel), since marginalizing gives p(b, f) = p(b) p(f).
Many other independencies are entailed by (*); they can be read from the graph using d-separation (Pearl).

12 Markov Assumption
We now make this independence assumption precise for directed acyclic graphs (DAGs).
Each random variable X is independent of its non-descendants, given its parents Pa(X).
Formally: Ind(X ; NonDesc(X) | Pa(X)).
(Figure: a node X with parents Y1 and Y2; the remaining nodes are labeled ancestor, non-descendant, and descendant.)
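A small sketch of the ingredients of this assumption, computing parents and non-descendants for a DAG stored as a parent dict (the encoding and helper names are my own; the example graph is the alarm network from the next slide):

```python
parents = {
    "Earthquake": [], "Burglary": [],
    "Radio": ["Earthquake"],
    "Alarm": ["Earthquake", "Burglary"],
    "Call": ["Alarm"],
}

def descendants(node):
    """All nodes reachable from `node` by following arcs forward."""
    children = {n: [c for c, ps in parents.items() if n in ps] for n in parents}
    seen, stack = set(), list(children[node])
    while stack:
        d = stack.pop()
        if d not in seen:
            seen.add(d)
            stack.extend(children[d])
    return seen

def non_descendants(node):
    return set(parents) - {node} - descendants(node)

# Markov assumption for Alarm: Ind(Alarm ; Radio | Earthquake, Burglary),
# since Radio is a non-descendant and Pa(Alarm) = {Earthquake, Burglary}.
print(sorted(non_descendants("Alarm")), parents["Alarm"])
```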

13 Markov Assumption Example
In this example (Earthquake → Radio, Earthquake → Alarm, Burglary → Alarm, Alarm → Call):
Ind(E ; B), Ind(B ; E, R), Ind(R ; A, B, C | E), Ind(A ; R | B, E), Ind(C ; B, E, R | A).

14 I-Maps
A DAG G is an I-Map of a distribution P if all Markov assumptions implied by G are satisfied by P.
(Assuming G and P are over the same set of random variables.)
(Figure: two small example graphs over X and Y.)

15 Factorization
Given that G is an I-Map of P, can we simplify the representation of P?
Example (two variables X and Y with no arc between them): since Ind(X ; Y), we have P(X|Y) = P(X).
Applying the chain rule: P(X,Y) = P(X|Y) P(Y) = P(X) P(Y).
Thus we have a simpler representation of P(X,Y).

16 Factorization Theorem
Thm: if G is an I-Map of P, then P(X1,…,Xn) = P(X1 | Pa(X1)) ... P(Xn | Pa(Xn)).
Proof: By the chain rule, P(X1,…,Xn) = P(X1) P(X2 | X1) ... P(Xn | X1,…,Xn-1).
W.l.o.g. X1,…,Xn is an ordering consistent with G, so Pa(Xi) ⊆ {X1,…,Xi-1} and {X1,…,Xi-1} ⊆ NonDesc(Xi).
Since G is an I-Map, Ind(Xi ; NonDesc(Xi) | Pa(Xi)).
Hence P(Xi | X1,…,Xi-1) = P(Xi | Pa(Xi)), and substituting each factor into the chain rule gives the claimed factorization.
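The theorem is easy to mirror in code: the probability of a full assignment is the product of local CPT lookups. The sketch below uses the alarm network of the following slides with invented CPT numbers (the encoding and function names are my own):

```python
# bn maps node -> (parent list, CPT keyed by the tuple of parent values),
# where each CPT value is P(node = True | parents). All numbers are
# illustrative, not taken from the slides.
bn = {
    "Earthquake": ([], {(): 0.01}),
    "Burglary":   ([], {(): 0.02}),
    "Radio":      (["Earthquake"], {(True,): 0.9, (False,): 0.0}),
    "Alarm":      (["Earthquake", "Burglary"],
                   {(True, True): 0.98, (True, False): 0.3,
                    (False, True): 0.95, (False, False): 0.001}),
    "Call":       (["Alarm"], {(True,): 0.7, (False,): 0.01}),
}

def joint_probability(assignment):
    """P(X1,...,Xn) = product over i of P(Xi | Pa(Xi)) for a full assignment."""
    p = 1.0
    for node, (pa, cpt) in bn.items():
        p_true = cpt[tuple(assignment[q] for q in pa)]
        p *= p_true if assignment[node] else (1.0 - p_true)
    return p

print(joint_probability({"Earthquake": False, "Burglary": True,
                         "Radio": False, "Alarm": True, "Call": True}))
```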

17 Factorization Example
(Same alarm network: Earthquake → Radio, Earthquake → Alarm, Burglary → Alarm, Alarm → Call.)
Generic chain rule: P(C,A,R,E,B) = P(B) P(E|B) P(R|E,B) P(A|R,B,E) P(C|A,R,B,E)
versus the factorization along G: P(C,A,R,E,B) = P(B) P(E) P(R|E) P(A|B,E) P(C|A)
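The saving is easy to quantify for binary variables (a small illustrative calculation, not from the slides):

```python
# Each factor P(X | parents) over binary variables needs 2**|parents|
# independent numbers (one P(X = true | ...) per parent assignment).
chain_rule = [0, 1, 2, 3, 4]   # parent counts for B, E|B, R|E,B, A|R,B,E, C|A,R,B,E
factorized = [0, 0, 1, 2, 1]   # parent counts for B, E, R|E, A|B,E, C|A
print(sum(2**k for k in chain_rule))   # 31, same as a full joint table
print(sum(2**k for k in factorized))   # 10
```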

18 Consequences
We can write P in terms of "local" conditional probabilities.
If G is sparse, that is, |Pa(Xi)| < k, then each conditional probability can be specified compactly; e.g. for binary variables, each requires O(2^k) parameters.
⇒ The representation of P is compact: linear in the number of variables.

19 Conditional Independencies
Let Markov(G) be the set of Markov independencies implied by G.
The decomposition theorem above shows: G is an I-Map of P ⇒ P(X1,…,Xn) = P(X1 | Pa(X1)) ... P(Xn | Pa(Xn)).
We can also show the opposite direction.
Thm: P(X1,…,Xn) = P(X1 | Pa(X1)) ... P(Xn | Pa(Xn)) ⇒ G is an I-Map of P.

20 Proof (Outline)
(The slide sketches the proof as equations over a small example graph with nodes X, Y, and Z.)

21 Implied Independencies
Does G imply additional independencies as a consequence of Markov(G)?
We can define a logic of independence statements. We've already seen some axioms:
Symmetry: Ind(X ; Y | Z) ⇒ Ind(Y ; X | Z)
Decomposition: Ind(X ; Y1, Y2 | Z) ⇒ Ind(X ; Y1 | Z)
We can continue this list...

22 d-Separation
A procedure d-sep(X ; Y | Z, G) that, given a DAG G and sets of nodes X, Y, and Z, returns either yes or no.
Goal: d-sep(X ; Y | Z, G) = yes iff Ind(X ; Y | Z) follows from Markov(G).

23 Paths
Intuition: dependency must "flow" along paths in the graph.
A path is a sequence of neighboring variables.
Examples (in the alarm network): R ← E → A ← B, and C ← A ← E → R.

24 Path Blockage
We want to know when a path is:
active -- it can create a dependency between the end nodes;
blocked -- it cannot create a dependency between the end nodes.
We want to classify the situations in which paths are active given the evidence.

25 Path Blockage
Three cases. Case 1: common cause (e.g. R ← E → A).
Blocked when the common cause E is given as evidence; active when it is not.

26 Path Blockage
Case 2: intermediate cause (e.g. E → A → C).
Blocked when the intermediate node A is given as evidence; active when it is not.

27 Path Blockage
Case 3: common effect (e.g. E → A ← B, with C a descendant of A).
Blocked when neither A nor any of its descendants is in the evidence; active when A or one of its descendants (e.g. C) is given as evidence.

28 Path Blockage -- General Case
A path is active, given evidence Z, if:
whenever the path contains a v-structure configuration A → C ← B, the middle node or one of its descendants is in Z, and
no other node on the path is in Z.
A path is blocked, given evidence Z, if it is not active.

29 Example
d-sep(R, B) = yes.
The only path between R and B in the alarm network, R ← E → A ← B, is blocked at the v-structure A, since neither A nor its descendant C is in the evidence.

30 Example
d-sep(R, B) = yes, but d-sep(R, B | A) = no.
Observing A activates the v-structure E → A ← B, so the path R ← E → A ← B becomes active.

31 Example
d-sep(R, B) = yes, d-sep(R, B | A) = no, but d-sep(R, B | E, A) = yes.
Adding E to the evidence blocks the path R ← E → A ← B at the common cause E, even though the v-structure at A is activated.

32 d-Separation
X is d-separated from Y, given Z, if all paths from a node in X to a node in Y are blocked given Z.
d-separation checks can be done efficiently (linear time in the number of edges):
Bottom-up phase: mark all nodes that have a descendant in Z.
X-to-Y phase: traverse (BFS) the edges on paths from X to Y and check whether they are blocked.
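Below is a sketch of this two-phase check in Python (following the standard reachability formulation, e.g. Koller & Friedman's "Reachable" procedure; the names and dict-based graph encoding are my own). It reproduces the three example queries from the previous slides:

```python
from collections import deque

def d_separated(parents, xs, ys, zs):
    """True if every node in xs is d-separated from every node in ys given zs."""
    children = {n: {c for c, ps in parents.items() if n in ps} for n in parents}
    xs, ys, zs = set(xs), set(ys), set(zs)

    # Phase 1: Z together with all its ancestors (these can open v-structures).
    ancestors_of_z, stack = set(), list(zs)
    while stack:
        n = stack.pop()
        if n not in ancestors_of_z:
            ancestors_of_z.add(n)
            stack.extend(parents[n])

    # Phase 2: BFS over (node, direction) pairs along active trails from xs.
    queue = deque((x, "up") for x in xs)
    visited, reachable = set(), set()
    while queue:
        node, direction = queue.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node not in zs:
            reachable.add(node)
        if direction == "up" and node not in zs:
            queue.extend((p, "up") for p in parents[node])
            queue.extend((c, "down") for c in children[node])
        elif direction == "down":
            if node not in zs:
                queue.extend((c, "down") for c in children[node])
            if node in ancestors_of_z:   # v-structure opened by the evidence
                queue.extend((p, "up") for p in parents[node])
    return not (reachable & ys)

# The alarm network from the earlier slides:
parents = {"Earthquake": [], "Burglary": [], "Radio": ["Earthquake"],
           "Alarm": ["Earthquake", "Burglary"], "Call": ["Alarm"]}
print(d_separated(parents, {"Radio"}, {"Burglary"}, set()))                     # True
print(d_separated(parents, {"Radio"}, {"Burglary"}, {"Alarm"}))                 # False
print(d_separated(parents, {"Radio"}, {"Burglary"}, {"Earthquake", "Alarm"}))   # True
```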

33 Soundness
Theorem: if G is an I-Map of P and d-sep(X ; Y | Z, G) = yes, then P satisfies Ind(X ; Y | Z).
Informally: any independence reported by d-separation is satisfied by the underlying distribution.

34 Completeness
Theorem: if d-sep(X ; Y | Z, G) = no, then there is a distribution P such that G is an I-Map of P and P does not satisfy Ind(X ; Y | Z).
Informally: any independence not reported by d-separation might be violated by the underlying distribution; we cannot determine this by examining the graph structure alone.

35 I-Maps revisited
The fact that G is an I-Map of P might not be that useful. For example, consider complete DAGs:
A DAG G is complete if we cannot add an arc without creating a cycle.
Complete DAGs imply no independencies, so they are I-Maps of any distribution.
(Figure: two complete DAGs over X1, X2, X3, X4.)

36 Minimal I-Maps
A DAG G is a minimal I-Map of P if:
G is an I-Map of P, and
for any G' ⊂ G (G with some arcs removed), G' is not an I-Map of P.
That is, removing any arc from G introduces (conditional) independencies that do not hold in P.

37 Minimal I-Map Example
(Figure: a DAG over X1, X2, X3, X4 given as a minimal I-Map, together with four graphs obtained by dropping arcs, which are therefore not I-Maps.)

38 Bayesian Networks
A Bayesian network specifies a probability distribution via two components:
a DAG G, and
a collection of conditional probability distributions P(Xi | Pa(Xi)).
The joint distribution P is defined by the factorization P(X1,…,Xn) = P(X1 | Pa(X1)) ... P(Xn | Pa(Xn)).
Additional requirement: G is a minimal I-Map of P.

39 Local distributions
Table representation of p(Start | TurnOver, Fuel), where TurnOver T ∈ {yes, no}, Fuel F ∈ {empty, not empty}, Start S ∈ {yes, no}:
p(S=y | T=n, F=e) = 0.0
p(S=y | T=n, F=n) = 0.0
p(S=y | T=y, F=e) = 0.0
p(S=y | T=y, F=n) = 0.99
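The same table, written as a plain Python dict keyed by the parents' values (the encoding is my own; the numbers are the slide's):

```python
# P(Start = yes | TurnOver, Fuel), one entry per joint parent assignment.
p_start_yes = {
    ("no",  "empty"):     0.0,
    ("no",  "not empty"): 0.0,
    ("yes", "empty"):     0.0,
    ("yes", "not empty"): 0.99,
}
print(p_start_yes[("yes", "not empty")])   # 0.99
```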

40 Local distributions
Tree representation of the same distribution p(Start | TurnOver, Fuel):
TurnOver = no: p(start) = 0
TurnOver = yes, Fuel = empty: p(start) = 0
TurnOver = yes, Fuel = not empty: p(start) = 0.99
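A sketch of the tree-structured version in code (my own encoding; the probabilities are the slide's): instead of one table row per parent assignment, the distribution is a small decision procedure with three leaves:

```python
def p_start_yes(turnover, fuel):
    """p(Start = yes | TurnOver, Fuel) as a decision tree."""
    if turnover == "no":
        return 0.0                    # first split: engine does not turn over
    # turnover == "yes": second split on Fuel
    return 0.0 if fuel == "empty" else 0.99

print(p_start_yes("yes", "not empty"))   # 0.99
```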

41 Summary
We explored DAGs as a representation of conditional independencies:
the Markov independencies of a DAG;
the tight correspondence between Markov(G) and the factorization defined by G;
d-separation, a sound and complete procedure for computing the consequences of the independencies;
the notions of minimal I-Map and P-Map.
This theory is the basis of Bayesian networks.

