BN Semantics 1 – Graphical Models (10-708), Carlos Guestrin, Carnegie Mellon University, September 15th, 2008. Readings: K&F: 3.1, 3.2, 3.3.


Let’s start on BNs… Consider P(X_i): assign a probability to each x_i ∈ Val(X_i). How many independent parameters? Now consider P(X_1,…,X_n): how many independent parameters if |Val(X_i)| = k?

What if the variables are independent? Pairwise independence, (X_i ⊥ X_j) ∀ i, j, is not enough! (See Homework 1.) We must assume (X ⊥ Y) for all subsets X, Y of {X_1,…,X_n}. Then we can write P(X_1,…,X_n) = ∏_{i=1…n} P(X_i). How many independent parameters now?
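The two parameter counts above can be sketched as follows: a full joint over n variables with k values each needs k^n − 1 independent numbers (one per assignment, minus the sum-to-one constraint), while full independence needs only n(k − 1).

```python
# Independent parameters needed to specify a distribution over
# n variables, each taking k values.

def full_joint_params(n, k):
    # One probability per joint assignment, minus 1 (they sum to 1).
    return k ** n - 1

def fully_independent_params(n, k):
    # Each marginal P(X_i) needs k - 1 numbers; the joint is their product.
    return n * (k - 1)

# e.g. 10 binary variables:
print(full_joint_params(10, 2))         # 1023
print(fully_independent_params(10, 2))  # 10
```

The gap grows exponentially in n, which is the whole motivation for exploiting independence structure.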

Conditional parameterization – two nodes. Grade is determined by Intelligence.

Conditional parameterization – three nodes. Grade and SAT score are determined by Intelligence: (G ⊥ S | I).

The naïve Bayes model – your first real Bayes net. Class variable: C. Evidence variables: X_1,…,X_n. Assume (X ⊥ Y | C) for all subsets X, Y of {X_1,…,X_n}.
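The naïve Bayes factorization P(C, X_1,…,X_n) = P(C) ∏_i P(X_i | C) can be sketched directly; the CPT numbers below are hypothetical, chosen only for illustration.

```python
# Naive Bayes sketch with hypothetical CPT values: class C and two
# binary evidence variables X1, X2, assumed independent given C.
p_c  = {True: 0.3, False: 0.7}   # P(C)
p_x1 = {True: 0.8, False: 0.1}   # P(X1 = t | C)
p_x2 = {True: 0.6, False: 0.2}   # P(X2 = t | C)

def joint(c, x1, x2):
    # P(C, X1, X2) = P(C) * P(X1 | C) * P(X2 | C)
    px1 = p_x1[c] if x1 else 1 - p_x1[c]
    px2 = p_x2[c] if x2 else 1 - p_x2[c]
    return p_c[c] * px1 * px2

# The factorized product defines a valid joint: it sums to 1.
total = sum(joint(c, x1, x2)
            for c in (True, False)
            for x1 in (True, False)
            for x2 in (True, False))
print(round(total, 10))  # 1.0
```

Only 1 + 2n numbers are stored here (for binary variables) instead of 2^(n+1) − 1 for the full joint.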

What you need to know (from last class): basic definitions of probabilities, independence, conditional independence, the chain rule, Bayes’ rule, and naïve Bayes.

This class: We’ve heard of Bayes nets, we’ve played with Bayes nets, we’ve even used them in our research. In this class we’ll learn the semantics of BNs and relate them to the independence assumptions encoded by the graph.

Causal structure. Suppose we know the following: the flu causes sinus inflammation; allergies cause sinus inflammation; sinus inflammation causes a runny nose; sinus inflammation causes headaches. How are these connected?

Possible queries on the Flu/Allergy/Sinus/Headache/Nose network: inference, most probable explanation, active data collection.

Car Starts BN: 18 binary attributes. Inference: P(BatteryAge | Starts = f) involves 2^18 terms – why so fast? Not impressed? The HailFinder BN has more than 3^54 terms.

Factored joint distribution – preview. [Network: Flu → Sinus ← Allergy; Sinus → Headache; Sinus → Nose]

Number of parameters. [Network: Flu → Sinus ← Allergy; Sinus → Headache; Sinus → Nose]
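The parameter count for this network can be sketched as follows: each binary node needs one number per configuration of its parents, so Flu and Allergy contribute 1 each, Sinus (two parents) contributes 4, and Headache and Nose (one parent each) contribute 2 each.

```python
# CPT parameters for the binary Flu/Allergy/Sinus/Headache/Nose net.
parents = {
    "Flu": [], "Allergy": [],
    "Sinus": ["Flu", "Allergy"],
    "Headache": ["Sinus"], "Nose": ["Sinus"],
}

# Each binary variable needs 2^(#parents) numbers (one per parent config).
bn_params = sum(2 ** len(pa) for pa in parents.values())
full_joint = 2 ** len(parents) - 1
print(bn_params, full_joint)  # 10 31
```

Ten numbers instead of thirty-one; the savings become dramatic as networks grow.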

Key: independence assumptions. In the Flu/Allergy/Sinus/Headache/Nose network, knowing Sinus separates the symptom variables from each other.

(Marginal) independence. Flu and Allergy are (marginally) independent: P(Flu, Allergy) = P(Flu) P(Allergy). More generally, X and Y are (marginally) independent, (X ⊥ Y), if P(X, Y) = P(X) P(Y). [Tables of P(Flu, Allergy) entries over Flu, Allergy ∈ {t, f}]

Conditional independence. Flu and Headache are not (marginally) independent, but Flu and Headache are independent given Sinus infection. More generally, (X ⊥ Y | Z) if P(X, Y | Z) = P(X | Z) P(Y | Z).

The independence assumption. [Network: Flu → Sinus ← Allergy; Sinus → Headache; Sinus → Nose] Local Markov Assumption: a variable X_i is independent of its non-descendants given its parents, and only its parents: (X_i ⊥ NonDescendants_{X_i} | Pa_{X_i}).

Explaining away. In the network, Flu and Allergy are marginally independent, but observing Sinus makes them dependent: evidence for one cause “explains away” the other. Note that the Local Markov Assumption, (X_i ⊥ NonDescendants_{X_i} | Pa_{X_i}), makes no independence claim here, since Sinus is a descendant of both Flu and Allergy.

What about probabilities? Conditional probability tables (CPTs), one per node of the Flu/Allergy/Sinus/Headache/Nose network.

Joint distribution. For this network, P(F, A, S, H, N) = P(F) P(A) P(S | F, A) P(H | S) P(N | S). Why can we decompose? The Local Markov Assumption!
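The decomposition can be sketched numerically; the CPT values below are hypothetical, chosen only to show that the product of CPTs defines a valid joint distribution.

```python
from itertools import product

# Hypothetical CPTs (probability of "true") for the five-node net.
p_flu, p_allergy = 0.1, 0.2
p_sinus = {(True, True): 0.9, (True, False): 0.7,
           (False, True): 0.6, (False, False): 0.05}  # P(S=t | F, A)
p_head = {True: 0.8, False: 0.1}                      # P(H=t | S)
p_nose = {True: 0.7, False: 0.05}                     # P(N=t | S)

def b(p, val):
    # P(V = val) when P(V = true) = p.
    return p if val else 1 - p

def joint(f, a, s, h, n):
    # P(F,A,S,H,N) = P(F) P(A) P(S|F,A) P(H|S) P(N|S)
    return (b(p_flu, f) * b(p_allergy, a) *
            b(p_sinus[(f, a)], s) * b(p_head[s], h) * b(p_nose[s], n))

# The product of the local CPTs sums to 1 over all 2^5 assignments.
total = sum(joint(*v) for v in product((True, False), repeat=5))
print(round(total, 10))  # 1.0
```

Any query, e.g. P(Flu = t | Nose = t), can then be answered by summing this product over the unobserved variables.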

A general Bayes net: a set of random variables; a directed acyclic graph; CPTs. Joint distribution: P(X_1,…,X_n) = ∏_i P(X_i | Pa_{X_i}). Local Markov Assumption: a variable X_i is independent of its non-descendants given its parents, and only its parents: (X_i ⊥ NonDescendants_{X_i} | Pa_{X_i}).

Announcements. Homework 1: out Wednesday; due in 2 weeks, at the beginning of class! It’s hard – start early, ask questions. Collaboration policy: OK to discuss in groups; tell us on your paper who you talked with; each person must write their own unique paper; no searching the web, papers, etc. for answers – we trust you want to learn. Audit policy: no sitting in, official auditors only; see the course website. Don’t forget to register for the mailing list at:

Questions? What distributions can be represented by a BN? What BNs can represent a distribution? What are the independence assumptions encoded in a BN, in addition to the local Markov assumption?

Today: the representation theorem – joint distribution to BN. Given a joint probability distribution P, we obtain a BN that encodes independence assumptions, provided the conditional independencies in the BN are a subset of the conditional independencies in P.

Today: the representation theorem – BN to joint distribution. If we obtain the joint probability distribution from a BN that encodes independence assumptions, then the conditional independencies in the BN are a subset of the conditional independencies in P.

Let’s start proving it for naïve Bayes – from joint distribution to BN. Independence assumptions: the X_i are independent given C. Assume that P satisfies these independencies; we must prove that P factorizes according to the BN: P(C, X_1,…,X_n) = P(C) ∏_i P(X_i | C). Use the chain rule!
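One way the chain-rule step can be written out, shown here for n = 2 (the general case proceeds the same way, peeling off one X_i at a time):

```latex
\begin{aligned}
P(C, X_1, X_2) &= P(C)\, P(X_1 \mid C)\, P(X_2 \mid C, X_1)
   && \text{chain rule} \\
               &= P(C)\, P(X_1 \mid C)\, P(X_2 \mid C)
   && \text{since } (X_2 \perp X_1 \mid C)
\end{aligned}
```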

Now from BN to joint distribution. Assume that P factorizes according to the BN: P(C, X_1,…,X_n) = P(C) ∏_i P(X_i | C). Prove the independence assumptions: the X_i are independent given C. In fact, (X ⊥ Y | C) for all subsets X, Y of {X_1,…,X_n}.
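For intuition, here is the n = 2 case sketched: starting from the assumed factorization, conditioning recovers the independence, because the sum in the denominator satisfies ∑_{x_2} P(x_2 | C) = 1:

```latex
P(X_2 \mid C, X_1)
  = \frac{P(C, X_1, X_2)}{\sum_{x_2} P(C, X_1, x_2)}
  = \frac{P(C)\, P(X_1 \mid C)\, P(X_2 \mid C)}
         {P(C)\, P(X_1 \mid C) \sum_{x_2} P(x_2 \mid C)}
  = P(X_2 \mid C)
```

Hence (X_2 ⊥ X_1 | C).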

Today: the representation theorem, both directions. If the conditional independencies in the BN are a subset of the conditional independencies in P, then from the joint probability distribution P we can obtain a BN that encodes those independence assumptions. Conversely, if we obtain the joint probability distribution from a BN, then the conditional independencies in the BN are a subset of the conditional independencies in P.

Local Markov assumption & I-maps. Local Markov Assumption: a variable X_i is independent of its non-descendants given its parents, and only its parents: (X_i ⊥ NonDescendants_{X_i} | Pa_{X_i}). Local independence assumptions in BN structure G: I(G). Independence assertions of P: I(P). BN structure G is an I-map (independence map) of P if I(G) ⊆ I(P).

Factorized distributions. Given random variables X_1,…,X_n, a distribution P over these variables, and a BN structure G over the same variables, P factorizes according to G if P(X_1,…,X_n) = ∏_i P(X_i | Pa_{X_i}).

BN representation theorem – I-map to factorization: if G is an I-map of P (the conditional independencies in the BN are a subset of the conditional independencies in P), then P factorizes according to G.

BN representation theorem – I-map to factorization: proof. All you need is the Local Markov Assumption: a variable X_i is independent of its non-descendants given its parents, and only its parents: (X_i ⊥ NonDescendants_{X_i} | Pa_{X_i}).

Defining a BN. Given a set of variables and the conditional independence assertions of P: choose an ordering on the variables, e.g., X_1, …, X_n. For i = 1 to n: add X_i to the network; define the parents of X_i, Pa_{X_i}, in the graph as the minimal subset of {X_1,…,X_{i-1}} such that the local Markov assumption holds (X_i independent of the rest of {X_1,…,X_{i-1}} given its parents Pa_{X_i}); define/learn the CPT P(X_i | Pa_{X_i}).
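The construction above can be sketched as code. This is only an illustration: it assumes a hypothetical oracle `is_independent(x, rest, given)` that answers conditional-independence queries about P, and it shrinks the parent set greedily, which matches the spirit of the procedure but is not guaranteed to find the minimal set for every P.

```python
# Sketch of the variable-ordering BN construction. `is_independent`
# is a hypothetical oracle answering queries about the distribution P.

def minimal_parent_set(x, prev, is_independent):
    # Greedily drop candidate parents while the local Markov
    # property (x independent of the remaining predecessors given
    # the kept parents) still holds.
    pa = list(prev)
    for v in list(pa):
        trial = [p for p in pa if p != v]
        rest = [p for p in prev if p not in trial]
        if is_independent(x, rest, trial):
            pa = trial
    return pa

def build_bn(variables, is_independent):
    """Return a parent set for each variable under the given ordering."""
    parents = {}
    for i, x in enumerate(variables):
        prev = variables[:i]  # predecessors X_1, ..., X_{i-1}
        parents[x] = minimal_parent_set(x, prev, is_independent)
    return parents
```

For a fully independent P (the oracle always says yes) every parent set is empty; for a P with no independencies, each X_i gets all its predecessors as parents. The quality of the resulting structure depends heavily on the chosen ordering.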

BN representation theorem – factorization to I-map: if P factorizes according to G (we obtain the joint probability distribution from the BN), then G is an I-map of P (the conditional independencies in the BN are a subset of the conditional independencies in P).

BN representation theorem – factorization to I-map: proof. Homework 1!

The BN representation theorem. Factorization to I-map: if we obtain the joint probability distribution from a BN, then the conditional independencies in the BN are a subset of the conditional independencies in P. Important because we can read independencies of P from the BN structure G. I-map to factorization: if the conditional independencies in the BN are a subset of the conditional independencies in P, then from the joint probability distribution we can obtain a BN encoding those independence assumptions. Important because every P has at least one BN structure G.

Acknowledgements. JavaBayes applet: me/index.html