1 Chapter 5: Belief Updating in Bayesian Networks
Bayesian Networks and Decision Graphs, Finn V. Jensen
Qunyuan Zhang, Division of Statistical Genomics, CGS Statistical Genetics Forum, May 7, 2007

2 Contents of the Book
I. A Practical Guide to Normative Systems
1 Causal and Bayesian Networks
2 Building Models
3 Learning, Adaptation, and Tuning
4 Decision Graphs
II. Algorithms for Normative Systems
5 Belief Updating in Bayesian Networks
6 Bayesian Network Analysis Tools
7 Algorithms for Influence Diagrams

3 Structure of the Book
I. What is a BN? 1 Causal and Bayesian Networks
II. How to create a BN? 2 Building Models; 3 Learning, Adaptation, and Tuning
III. What can we use a BN to do, and how?
[To know something] 5 Belief Updating in Bayesian Networks; 6 Bayesian Network Analysis Tools: Prob(a single variable | BN), joint Prob(a set of variables | BN), importance of variables, evidence sensitivity, parameter sensitivity, data conflict analysis
[To make decisions] 4 Decision Graphs; 7 Algorithms for Influence Diagrams: optimal decisions (cost & gain)

4 BN & Decision Tree
[Figure: an influence diagram with decision nodes D1, D2, chance nodes A, B, C, T, and utility nodes V1, V2 with total utility U = V1 + V2, alongside the equivalent decision tree: its branches over D1, T, D2 and the outcomes A x C each end in a leaf valued V1 + V2, and the chance branches follow P(A, C | D1, T, D2).]

5 “BN” of the Book
[Diagram: the concept of a BN; model building (the known part of the structure) and BN learning (the uncertain part of the structure), driven by rules & theories on one side and data & algorithms on the other, yield the BN (structure & parameters); probability calculation on the BN then supports knowing, understanding & explaining, and decisions, actions, cost & gain, and changes.]

6 Chapter 5: Belief Updating in Bayesian Networks
Belief = probability. Belief updating = probability calculation based on a BN (model, parameters, and/or evidence).
[Figure: a BN over X1, X2, X3, Y with error term e, compared with linear and logistic models, and a larger BN over A, B, C, D, E, F.]
Conditional probability: P(Y | X1, X2, X3). Marginal probability: P(Y) = ∑_{-Y} ∏ φ, the product of all potentials summed over every variable except Y.

7 Marginal Probability Calculation in a BN
I. Simplification (Section 5.5)
II. Marginalization (Sections 5.2, 5.3, 5.4, 5.6)
III. Simulation (Section 5.7)

8 I. Simplification
Marginal calculations can be simplified by excluding non-informative nodes (the white nodes in the figure): barren nodes and nodes d-separated from the query given the evidence e.
[Figure: a BN over A..G with evidence e, pruned step by step as barren and d-separated nodes are removed.]
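The barren-node part of this pruning is easy to mechanize. Below is a minimal Python sketch, assuming the BN is given as a node-to-parents dict; the chain-shaped example network and all names are illustrative, not the slide's exact figure.

```python
# Minimal sketch of barren-node pruning: a node outside query/evidence with
# no remaining children sums out of P(query | evidence) and can be dropped.
def prune_barren(parents, query, evidence):
    keep = set(parents)
    changed = True
    while changed:
        changed = False
        for node in list(keep):
            if node in query or node in evidence:
                continue
            has_child = any(node in parents[c] for c in keep if c != node)
            if not has_child:          # barren: remove and re-scan
                keep.discard(node)
                changed = True
    return keep

# Illustrative network: A->B, A->C, B->D, (C,D)->E, E->F, F->G.
parents = {'A': [], 'B': ['A'], 'C': ['A'], 'D': ['B'],
           'E': ['C', 'D'], 'F': ['E'], 'G': ['F']}
print(prune_barren(parents, query={'A'}, evidence={'E'}))
# G is barren, then F becomes barren; the rest may still matter.
```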

9 II. Marginalization Calculating sums of products of potentials by eliminating variables repeatedly

10 Marginal Probabilities
Joint probabilities P(A, B) and the marginals obtained by summing rows and columns:

        A1        A2        P(B)
B1      p1        p2        p1 + p2 = P(B1)
B2      p3        p4        p3 + p4 = P(B2)
P(A)    p1 + p3   p2 + p4
        = P(A1)   = P(A2)
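As a tiny numeric version of the table, assuming made-up values for p1..p4:

```python
# Marginals of a 2x2 joint distribution (rows = B, columns = A).
import numpy as np

joint = np.array([[0.30, 0.20],    # p1, p2 -> P(B1) = 0.50
                  [0.10, 0.40]])   # p3, p4 -> P(B2) = 0.50
p_B = joint.sum(axis=1)            # sum over A: [p1+p2, p3+p4]
p_A = joint.sum(axis=0)            # sum over B: [p1+p3, p2+p4]
print(p_B, p_A)                    # [0.5 0.5] [0.4 0.6]
```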

11 An Example of Marginalization/Elimination
BN parameters (potentials): φ1 = P(A1), φ2 = P(A2|A1), φ3 = P(A3|A1), φ4 = P(A4|A2), φ5 = P(A5|A2,A3), φ6 = P(A6|A3).
[Figure: the BN over A1..A6 with links A1→A2, A1→A3, A2→A4, A2→A5, A3→A5, A3→A6.]
P(A4) = ? By the distributive law, sums can be pushed inside the product:
P(A4) = ∑_{A1,A2,A3,A5,A6} φ1 φ2 φ3 φ4 φ5 φ6 = ∑_{A2} φ4 ∑_{A1} φ1 φ2 ∑_{A3} φ3 (∑_{A5} φ5)(∑_{A6} φ6)

12 Marginalization/Elimination Order
[Figure: the same BN over A1..A6; different variable elimination orders yield different sequences of intermediate potentials.]

13 Marginalization/Elimination
Domain: a set of variables in the BN. Potential: a real-valued probabilistic table over a domain, e.g. φ1 = P(A1), φ2 = P(A2|A1), φ3 = P(A3|A1), φ4 = P(A4|A2), φ5 = P(A5|A2,A3), φ6 = P(A6|A3).
Definition 5.1 (Elimination). Let Φ be a set of potentials, and let X be a variable. X is eliminated from Φ by:
1. Remove all potentials in Φ with X in their domains; call the removed set Φ_X. For X = A3: Φ_X = (φ3, φ5, φ6), leaving Φ = (φ1, φ2, φ4).
2. Calculate φ_{-X} = ∑_X ∏ Φ_X; here φ_{-A3} = ∑_{A3} φ3 φ5 φ6.
3. Add φ_{-X} to Φ; call the resulting set Φ_{-X} = (φ1, φ2, φ4, φ_{-X}).
P(Y) is calculated by repeatedly eliminating all variables except Y (a runnable sketch follows).
Question: how do we find an efficient/optimal elimination order?
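The following is a runnable Python sketch of Definition 5.1 and of repeated elimination, using numpy tables as potentials. The network structure is the running example; the binary CPT numbers are made up for illustration.

```python
import numpy as np

def expand(potential, out_vars):
    """Broadcast a potential's table to the axis order given by out_vars."""
    vars_, tab = potential
    order = sorted(range(len(vars_)), key=lambda i: out_vars.index(vars_[i]))
    tab = np.transpose(tab, order)
    index = tuple(slice(None) if v in vars_ else np.newaxis for v in out_vars)
    return tab[index]

def multiply_all(potentials):
    """Product of potentials over the union of their domains."""
    out_vars = []
    for vars_, _ in potentials:
        out_vars += [v for v in vars_ if v not in out_vars]
    tab = 1.0
    for p in potentials:
        tab = tab * expand(p, out_vars)
    return out_vars, tab

def eliminate(phis, x):
    """Definition 5.1: remove the potentials mentioning x (step 1), multiply
    them and sum x out (step 2), and add the result back (step 3)."""
    phi_x = [p for p in phis if x in p[0]]
    rest = [p for p in phis if x not in p[0]]
    vars_, tab = multiply_all(phi_x)
    axis = vars_.index(x)
    rest.append((vars_[:axis] + vars_[axis + 1:], tab.sum(axis=axis)))
    return rest

# Binary CPTs for the running example (illustrative numbers).
phis = [
    (['A1'], np.array([0.6, 0.4])),
    (['A2', 'A1'], np.array([[0.7, 0.2], [0.3, 0.8]])),
    (['A3', 'A1'], np.array([[0.5, 0.1], [0.5, 0.9]])),
    (['A4', 'A2'], np.array([[0.9, 0.3], [0.1, 0.7]])),
    (['A5', 'A2', 'A3'], np.array([[[0.2, 0.4], [0.6, 0.5]],
                                   [[0.8, 0.6], [0.4, 0.5]]])),
    (['A6', 'A3'], np.array([[0.3, 0.7], [0.7, 0.3]])),
]
for x in ['A6', 'A5', 'A3', 'A1', 'A2']:   # eliminate down to A4
    phis = eliminate(phis, x)
print(multiply_all(phis))                  # P(A4); the table sums to 1
```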

14 Domain Graphs
The potentials have domains φ1(A1), φ2(A2,A1), φ3(A3,A1), φ4(A4,A2), φ5(A5,A2,A3), φ6(A6,A3). The domain graph has a node for each variable and an undirected link between each pair of variables that appear in a common domain.
[Figure: the BN graph and the corresponding domain graph over A1..A6.]
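Building the domain graph is mechanical: link every pair of variables that share a domain. A tiny sketch, assuming domains are given as tuples:

```python
from itertools import combinations

domains = [('A1',), ('A2', 'A1'), ('A3', 'A1'), ('A4', 'A2'),
           ('A5', 'A2', 'A3'), ('A6', 'A3')]
# One undirected link per pair of variables occurring in a common domain.
links = {frozenset(pair) for dom in domains for pair in combinations(dom, 2)}
print(sorted(tuple(sorted(l)) for l in links))
# A1-A2, A1-A3, A2-A3, A2-A4, A2-A5, A3-A5, A3-A6
```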

15 Perfect Elimination Sequences
Eliminating a variable links all of its remaining neighbors; the added links are fill-ins (the red links in the figure). A perfect elimination sequence introduces no fill-ins, e.g. each of the following, run down to A4, yields P(A4):
A6, A5, A3, A1, A2
A5, A6, A3, A1, A2
A1, A5, A6, A3, A2
[Figure: the domain graph over A1..A6 before and after elimination.]

16 Domain Set of an Elimination Sequence
The domain set of an elimination sequence is the set of domains of the potentials produced during the elimination, where domains that are subsets of other domains are removed.
For the sequence A6, A5, A3, A1, A2 down to A4 (yielding P(A4)), the elimination works over the domains (A6,A3), (A2,A3,A5), (A1,A2,A3), (A1,A2), (A2,A4); since (A1,A2) is a subset of (A1,A2,A3), the domain set is {(A6,A3), (A2,A3,A5), (A1,A2,A3), (A2,A4)}.
The domain set reflects the complexity of an elimination sequence. Question: how do we find the smallest domain set?

17 Set of Cliques
All perfect elimination sequences produce the same domain set, namely the set of cliques of the domain graph. E.g., the sequences
A6, A5, A3, A1, A2 down to A4
A5, A6, A3, A1, A2 down to A4
A1, A5, A6, A3, A2 down to A4
all produce the domain set {(A6,A3), (A2,A3,A5), (A1,A2,A3), (A2,A4)}, which contains 4 domains/cliques.
Hence any perfect elimination sequence is optimal: the cliques, i.e. the domains produced by perfect elimination sequences, form the optimal set of domains. Question: how do we determine the set of cliques?

18 Triangulated Graphs
An undirected graph with a perfect elimination sequence is called a triangulated graph.
[Figure: two graphs over A1..A5. The triangulated one has the perfect elimination sequence A5, A2, A4, A3 down to A1; the nontriangulated one has no perfect elimination sequence.]

19 Cliques in Triangulated Graphs
Let X be a node in the domain graph, and let F_X be the set of X's neighbors plus X itself. Nodes whose neighbor set is complete are called simplicial. To determine the set of cliques in a triangulated graph (a sketch in code follows):
1. Eliminate a simplicial node X; F_X is a clique candidate.
2. If F_X does not include all remaining nodes, go to 1.
3. Prune the set of clique candidates by removing sets that are subsets of other clique candidates.
4. The resulting set is the set of cliques.
Question: given a set of cliques, how do we determine a perfect elimination order?
[Figure: a small graph over A..E and a node X illustrating F_X.]
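A Python sketch of this procedure, assuming the graph is an adjacency dict; applied to the running example's domain graph it recovers the four cliques above.

```python
from itertools import combinations

def is_simplicial(adj, x):
    """X is simplicial if its neighbors form a complete subgraph."""
    return all(v in adj[u] for u, v in combinations(adj[x], 2))

def cliques_of_triangulated(adj):
    adj = {u: set(vs) for u, vs in adj.items()}   # work on a copy
    candidates = []
    while adj:
        x = next(u for u in adj if is_simplicial(adj, u))   # step 1
        fx = adj[x] | {x}
        candidates.append(fx)
        if fx == set(adj):         # F_X covers all remaining nodes: stop
            break
        for u in adj[x]:           # eliminate X (no fill-ins needed)
            adj[u].discard(x)
        del adj[x]
    # Step 3: drop candidates that are subsets of other candidates.
    return [c for c in candidates if not any(c < d for d in candidates)]

# The domain graph of the running example:
adj = {'A1': {'A2', 'A3'}, 'A2': {'A1', 'A3', 'A4', 'A5'},
       'A3': {'A1', 'A2', 'A5', 'A6'}, 'A4': {'A2'},
       'A5': {'A2', 'A3'}, 'A6': {'A3'}}
print(cliques_of_triangulated(adj))
# Cliques: {A1,A2,A3}, {A2,A4}, {A2,A3,A5}, {A3,A6}
```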

20 Join Trees
A join tree is a tree of cliques in which all nodes on the path between any two cliques V and W contain the intersection of V and W.
[Figure: a domain graph over A..J; its cliques (ABCD, BCDE, BCDG, CGHJ, DEFI) with separators such as BCD, CG, and DE; the join tree obtained from the elimination sequence A, F, I, H, J, G, B, C, D down to E; and a second tree over the same cliques that is not a join tree.]

21 Propagation in Junction Trees
A junction tree is a join tree with the following structure:
1. Each potential is attached to a clique containing the domain of that potential.
2. Each link has the appropriate separator attached.
3. Each separator contains two "mailboxes", one for each direction (mutual communication).
[Figure: the junction tree for the running example: V4 = {A1,A2,A3} holding φ1, φ2, φ3; V6 = {A2,A4} holding φ4; V2 = {A2,A3,A5} holding φ5; V1 = {A3,A6} holding φ6; separators S4 = {A2}, S2 = {A2,A3}, S1 = {A3}. Messages collect evidence to V6 and distribute evidence from V6.]
Junction trees provide a general framework for finding optimal elimination sequences for triangulated graphs. Question: what if the graph is non-triangulated?
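A compact sketch of the collect-evidence pass toward V6, written with np.einsum over the slide's clique layout; the CPT numbers are the same illustrative ones used in the elimination sketch above.

```python
import numpy as np

p1 = np.array([0.6, 0.4])                       # phi1(A1)
p2 = np.array([[0.7, 0.2], [0.3, 0.8]])         # phi2(A2, A1)
p3 = np.array([[0.5, 0.1], [0.5, 0.9]])         # phi3(A3, A1)
p4 = np.array([[0.9, 0.3], [0.1, 0.7]])         # phi4(A4, A2)
p5 = np.array([[[0.2, 0.4], [0.6, 0.5]],
               [[0.8, 0.6], [0.4, 0.5]]])       # phi5(A5, A2, A3)
p6 = np.array([[0.3, 0.7], [0.7, 0.3]])         # phi6(A6, A3)

# V1 = {A3,A6} sends phi6 marginalized to S1 = {A3}.
m_v1 = p6.sum(axis=0)                           # shape (A3,)
# V2 = {A2,A3,A5} sends phi5 marginalized to S2 = {A2,A3}.
m_v2 = p5.sum(axis=0)                           # shape (A2, A3)
# V4 = {A1,A2,A3} multiplies phi1, phi2, phi3 and the incoming messages,
# then marginalizes to S4 = {A2} (einsum indices: a=A1, b=A2, c=A3).
m_v4 = np.einsum('a,ba,ca,c,bc->b', p1, p2, p3, m_v1, m_v2)
# V6 = {A2,A4} combines phi4 with the message and sums out A2, giving P(A4).
p_a4 = (p4 * m_v4).sum(axis=1)
print(p_a4, p_a4.sum())                         # P(A4); sums to 1
```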

22 Triangulations
Convert a non-triangulated graph into a triangulated one by adding new link(s).
[Figure: a BN over A..J, its non-triangulated domain graph, and a triangulated version with added links.]
What counts as an optimal triangulation: minimal fill-in size? Heuristic approach (sketched below): repeatedly eliminate a simplicial node, and when this is not possible, eliminate a node X with minimal size of F_X.
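A sketch of this heuristic, assuming an adjacency-dict graph; it returns the fill-in links that triangulate the graph (here a 4-cycle, which needs one chord).

```python
from itertools import combinations

def fill_ins(adj, x):
    """Pairs of X's neighbors that are not yet linked."""
    return [(u, v) for u, v in combinations(adj[x], 2) if v not in adj[u]]

def triangulate(adj):
    """Heuristic: eliminate a simplicial node if one exists, otherwise a
    node with minimal |F_X|; record the fill-ins added along the way."""
    work = {u: set(vs) for u, vs in adj.items()}
    added = []
    while work:
        simplicial = [u for u in work if not fill_ins(work, u)]
        x = simplicial[0] if simplicial else min(work, key=lambda u: len(work[u]))
        for u, v in fill_ins(work, x):
            work[u].add(v); work[v].add(u)
            added.append((u, v))
        for u in list(work[x]):
            work[u].discard(x)
        del work[x]
    return added

# A 4-cycle A-B-C-D (nontriangulated): one chord is added.
adj = {'A': {'B', 'D'}, 'B': {'A', 'C'}, 'C': {'B', 'D'}, 'D': {'A', 'C'}}
print(triangulate(adj))  # one chord, e.g. [('B', 'D')]
```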

23 III. Stochastic Simulation
Forward sampling (for the network A → B, A → C, B → D, (C,D) → E):
1. Sample A from P(A)
2. Sample B from P(B|A) and C from P(C|A)
3. Sample D from P(D|B)
4. Sample E from P(E|C,D)
5. Repeat steps 1-4
Gibbs sampling. Evidence: B=n, E=n, where P(B=n, E=n) is rare, so forward samples almost never match the evidence; query P(A). Starting from an arbitrary state (a0, d0), resample one unobserved variable at a time from its conditional given all the others:
P(C | B=n, E=n, A=a0, D=d0) => c1
P(D | B=n, E=n, C=c1, A=a0) => d1
P(A | B=n, E=n, D=d1, C=c1) => a1
P(C | B=n, E=n, A=a1, D=d1) => c2
... (discard the burn-in) ...
P(C | B=n, E=n, A=a_{t-1}, D=d_{t-1}) => c_t
... (collect the remaining samples). A sketch of both samplers follows.
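A runnable sketch of both samplers on this network; only the structure and the evidence B = n, E = n come from the slide, while the binary CPT numbers are made up (states y/n encoded as 1/0).

```python
import random

pA = 0.3                                 # P(A = y)
pB = {0: 0.2, 1: 0.7}                    # P(B = y | A)
pC = {0: 0.5, 1: 0.9}                    # P(C = y | A)
pD = {0: 0.4, 1: 0.6}                    # P(D = y | B)
pE = {(0, 0): 0.1, (0, 1): 0.5,
      (1, 0): 0.6, (1, 1): 0.95}         # P(E = y | C, D)

def bern(p):
    return int(random.random() < p)

def forward_sample():
    """Steps 1-4 of forward sampling: draw each node given its parents."""
    a = bern(pA)
    b = bern(pB[a])
    c = bern(pC[a])
    d = bern(pD[b])
    e = bern(pE[(c, d)])
    return a, b, c, d, e

def gibbs_p_a(n_iter=50000, burn=1000):
    """Estimate P(A = y | B = n, E = n) by cycling through A, C, D and
    resampling each from its full conditional (the product of the CPTs
    that mention it), discarding a burn-in prefix as on the slide."""
    b, e = 0, 0                          # clamped evidence B = n, E = n
    a, c, d = 0, 0, 0                    # arbitrary initial state (a0, c0, d0)
    total = kept = 0
    def cond(py, value):                 # P(X = value) from P(X = y)
        return py if value else 1 - py
    for t in range(n_iter):
        # P(A | b, c) proportional to P(A) P(b | A) P(c | A)
        w = [cond(pA, av) * cond(pB[av], b) * cond(pC[av], c) for av in (0, 1)]
        a = bern(w[1] / (w[0] + w[1]))
        # P(C | a, d, e) proportional to P(C | a) P(e | C, d)
        w = [cond(pC[a], cv) * cond(pE[(cv, d)], e) for cv in (0, 1)]
        c = bern(w[1] / (w[0] + w[1]))
        # P(D | b, c, e) proportional to P(D | b) P(e | c, D)
        w = [cond(pD[b], dv) * cond(pE[(c, dv)], e) for dv in (0, 1)]
        d = bern(w[1] / (w[0] + w[1]))
        if t >= burn:                    # discard burn-in, then collect
            kept += 1
            total += a
    return total / kept

print(forward_sample())                  # one joint sample
print(gibbs_p_a())                       # estimate of P(A = y | B = n, E = n)
```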