Belief propagation with junction trees Presented by Mark Silberstein and Yaniv Hamo
Outline – Part I
● Belief propagation example
● Propagation using message passing
● Clique trees
● Junction trees
● The junction tree algorithm
Simple belief propagation example: “Icy Roads” (from Jensen, “An Introduction to Bayesian Networks”)
P(X_Icy): yes 0.7, no 0.3
P(X_Watson | X_Icy = yes) = (0.8, 0.2),  P(X_Watson | X_Icy = no) = (0.1, 0.9)
P(X_Holmes | X_Icy = yes) = (0.8, 0.2),  P(X_Holmes | X_Icy = no) = (0.1, 0.9)
“Watson has had an accident!” Evidence: P(X_Watson = yes) = 1
By Bayes’ rule: P(X_Icy | X_Watson = yes) = (0.95, 0.05), versus the a priori (0.70, 0.30)
By joint probability + marginalization: P(X_Holmes | X_Watson = yes) = (0.76, 0.24), versus the a priori (0.59, 0.41)
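The two posteriors above can be reproduced in a few lines. The sketch below is not part of the slides; it assumes the conditional tables (0.8, 0.2) / (0.1, 0.9) listed earlier, with array names chosen only for illustration.

```python
import numpy as np

# Icy Roads example (Jensen).  Index 0 = "yes", 1 = "no".
p_icy = np.array([0.7, 0.3])                     # P(Icy)
p_watson_given_icy = np.array([[0.8, 0.2],       # P(Watson | Icy = yes)
                               [0.1, 0.9]])      # P(Watson | Icy = no)
p_holmes_given_icy = p_watson_given_icy          # same table for Holmes

# Evidence: Watson has had an accident, i.e. Watson = yes.
# Bayes' rule: P(Icy | Watson=yes) is proportional to P(Watson=yes | Icy) * P(Icy)
post_icy = p_watson_given_icy[:, 0] * p_icy
post_icy /= post_icy.sum()
print(post_icy)          # ~ [0.95, 0.05]

# Marginalization: P(Holmes | Watson=yes) = sum_I P(Holmes | I) * P(I | Watson=yes)
post_holmes = post_icy @ p_holmes_given_icy
print(post_holmes)       # ~ [0.76, 0.24]
```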
“No, the roads are not icy.” Evidence: P(X_Icy = no) = 1
When instantiating X_Icy, X_Holmes becomes independent of X_Watson: X_Holmes ⊥ X_Watson | X_Icy
Answering probabilistic queries (J. Pearl, [1])
● Joint probability using elimination: the human brain most likely does not do that! Why?
  – It needs to hold the entire network in order to set the elimination order
  – It answers only a single query, rather than all queries at once
  – It creates and computes spurious dependencies among variables conceived as independent
  – It is sequential!
● Our brain probably computes beliefs in parallel
Belief updating as constraint propagation (J. Pearl, [1])
● Local, simple computations
● But is it possible at all?
  – Why would it ever stabilize?
  – Rumour example: you update your neighbour, and after several days you hear the same rumour back from him. Should that increase your belief?
● Analogous example of constraint propagation: graph coloring
Simple example of chain propagation (J. Pearl, [1])
Definitions: a chain X → Y → Z with evidence e at its end. Each link carries a link matrix P(y | x); the messages passed along the chain are vectors, e.g. λ(x) = P(e | x) = Σ_y P(y | x) λ(y).
Bidirectional propagation (J. Pearl, [1])
A chain T → U → X → Y → Z with evidence e+ above it and e− below it: π messages (π(t), π(u), π(x)) are propagated forward and λ messages (λ(x), λ(y), λ(z)) backward; a π message combines the link matrix over its rows, a λ message over its columns.
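A minimal sketch of the bidirectional π/λ propagation on a short chain. The link matrices and evidence vector below are illustrative values, not taken from Pearl; they just instantiate the scheme.

```python
import numpy as np

# Bidirectional propagation on a chain T -> X -> Y with evidence below Y.
# Link matrices M[i, j] = P(child = j | parent = i); values are illustrative.
M_tx = np.array([[0.9, 0.1],
                 [0.2, 0.8]])      # P(X | T)
M_xy = np.array([[0.7, 0.3],
                 [0.4, 0.6]])      # P(Y | X)

prior_t = np.array([0.6, 0.4])     # P(T), plays the role of pi(T)
lam_y   = np.array([1.0, 0.0])     # evidence e-: Y observed in state 0

# pi messages flow forward (prior -> evidence): pi(x) = sum_t P(x|t) pi(t)
pi_x = prior_t @ M_tx              # combines rows of the link matrix
# lambda messages flow backward (evidence -> prior): lambda(x) = sum_y P(y|x) lambda(y)
lam_x = M_xy @ lam_y               # combines columns of the link matrix

# Belief at X given all the evidence: BEL(x) proportional to pi(x) * lambda(x)
bel_x = pi_x * lam_x
bel_x /= bel_x.sum()
print(bel_x)
```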
HMM and the forward-backward algorithm
A hidden Markov model with hidden states H_1, …, H_L and observations X_1, …, X_L:
P(x_1, …, x_L, h_i) = P(x_1, …, x_i, h_i) · P(x_{i+1}, …, x_L | x_1, …, x_i, h_i)
                    = P(x_1, …, x_i, h_i) · P(x_{i+1}, …, x_L | h_i)
                    = f(h_i) · b(h_i)
Belief update: P(h_i | x_1, …, x_L) = (1/K) · P(x_1, …, x_L, h_i), where K = Σ_{h_i} P(x_1, …, x_L, h_i).
In message-passing terms, f(h_i) = P(x_1, …, x_i, h_i) plays the role of π(h_i), and b(h_i) = P(x_{i+1}, …, x_L | h_i) plays the role of λ(h_i).
The forward algorithm
The task: compute f(h_i) = P(x_1, …, x_i, h_i) for i = 1, …, L (namely, considering the evidence up to time slot i).
Basis step:  P(x_1, h_1) = P(h_1) · P(x_1 | h_1)
Step i:      P(x_1, …, x_i, h_i) = Σ_{h_{i-1}} P(x_1, …, x_{i-1}, h_{i-1}) · P(h_i | h_{i-1}) · P(x_i | h_i)
(the factor P(x_1, …, x_{i-1}, h_{i-1}) plays the role of π(h_{i-1}))
The backward algorithm
The task: compute b(h_i) = P(x_{i+1}, …, x_L | h_i) for i = L−1, …, 1 (namely, considering the evidence after time slot i).
Step i:  b(h_i) = P(x_{i+1}, …, x_L | h_i) = Σ_{h_{i+1}} P(h_{i+1} | h_i) · P(x_{i+1} | h_{i+1}) · P(x_{i+2}, …, x_L | h_{i+1}) = Σ_{h_{i+1}} P(h_{i+1} | h_i) · P(x_{i+1} | h_{i+1}) · b(h_{i+1})
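A compact sketch of the two recursions and the belief update. The function and parameter names (forward_backward, init, trans, emit, obs) are my own, and the observations are assumed to be integer state indices.

```python
import numpy as np

def forward_backward(init, trans, emit, obs):
    """Forward-backward on a discrete HMM.
    init[h]      = P(h_1)
    trans[h, h'] = P(h_{i+1} = h' | h_i = h)
    emit[h, x]   = P(x | h)
    obs          = observed sequence x_1..x_L (integer indices)
    Returns f, b, and the posteriors P(h_i | x_1..x_L)."""
    L, n = len(obs), len(init)
    f = np.zeros((L, n))
    b = np.zeros((L, n))

    # Forward: f_i(h) = P(x_1..x_i, h_i = h)
    f[0] = init * emit[:, obs[0]]                       # basis step
    for i in range(1, L):
        f[i] = (f[i - 1] @ trans) * emit[:, obs[i]]     # step i

    # Backward: b_i(h) = P(x_{i+1}..x_L | h_i = h)
    b[L - 1] = 1.0                                      # basis step
    for i in range(L - 2, -1, -1):
        b[i] = trans @ (emit[:, obs[i + 1]] * b[i + 1])

    # Belief update: P(h_i | x_1..x_L) = f_i(h) b_i(h) / K
    post = f * b
    post /= post.sum(axis=1, keepdims=True)
    return f, b, post
```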
Can we generalize this approach to any graph?
● Loops pose a problem
  – We might reach a contradiction or loop indefinitely
● We should apply clustering and create a tree of clusters
● Each vertex in the cluster tree has a potential Ψ, mapping every combination of the cluster’s variables to a non-negative real number (a joint distribution table is a special case)
● Problems:
  – There are many ways to create clusters (e.g. all vertices forming a loop)
  – How do we obtain marginal probabilities from the potentials?
Clique trees
● Yet another representation of the joint probability
● How we build them:
  – For every variable A there should exist a single clique V that contains A together with its parents
  – A clique’s potential is the product of all the tables assigned to it (a table is multiplied in only if it was not already used in another clique)
  – Links are labeled with separators, which consist of the intersection of the adjacent nodes
  – Separator tables are initialized to ones
● Claim: the joint distribution is the product of all cluster tables divided by the product of all separator tables
Example: a graph over A, B, C, D, E, F is triangulated into a chordal graph; its cliques {A,B,C}, {C,D,E}, {D,E,F} form a clique tree with separators {C} and {D,E}.
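A sketch of initializing the clique potentials for this example. It assumes a hypothetical factorization P(A)·P(B|A)·P(C|A,B)·P(D|C)·P(E|C,D)·P(F|D,E), consistent with the chordal graph above but with random placeholder tables, and checks the claim that (with separator tables of ones) the product of the clique tables is the joint distribution.

```python
import numpy as np

# Cliques from the example: {A,B,C}, {C,D,E}, {D,E,F}; separators {C} and {D,E}.
# All variables binary.  Global axis order: A, B, C, D, E, F.
rng = np.random.default_rng(0)

def random_cpt(n_parents):
    """Random P(child | parents): last axis sums to one (placeholder values only)."""
    t = rng.random((2,) * (n_parents + 1))
    return t / t.sum(axis=-1, keepdims=True)

# Hypothetical factorization: P(A) P(B|A) P(C|A,B) P(D|C) P(E|C,D) P(F|D,E)
p_a    = random_cpt(0)    # axes: A
p_b_a  = random_cpt(1)    # axes: A, B
p_c_ab = random_cpt(2)    # axes: A, B, C
p_d_c  = random_cpt(1)    # axes: C, D
p_e_cd = random_cpt(2)    # axes: C, D, E
p_f_de = random_cpt(2)    # axes: D, E, F

# Initialize clique potentials: each table is multiplied into exactly one clique.
psi_abc = p_a[:, None, None] * p_b_a[:, :, None] * p_c_ab   # over A, B, C
psi_cde = p_d_c[:, :, None] * p_e_cd                        # over C, D, E
psi_def = p_f_de                                            # over D, E, F
# Separator tables over {C} and {D,E} start as ones.

# Claim: product of clique tables / product of separator tables = joint P(A..F).
joint_from_tree = (psi_abc[:, :, :, None, None, None]
                   * psi_cde[None, None, :, :, :, None]
                   * psi_def[None, None, None, :, :, :])
print(np.isclose(joint_from_tree.sum(), 1.0))   # a proper joint sums to one
```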
Consistency
● Adjacent nodes V and W with separator S are consistent if their marginals onto S agree: Σ_{V\S} Ψ(V) = Σ_{W\S} Ψ(W)
Absorption
● Absorption passes a message from one node to another. Say W absorbs from V over their separator S:
  – Ψ*(S) = Σ_{V\S} Ψ(V)
  – Ψ*(W) = Ψ(W) · Ψ*(S) / Ψ(S)
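A sketch of a single absorption step for the concrete cliques V = {C,D,E}, W = {D,E,F}, S = {D,E} from the earlier example; the array shapes and the 0/0 convention are assumptions of this sketch.

```python
import numpy as np

# W absorbs from V over separator S.  Concretely: V = {C,D,E}, W = {D,E,F},
# S = {D,E}, all variables binary (same layout as the clique-tree sketch above).
def absorb(psi_v, psi_s, psi_w):
    """Pass a message from V (axes C,D,E) to W (axes D,E,F) over S (axes D,E)."""
    psi_s_new = psi_v.sum(axis=0)                     # psi*(S) = sum_{V\S} psi(V): sum out C
    update = np.divide(psi_s_new, psi_s,              # psi*(S) / psi(S), with 0/0 := 0
                       out=np.zeros_like(psi_s_new), where=psi_s != 0)
    psi_w_new = psi_w * update[:, :, None]            # psi*(W) = psi(W) * psi*(S)/psi(S)
    return psi_s_new, psi_w_new

# Note: the product psi(V) * psi(W) / psi(S) is unchanged by this update; this is
# the invariant behind the correctness of the clique-tree representation.
```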
Absorption (cont.)
● Does absorption ensure consistency?
● The product of the cluster tables divided by the product of the separator tables is invariant under absorption
● This invariant maintains the correctness of the clique tree representation
Rules of message passing in a clique tree
● Node V can send exactly one message to a neighbour W, and only if V has received a message from each of its other neighbours
● We continue until a message has been passed once in each direction along every link
● After all messages have been sent in both directions over every link, the tree is consistent
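One schedule that satisfies the rule above is to collect messages towards an arbitrary root and then distribute them back out. The sketch below only computes the ordering of messages (not the absorptions themselves); the helper name two_pass_schedule and the dictionary representation of the tree are assumptions of this sketch.

```python
def two_pass_schedule(neighbors, root):
    """Return an ordering of directed messages (v, w) obeying the rule:
    v may send to w only after receiving from all its other neighbours.
    neighbors: dict mapping each clique-tree node to its adjacent nodes."""
    # Collect phase: messages flow from the leaves towards the root.
    order, visited = [], {root}
    def collect(v):
        for u in neighbors[v]:
            if u not in visited:
                visited.add(u)
                collect(u)
                order.append((u, v))   # u has now heard from all its other neighbours
    collect(root)
    # Distribute phase: messages flow from the root back out to the leaves.
    order += [(v, u) for (u, v) in reversed(order)]
    return order

# Example on the clique tree ABC - CDE - DEF from before:
tree = {"ABC": ["CDE"], "CDE": ["ABC", "DEF"], "DEF": ["CDE"]}
print(two_pass_schedule(tree, root="CDE"))
# [('ABC', 'CDE'), ('DEF', 'CDE'), ('CDE', 'DEF'), ('CDE', 'ABC')]
```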
Does local consistency ensure global consistency?
● The same old loop problem
● Building a tree breaks the loops
(Figure: a clique tree with nodes ABC, DBC, ED, EA over the variables A, B, C, D, E, where local consistency does not guarantee global consistency.)
Junction tree
● Ensures global consistency
● Definition: a clique tree is a junction tree if, for every pair of nodes V and W, all nodes on the path between them contain V ∩ W
(Figure: a junction tree with cliques ABE, BCF, CDG, EH, FI, FJ, GK; each separator is the intersection of its two adjacent cliques.)
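A sketch that checks the junction tree property directly from the definition. The edge list for the example cliques is one plausible reading of the figure, not taken verbatim from the slides.

```python
from itertools import combinations

def is_junction_tree(cliques, edges):
    """Check the junction tree property: for every pair of nodes V, W,
    each node on the path between them contains V ∩ W.
    cliques: dict name -> set of variables; edges: tree edges as name pairs."""
    adj = {c: [] for c in cliques}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    def path(start, goal):
        # The unique path between two nodes of a tree, found by DFS.
        stack = [(start, [start])]
        while stack:
            node, p = stack.pop()
            if node == goal:
                return p
            stack.extend((n, p + [n]) for n in adj[node] if n not in p)

    return all(cliques[v] & cliques[w] <= cliques[m]
               for v, w in combinations(cliques, 2)
               for m in path(v, w))

# Example cliques from the slide, with an assumed tree structure:
cliques = {c: set(c) for c in ["ABE", "BCF", "CDG", "EH", "FI", "FJ", "GK"]}
edges = [("ABE", "BCF"), ("BCF", "CDG"), ("ABE", "EH"),
         ("BCF", "FI"), ("BCF", "FJ"), ("CDG", "GK")]
print(is_junction_tree(cliques, edges))   # True
```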
Claims on junction trees
● Claim: a consistent junction tree is globally consistent
● Claim: let T be a junction tree over the universe U; then the product of all node potentials divided by the product of all separator potentials equals the joint distribution P(U)
● Claim: after a full round of message passing in T, each node potential holds the marginal of P(U) onto that node (and each separator potential the marginal onto the separator)
● Claim: given evidence e entered at different nodes, after a full round of message passing in T, each node potential holds P(V, e), from which P(V | e) is obtained by normalization
References until now
1. J. Pearl, “Probabilistic Reasoning in Intelligent Systems”
2. Finn V. Jensen, “An Introduction to Bayesian Networks”
3. Presentations by Sam Roweis, Henrik Bengtsson, and David Barber