Intro to Junction Tree propagation and adaptations for a Distributed Environment
Thor Whalen, Metron, Inc.


a b c d 11 22 33 44 conflict 55 66 77 88 This naive approach of updating the network inherits oscillation problems!

Idea behind the Junction Tree Algorithm
[Figure: a clever algorithm turns the loopy graph over a, b, c, d into an equivalent tree of clusters.]

[Figure: a Bayesian Network over one-dimensional random variables a-h, carrying conditional probabilities, is compiled into a Secondary Structure (Junction Tree) whose nodes are multi-dimensional random variables carrying joint probabilities (potentials): cliques abd, ade, ace, ceg, egh, def, linked through separators ad, ae, ce, de, eg.]

Variable Elimination (General Idea)
Write the query in the form
  P(x_n) = Σ_{x_1} … Σ_{x_{n-1}} Π_i P(x_i | pa_i)
Iteratively:
– Move all irrelevant terms outside of the innermost sum
– Perform the innermost sum, getting a new term
– Insert the new term into the product
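A minimal Python sketch of this loop, assuming factors are represented as (scope, table) pairs over named variables with finite domains; the representation and the example CPTs are illustrative, not from the slides.

```python
from itertools import product

# A factor is (scope, table): scope is a tuple of variable names, table
# maps one tuple of values per scope variable to a non-negative float.

def multiply(f, g, domains):
    """Pointwise product of two factors over the union of their scopes."""
    scope = tuple(dict.fromkeys(f[0] + g[0]))      # ordered union of scopes
    table = {}
    for vals in product(*(domains[v] for v in scope)):
        assign = dict(zip(scope, vals))
        table[vals] = (f[1][tuple(assign[v] for v in f[0])] *
                       g[1][tuple(assign[v] for v in g[0])])
    return scope, table

def sum_out(f, var):
    """Perform the innermost sum: marginalize `var` out of factor f."""
    scope = tuple(v for v in f[0] if v != var)
    table = {}
    for vals, p in f[1].items():
        key = tuple(x for x, name in zip(vals, f[0]) if name != var)
        table[key] = table.get(key, 0.0) + p
    return scope, table

def eliminate(factors, order, domains):
    """Eliminate the variables in `order`; return the remaining factor."""
    factors = list(factors)
    for var in order:
        relevant = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]  # irrelevant terms
        if not relevant:
            continue
        prod = relevant[0]
        for f in relevant[1:]:
            prod = multiply(prod, f, domains)
        factors.append(sum_out(prod, var))   # insert new term into product
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f, domains)
    return result

# Example: P(b) from P(a) and P(b|a) over binary domains.
domains = {"a": (0, 1), "b": (0, 1)}
Pa = (("a",), {(0,): 0.6, (1,): 0.4})
Pb_a = (("a", "b"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8})
print(eliminate([Pa, Pb_a], ["a"], domains))   # factor over ("b",)
```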

Example of Variable Elimination
The "Asia" network: [Figure: nodes Visit to Asia, Smoking, Tuberculosis, Lung Cancer, Bronchitis, Abnormality in Chest, X-Ray, Dyspnea.]

We are interested in P(d). Need to eliminate: v, s, x, t, l, a, b.
Initial factors:
  P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)
Brute force:
  P(d) = Σ_v Σ_s Σ_x Σ_t Σ_l Σ_a Σ_b P(v) P(s) P(t|v) P(l|s) P(b|s) P(a|t,l) P(x|a) P(d|a,b)

Eliminate variables in order: v, s, x, t, l, a, b.
Eliminating v:
  f_v(t) = Σ_v P(v) P(t|v)
leaving factors P(s) P(l|s) P(b|s) f_v(t) P(a|t,l) P(x|a) P(d|a,b).
[Note: f_v(t) = P(t). In general, the result of elimination is not necessarily a probability term.]

Eliminating s:
  f_s(b,l) = Σ_s P(s) P(b|s) P(l|s)
leaving factors f_v(t) f_s(b,l) P(a|t,l) P(x|a) P(d|a,b).
[Note: the result of elimination may be a function of several variables.]

Eliminating x:
  f_x(a) = Σ_x P(x|a)
leaving factors f_v(t) f_s(b,l) f_x(a) P(a|t,l) P(d|a,b).
[Note: f_x(a) = 1 for all values of a.]

Eliminating t:
  f_t(a,l) = Σ_t f_v(t) P(a|t,l)
leaving factors f_s(b,l) f_x(a) f_t(a,l) P(d|a,b).

Eliminating l:
  f_l(a,b) = Σ_l f_s(b,l) f_t(a,l)
leaving factors f_x(a) f_l(a,b) P(d|a,b).

Eliminating a:
  f_a(b,d) = Σ_a f_x(a) f_l(a,b) P(d|a,b)
leaving the factor f_a(b,d).

Eliminating b:
  f_b(d) = Σ_b f_a(b,d)
so that P(d) = f_b(d).

Intermediate factors
In our previous example (order v, s, x, t, l, a, b) the intermediate factors were f_v(t), f_s(b,l), f_x(a), f_t(a,l), f_l(a,b), f_a(b,d), f_b(d), none over more than two variables. With a different ordering, e.g. eliminating a first, we would create the factor f_a(t,l,x,b,d) over five variables. Complexity is exponential in the size of these factors!

Notes about variable elimination:
– Actual computation is done in the elimination steps
– Computation depends on the order of elimination
– For each query we need to compute everything again! Many redundant calculations.

Junction Trees The junction tree algorithm “generalizes” Variable Elimination to avoid redundant calculations The JT algorithm compiles a class of elimination orders into a data structure that supports the computation of all possible queries.

Building a Junction Tree
DAG → Moral Graph → Triangulated Graph → Identifying Cliques → Junction Tree

Step 1: Moralization
Given the DAG G = (V, E):
1. For all w ∈ V: for all u, v ∈ pa(w), add an edge e = u-v ("marry the parents").
2. Undirect all edges.
The result is the moral graph G^M. [Figure: the DAG over a-h and its moral graph.]
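A minimal sketch of this step, assuming the DAG is given as a dict mapping each node to the list of its parents (an illustrative representation, not from the slides):

```python
from itertools import combinations

def moralize(parents):
    """Return the moral graph G^M as a set of undirected edges."""
    edges = set()
    for w, pa in parents.items():
        for u, v in combinations(pa, 2):   # 1. marry the parents of w
            edges.add(frozenset((u, v)))
        for u in pa:                       # 2. undirect the original edges
            edges.add(frozenset((u, w)))
    return edges

# Hypothetical example: c has parents a and b, so moralization adds a-b.
dag = {"a": [], "b": [], "c": ["a", "b"]}
print(sorted(tuple(sorted(e)) for e in moralize(dag)))
# [('a', 'b'), ('a', 'c'), ('b', 'c')]
```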

Step 2: Triangulation
Add edges to G^M such that there is no cycle of length ≥ 4 that does not contain a chord. The result is the triangulated graph G^T.
[Figure: a chordless 4-cycle (NO) versus a chorded one (YES) in the graph over a-h.]

Step 2: Triangulation (cont.)
Each elimination ordering triangulates the graph, though not necessarily in the same way.
[Figure: the graph over A-H triangulated by several different elimination orderings.]

Step 2: Triangulation (cont.)
Intuitively, triangulations with as few fill-ins as possible are preferred, since they leave us with small cliques (small probability tables). A common heuristic:
Repeat until no nodes remain:
– Find the node whose elimination would require the least number of fill-ins (may be zero).
– Eliminate that node, and note the need for a fill-in edge between any two non-adjacent neighbors.
Finally, add the fill-in edges to the original graph.
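A minimal sketch of this heuristic, assuming an undirected graph represented as a dict of adjacency sets (the representation is illustrative):

```python
from itertools import combinations

def min_fill_triangulate(adj):
    """Return the fill-in edges to add to the original graph."""
    adj = {v: set(nb) for v, nb in adj.items()}    # work on a copy
    fill_ins = []
    while adj:
        # Count, for each node, the pairs of its neighbors not yet adjacent.
        def fill_count(v):
            return sum(1 for u, w in combinations(adj[v], 2)
                       if w not in adj[u])
        v = min(adj, key=fill_count)               # least fill-ins first
        for u, w in combinations(adj[v], 2):       # note the needed fill-ins
            if w not in adj[u]:
                adj[u].add(w); adj[w].add(u)
                fill_ins.append((u, w))
        for u in adj[v]:                           # eliminate v
            adj[u].discard(v)
        del adj[v]
    return fill_ins

# On the 4-cycle a-b-c-d, a single chord is added (tuple order may vary):
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
print(min_fill_triangulate(cycle))
```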

Eliminate the vertex that requires the least number of edges to be added:

step  vertex removed  induced clique  added edges
 1        h               egh             -
 2        g               ceg             -
 3        f               def             -
 4        c               ace             a-e
 5        b               abd             a-d
 6        d               ade             -
 7        e               ae              -
 8        a               a               -

[Figure: the graph shrinking from G^M toward G^T as the vertices are eliminated.]

Step 3: Junction Graph A junction graph for an undirected graph G is an undirected, labeled graph. The nodes are the cliques in G. If two cliques intersect, they are joined in the junction graph by an edge labeled with their intersection.

[Figure: Bayesian Network G = (V, E) over a-h; moral graph G^M; triangulated graph G^T; junction graph G^J (not complete), with cliques abd, ade, ace, ceg, egh, def and separators ad, ae, ce, de, eg; e.g. ceg ∩ egh = eg.]

Step 4: Junction Tree
A junction tree is a sub-graph of the junction graph that
– is a tree,
– contains all the cliques (a spanning tree),
– satisfies the running intersection property: for each pair of nodes U, V, all nodes on the path between U and V contain U ∩ V.

Running intersection?
All vertices C and sepsets S along the path between any two vertices A and B contain the intersection A ∩ B.
Ex: A = {a,b,d}, B = {a,c,e} ⇒ A ∩ B = {a}; C = {a,d,e} ⊇ {a}, S1 = {a,d} ⊇ {a}, S2 = {a,e} ⊇ {a}.
[Figure: the path A - S1 - C - S2 - B in the junction tree over abd, ade, ace, ceg, egh, def.]
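A minimal sketch of a running-intersection check, with the cliques and tree edges transcribed from the slides' junction tree; it only tests the cliques, since a separator between two adjacent cliques that both contain A ∩ B contains it as well.

```python
def has_rip(cliques, edges):
    """True if every clique on the path between any two cliques A, B
    contains A & B."""
    adj = {c: set() for c in cliques}
    for u, v in edges:
        adj[u].add(v); adj[v].add(u)

    def path(u, v, parent=None):       # unique path in a tree
        if u == v:
            return [u]
        for nb in adj[u]:
            if nb != parent:
                p = path(nb, v, u)
                if p is not None:
                    return [u] + p
        return None

    for i, a in enumerate(cliques):
        for b in cliques[i + 1:]:
            need = a & b
            if not all(need <= c for c in path(a, b)):
                return False
    return True

cliques = [frozenset(c) for c in ("abd", "ade", "ace", "ceg", "egh", "def")]
edges = [(cliques[0], cliques[1]), (cliques[1], cliques[2]),  # abd-ade, ade-ace
         (cliques[2], cliques[3]), (cliques[3], cliques[4]),  # ace-ceg, ceg-egh
         (cliques[1], cliques[5])]                            # ade-def
print(has_rip(cliques, edges))  # True
```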

A few useful theorems
Theorem: An undirected graph is triangulated if and only if its junction graph has a junction tree.
Theorem: A sub-tree of the junction graph of a triangulated graph is a junction tree if and only if it is a spanning tree of maximal weight (the weight being the sum of the numbers of variables in the domains of the links).

[Figure: from the junction graph G^J (not complete) to the junction tree G^JT with cliques abd, ade, ace, ceg, egh, def and separators ad, ae, ce, de, eg.]
There are several methods to find a maximum-weight spanning tree (MST). Kruskal's algorithm: successively choose a link of maximal weight unless it creates a cycle.
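A minimal sketch of Kruskal's algorithm adapted to a maximum-weight spanning tree of the junction graph, where an edge between two cliques is weighted by the size of their separator; the clique sets follow the slides' example.

```python
def junction_tree(cliques):
    """cliques: list of frozensets. Returns tree edges (i, j, separator)."""
    parent = list(range(len(cliques)))

    def find(i):                       # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    candidates = [(len(cliques[i] & cliques[j]), i, j)
                  for i in range(len(cliques))
                  for j in range(i + 1, len(cliques))
                  if cliques[i] & cliques[j]]
    tree = []
    # Successively choose a link of maximal weight unless it creates a cycle.
    for w, i, j in sorted(candidates, reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j, cliques[i] & cliques[j]))
    return tree

cliques = [frozenset(c) for c in ("abd", "ade", "ace", "ceg", "egh", "def")]
for i, j, sep in junction_tree(cliques):
    print(sorted(cliques[i]), sorted(cliques[j]), "sep:", sorted(sep))
```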

Colorful example
Compute the elimination cliques (the order here is f, d, e, c, b, a). Form the complete junction graph over the maximal elimination cliques and find a maximum-weight spanning tree.

Principle of Inference
DAG → Junction Tree → Initialization → Inconsistent Junction Tree → Propagation → Consistent Junction Tree → Marginalization

In the junction tree G^JT the cliques become vertices: abd, ade, ace, ceg, egh, def, joined by sepsets ad, ae, ce, de, eg. Ex: ceg ∩ egh = eg.

Potentials
DEFINITION: A potential φ_A over a set of variables X_A is a function that maps each instantiation x_A to a non-negative real number.
Ex: a potential φ_abc over the set of vertices {a, b, c}, where X_a has four states and X_b and X_c have three states each.
A joint probability is a special case of a potential, one for which Σ_{x_A} φ_A(x_A) = 1.
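A minimal sketch: a potential over {a, b, c} as a non-negative numpy array with one axis per variable, matching the slide's shapes; the actual numbers are arbitrary.

```python
import numpy as np

# X_a has four states, X_b and X_c have three each.
phi_abc = np.random.rand(4, 3, 3)    # maps each instantiation to a non-negative real

# A joint probability is a potential whose entries sum to one:
p_abc = phi_abc / phi_abc.sum()
assert np.isclose(p_abc.sum(), 1.0)
```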

The potentials in the junction tree are not consistent with each other; i.e., if we use marginalization to get the probability distribution of a variable X_u, we will get different results depending on which clique we use:
  P(X_a) = Σ_{d,e} φ_ade = (0.02, 0.43, 0.31, 0.12)
  P(X_a) = Σ_{c,e} φ_ace = (0.12, 0.33, 0.11, 0.03)
The potentials might not even sum to one, so they are not joint probability distributions.

Propagating potentials: message passing from clique A to clique B
1. Projection: project the potential of A onto the separator S_AB:
   φ_S^new = Σ_{A \ S} φ_A
2. Absorption: absorb the potential of S_AB into B:
   φ_B ← φ_B · φ_S^new / φ_S^old
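A minimal sketch of one message pass, using numpy potentials and the slides' cliques A = {a,d,e} and B = {a,c,e} with separator S = {a,e}; all variables are assumed binary here for brevity.

```python
import numpy as np

phi_A = np.random.rand(2, 2, 2)   # axes: a, d, e
phi_B = np.random.rand(2, 2, 2)   # axes: a, c, e
phi_S = np.ones((2, 2))           # axes: a, e (initialized to 1)

# 1. Projection: marginalize phi_A onto the separator S.
phi_S_new = phi_A.sum(axis=1)     # sum out d, leaving axes a, e

# 2. Absorption: B multiplies in the ratio of new to old separator potential.
ratio = phi_S_new / phi_S                 # elementwise (phi_S is positive)
phi_B = phi_B * ratio[:, None, :]         # broadcast over B's extra axis c
phi_S = phi_S_new
```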

Global Propagation
Choose a root clique. Then:
1. COLLECT-EVIDENCE: messages 1-5 are passed from the leaves toward the root.
2. DISTRIBUTE-EVIDENCE: messages 6-10 are passed from the root back out to the leaves.
[Figure: the junction tree abd, ade, ace, ceg, egh, def with the ten messages numbered.]
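A minimal sketch of this two-phase schedule, assuming the junction tree is an adjacency dict over clique ids and that send(i, j) performs the projection/absorption step of the previous slide (a hypothetical helper).

```python
def collect_evidence(tree, node, send, parent=None):
    for nb in tree[node]:
        if nb != parent:
            collect_evidence(tree, nb, send, parent=node)
            send(nb, node)              # messages flow toward the root

def distribute_evidence(tree, node, send, parent=None):
    for nb in tree[node]:
        if nb != parent:
            send(node, nb)              # messages flow away from the root
            distribute_evidence(tree, nb, send, parent=node)

def global_propagation(tree, root, send):
    collect_evidence(tree, root, send)
    distribute_evidence(tree, root, send)

# A six-clique tree like the slides' (ids 0..5) produces 5 + 5 messages:
tree = {0: [1], 1: [0, 2, 5], 2: [1, 3], 3: [2, 4], 4: [3], 5: [1]}
global_propagation(tree, 0, send=lambda i, j: print(f"message {i} -> {j}"))
```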

A priori distribution: global propagation ⇒ the potentials are consistent ⇒ marginalization gives the probability distributions of the variables.

Example: Create Join Tree
[Figure: a Bayesian network over A, B, C, D; this BN corresponds to an HMM with 2 time steps.]
Junction Tree: cliques A,B; B,C; C,D, with separators B and C.

Example: Initialization

Variable  Associated cluster  Potential function
   A             A,B          (A's CPT)
   B             A,B          (B's CPT)
   C             B,C          (C's CPT)
   D             C,D          (D's CPT)

Each cluster's potential is initialized to the product of the potential functions assigned to it; separator potentials are initialized to 1.
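A minimal sketch of this initialization for the chain junction tree above, with illustrative binary CPTs (the slide does not give the actual numbers, and a chain-structured BN is assumed here).

```python
import numpy as np

P_A = np.array([0.6, 0.4])                        # P(A)
P_B_given_A = np.array([[0.7, 0.3], [0.2, 0.8]])  # rows: A, cols: B
P_C_given_B = np.array([[0.9, 0.1], [0.5, 0.5]])  # rows: B, cols: C
P_D_given_C = np.array([[0.6, 0.4], [0.3, 0.7]])  # rows: C, cols: D

# Each cluster potential is the product of the CPTs assigned to it.
psi_AB = P_A[:, None] * P_B_given_A   # variables A and B live in cluster A,B
psi_BC = P_C_given_B                  # variable C lives in cluster B,C
psi_CD = P_D_given_C                  # variable D lives in cluster C,D
phi_B = np.ones(2)                    # separator B
phi_C = np.ones(2)                    # separator C
```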

Example: Collect Evidence
Choose an arbitrary clique, e.g. B,C, where all potential functions will be collected. Recursively call neighboring cliques for messages:
1. Call A,B:
   1. Projection onto B: φ_B^new = Σ_A ψ_AB
   2. Absorption: ψ_BC ← ψ_BC · φ_B^new / φ_B

Example: Collect Evidence (cont.) 2. Call C,D: –1. Projection: –2. Absorption: B,C A,B C,D B C

Example: Distribute Evidence
Pass messages recursively to neighboring nodes.
Pass message from B,C to A,B:
   1. Projection onto B: φ_B^new = Σ_C ψ_BC
   2. Absorption: ψ_AB ← ψ_AB · φ_B^new / φ_B

Example: Distribute Evidence (cont.)
Pass message from B,C to C,D:
   1. Projection onto C: φ_C^new = Σ_B ψ_BC
   2. Absorption: ψ_CD ← ψ_CD · φ_C^new / φ_C

Netica’s Animal Characteristics BN

Subnet 1:
* JTnode 1: An,En
* JTnode 2: An,Sh
* JTnode 3: An,Cl

Subnet 2:
* JTnode 4: Cl,Yo
* JTnode 5: Cl,Wa

Subnet 3:
* JTnode 6: Cl,Bod