Exact Inference

Inference
Basic task for inference:
– Compute a posterior distribution for some query variables given some observed evidence
– Sum out nuisance variables
In general, inference in GMs is intractable…
– Tractable in certain cases, e.g. HMMs, trees
– Approximate inference techniques: active research area… more later

Summing Out A Variable From a Factor
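The slide's worked example is not reproduced in the transcript. As a minimal sketch (assuming, for illustration only, that factors are stored as numpy arrays with one axis per variable), summing out a variable is just a sum over the corresponding axis:

    import numpy as np

    # Hypothetical factor phi(A, B) over two binary variables, one array axis per variable.
    phi_AB = np.array([[2.0, 1.0],
                       [0.5, 3.0]])

    # Sum out B: tau(A) = sum_b phi(A, b)
    tau_A = phi_AB.sum(axis=1)
    print(tau_A)   # [3.  3.5]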

Factor Product
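As above, a minimal sketch (same illustrative numpy representation, with the variable-to-axis bookkeeping done by hand):

    import numpy as np

    # Hypothetical factors phi1(A, B) and phi2(B, C), all variables binary.
    phi1 = np.array([[2.0, 1.0],
                     [0.5, 3.0]])   # axes: (A, B)
    phi2 = np.array([[1.0, 4.0],
                     [2.0, 0.5]])   # axes: (B, C)

    # Factor product psi(A, B, C) = phi1(A, B) * phi2(B, C):
    # align the shared axis B and let broadcasting multiply the rest.
    psi = phi1[:, :, None] * phi2[None, :, :]          # shape (2, 2, 2) = (A, B, C)

    # The same thing with einsum; combining product and "sum out B" is then a one-liner.
    psi2   = np.einsum('ab,bc->abc', phi1, phi2)
    tau_AC = np.einsum('ab,bc->ac',  phi1, phi2)       # product followed by summing out B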

Belief Propagation: Motivation
What if we want to compute all marginals, not just one?
Doing variable elimination for each one in turn is inefficient.
Solution: Belief Propagation
– Same idea as forward-backward for HMMs

Belief Propagation
Previously: the forward-backward algorithm
– Exactly computes posterior marginals P(h_i|V) for chain-structured graphical models (e.g. HMMs), where V are the visible variables and h_i is the hidden variable at position i
Now we will generalize this to arbitrary graphs
– Bayesian and Markov networks
– Arbitrary graph structures (not just chains)
We'll just describe the algorithms and omit derivations (the K&F book has good coverage)

BP: Initial Assumptions
Pairwise MRF:
– One factor for each variable
– One factor for each edge
Tree-structured models; models with higher-order cliques come later…
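In symbols (the slide's formula is not reproduced here; presumably the standard pairwise factorization):

    p(x) \propto \prod_{t} \psi_t(x_t) \prod_{(s,t) \in E} \psi_{s,t}(x_s, x_t)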

Belief Propagation
Pick an arbitrary node: call it the root.
Orient edges away from the root (dangle down): well-defined notion of parent and child.
Two phases to the BP algorithm:
1. Send messages up to the root (collect evidence)
2. Send messages back down from the root (distribute evidence)
This generalizes forward-backward from chains to trees.

Collect to root phase

Collect to root: Details
Bottom-up belief state:
– Probability of x_t given all the evidence at or below node t in the tree
How to compute the bottom-up belief state? From "messages" sent by t's children:
– Recursively defined based on the belief states of the children
– They summarize what the children think t should know about the evidence in their subtrees
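In symbols (standard notation, since the slide's formula is not reproduced here):

    bel^-_t(x_t) \triangleq p(x_t \mid v^-_t)

where v^-_t denotes all the evidence at or below node t.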

Computing the upward belief state
The belief state at node t is the normalized product of:
– the incoming messages from its children
– the local evidence
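That is (again filling in the standard formula):

    bel^-_t(x_t) \propto \psi_t(x_t) \prod_{c \in ch(t)} m^-_{c \to t}(x_t)

where \psi_t is the local evidence factor at t and m^-_{c \to t} are the upward messages from t's children ch(t).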

Q: how to compute upward messages?
Assume we have computed the belief states of the children. The message then converts beliefs about the child (s) into beliefs about the parent (t) by using the edge potential.
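Written out (the standard tree-BP update, since the slide's own formula is not reproduced here):

    m^-_{s \to t}(x_t) = \sum_{x_s} \psi_{s,t}(x_s, x_t) \, bel^-_s(x_s)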

Completing the Upward Pass
Continue in this way until we reach the root.
– Analogous to the forward pass in an HMM
– The probability of the evidence can be computed as a side effect
We can now pass messages down from the root.

Computing the belief state for node s
Combine the bottom-up belief for node s with a top-down message from its parent t:
– The top-down message summarizes all the information in the rest of the graph
– v_st+ is all the evidence on the upstream (root) side of the edge s - t
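In symbols (presumably):

    bel_s(x_s) \propto bel^-_s(x_s) \, m^+_{t \to s}(x_s)

where m^+_{t \to s} summarizes the upstream evidence v_st+.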

Distribute from root phase (figure: "Send to Root" / "Distribute from Root")

Computing Beliefs: Combine bottom-up beliefs with top-down messages

Q: how to compute top-down messages?
Consider the message from t to s.
Suppose t's parent is r and t's children are s and u (as in the figure).

Q: how to compute top-down messages?
We want the message to include all the information t has received, except the information that s sent it.
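One way to write this (presumably the belief-updating form intended on the slide) divides s's own contribution back out:

    m^+_{t \to s}(x_s) = \sum_{x_t} \psi_{s,t}(x_s, x_t) \, \frac{bel_t(x_t)}{m^-_{s \to t}(x_t)}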

Sum-product algorithm
Really just the same thing: rather than dividing, plug in the definition of node t's belief.
Multiply together all messages coming into t, except the message from the recipient node (s).
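This gives the standard sum-product message (filling in the formula):

    m_{t \to s}(x_s) = \sum_{x_t} \psi_t(x_t) \, \psi_{s,t}(x_s, x_t) \prod_{u \in nbr(t) \setminus \{s\}} m_{u \to t}(x_t)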

Parallel BP
So far we have described the "serial" version:
– Optimal for tree-structured GMs
– A natural extension of forward-backward
BP can also be run in parallel:
– All nodes receive messages from their neighbors in parallel
– Initialize all messages to 1
– Each node absorbs messages from all its neighbors
– Each node sends messages to each of its neighbors
On trees this converges to the correct posterior marginals.
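As an illustration only (not the lecture's code; the graph, potentials, and names below are made up), a synchronous sum-product sketch in Python/numpy on a small binary tree, checked against brute-force enumeration:

    import numpy as np
    from itertools import product

    # Toy pairwise MRF on a small tree: edges 0-1, 1-2, 1-3 (all variables binary).
    edges = [(0, 1), (1, 2), (1, 3)]
    n, K = 4, 2
    rng = np.random.default_rng(0)
    node_pot = {t: rng.uniform(0.5, 2.0, size=K) for t in range(n)}        # psi_t(x_t)
    edge_pot = {e: rng.uniform(0.5, 2.0, size=(K, K)) for e in edges}      # psi_st(x_s, x_t)

    nbrs = {t: [] for t in range(n)}
    for s, t in edges:
        nbrs[s].append(t)
        nbrs[t].append(s)

    def psi(s, t):
        """Edge potential as a (K, K) array indexed by (x_s, x_t)."""
        return edge_pot[(s, t)] if (s, t) in edge_pot else edge_pot[(t, s)].T

    # Synchronous ("parallel") updates: every directed message starts at 1 and all
    # messages are recomputed from the previous round's messages.
    msgs = {(s, t): np.ones(K) for s in range(n) for t in nbrs[s]}
    for _ in range(10):                                  # >= tree diameter is enough on a tree
        new = {}
        for (s, t) in msgs:
            others = [msgs[(u, s)] for u in nbrs[s] if u != t]
            incoming = np.prod(others, axis=0) if others else np.ones(K)
            m = psi(s, t).T @ (node_pot[s] * incoming)   # sum over x_s
            new[(s, t)] = m / m.sum()                    # normalize for numerical stability
        msgs = new

    def marginal(t):
        b = node_pot[t] * np.prod([msgs[(u, t)] for u in nbrs[t]], axis=0)
        return b / b.sum()

    # Brute-force check of the node-1 marginal.
    joint = np.zeros([K] * n)
    for x in product(range(K), repeat=n):
        p = np.prod([node_pot[t][x[t]] for t in range(n)])
        p *= np.prod([edge_pot[e][x[e[0]], x[e[1]]] for e in edges])
        joint[x] = p
    joint /= joint.sum()
    print(marginal(1))
    print(joint.sum(axis=(0, 2, 3)))                     # the two should agree on a tree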

Loopy BP
An approach to "approximate inference".
BP is only guaranteed to give the correct answer on tree-structured graphs.
But we can run it on graphs with loops, and it gives an approximate answer.
– Sometimes it doesn't converge

Generalized Distributive Law
Abstractly, VE can be thought of as computing the following expression:
– Visible variables are clamped and not summed over
– Intermediate results are cached and not re-computed
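The expression itself (not reproduced in the transcript; presumably of this form):

    p(x_q \mid x_v) \propto \sum_{x_h} \prod_{c} \psi_c(x_c)

where x_v are the clamped visible variables, x_q the query variables, x_h the remaining nuisance variables, and the \psi_c are the model's factors.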

Generalized Distributive Law
Other important task: MAP inference
– Essentially the same algorithm can be used
– Just replace sum with max (plus a traceback step)
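That is (presumably), compute

    x^*_h = \arg\max_{x_h} \prod_{c} \psi_c(x_c)

with the visible variables clamped as before; the traceback recovers the maximizing assignment.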

Generalized Distributive Law
In general, VE can be applied over any commutative semi-ring: a set K, together with two binary operations "+" and "×", which satisfy the axioms:
– "+" is associative and commutative, with an additive identity "0": k + 0 = k
– "×" is associative and commutative, with a multiplicative identity "1": k × 1 = k
– The distributive law holds: (a × b) + (a × c) = a × (b + c)

Generalized Distributive Law
Semi-ring for marginal inference (sum-product):
– "×" = multiplication
– "+" = sum
Semi-ring for MAP inference (max-product):
– "×" = multiplication
– "+" = max
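A small sketch (not from the slides; the chain, factor values, and function names are invented for illustration): the same elimination routine runs in either semi-ring if the "+" operation is passed in, with "×" being ordinary multiplication in both cases:

    import numpy as np

    def eliminate(factor, axis, plus):
        """'Sum' out one axis of a factor using the semi-ring's "+" operation."""
        return plus.reduce(factor, axis=axis)

    # Chain A - B - C with pairwise potentials (made-up numbers), all variables binary.
    psi_AB = np.array([[2.0, 1.0], [0.5, 3.0]])
    psi_BC = np.array([[1.0, 4.0], [2.0, 0.5]])

    def query_A(plus):
        # Eliminate C, then B, leaving a function of A; "x" is plain multiplication.
        tau_B  = eliminate(psi_BC, axis=1, plus=plus)      # (+) over C
        tau_AB = psi_AB * tau_B[None, :]                   # x
        return eliminate(tau_AB, axis=1, plus=plus)        # (+) over B

    # Sum-product semi-ring: unnormalized marginal of A.
    pA = query_A(np.add)
    print(pA / pA.sum())

    # Max-product semi-ring: max over B, C of the unnormalized joint, for each value of A.
    print(query_A(np.maximum))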