Exact Inference Continued


Variable Elimination in Graphical Models. For the eight-variable network V, S, T, L, A, B, X, D, the joint distribution factors according to the DAG:

P(v,…,x) = P(v,t) P(a|t,l) P(d|a,b) P(x|a) P(b|s) P(l|s) P(s)

Writing each conditional table as a potential function gives the product form

P(v,…,x) = K g1(v,t) g2(a,t,l) g3(d,a,b) g4(x,a) g5(b,s) g6(l,s),

where potentials over the same variables may be merged, e.g., g(l,b,s) = p(b|s) p(l|s).
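
As a concrete illustration (not from the slides), here is a minimal Python sketch of one variable-elimination step: multiply all factors mentioning a variable, then sum it out. The factor names, shapes, and binary domains in the example are hypothetical.

```python
import numpy as np

# A factor is a pair (vars, table): `vars` is a tuple of variable names and
# `table` an ndarray with one axis per variable, in that order.

def eliminate(factors, var):
    """Multiply every factor that mentions `var`, then sum `var` out."""
    touching = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    # Ordered union of the variables appearing in the touching factors.
    all_vars = tuple(dict.fromkeys(v for vs, _ in touching for v in vs))
    out_vars = tuple(v for v in all_vars if v != var)
    # Assign each variable an einsum subscript letter and contract.
    letter = {v: chr(ord('a') + i) for i, v in enumerate(all_vars)}
    spec = ','.join(''.join(letter[v] for v in vs) for vs, _ in touching)
    spec += '->' + ''.join(letter[v] for v in out_vars)
    return rest + [(out_vars, np.einsum(spec, *[t for _, t in touching]))]

# Hypothetical binary-valued factors g1(v,t) and g2(a,t,l); eliminating t
# leaves a single factor over (v, a, l).
g1 = (('v', 't'), np.random.rand(2, 2))
g2 = (('a', 't', 'l'), np.random.rand(2, 2, 2))
factors = eliminate([g1, g2], 't')
```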

A Graph-Theoretic View. Eliminating vertex v from an undirected graph G is the process of making N_G(v) a clique and then removing v and its incident edges from G, where N_G(v) is the set of vertices adjacent to v in G. An elimination sequence of G is an ordering of all its vertices.
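
A minimal sketch of this elimination operation, assuming the graph is stored as a dict mapping each vertex to its set of neighbors:

```python
def eliminate_vertex(adj, v):
    """Eliminate v from an undirected graph stored as {vertex: set of
    neighbors}: make N(v) a clique, then remove v and its incident edges."""
    neighbors = adj.pop(v)
    for u in neighbors:
        adj[u] |= neighbors - {u}   # connect v's neighbors into a clique
        adj[u].discard(v)           # drop the edge to v
    return neighbors                # the clique formed (handy for width tracking)
```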

Treewidth. The width w_s of an elimination sequence s is the size of the largest clique formed in the elimination process, minus 1; that is, w_s = max_v |N_G(v)|, taken over the residual graphs in which each v is eliminated. The treewidth tw of a graph G is the minimum width among all elimination sequences: tw = min_s w_s. Examples: all trees have tw = 1, all graphs with isolated cycles have tw = 2, and a clique of size n has tw = n-1.
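
Building on the eliminate_vertex sketch above, the width of a given elimination sequence can be computed directly; the 4-cycle example below is illustrative:

```python
def elimination_width(adj, order):
    """Width of an elimination sequence: the largest |N(v)| observed as each
    vertex in `order` is eliminated (uses eliminate_vertex from above)."""
    width = 0
    for v in order:
        width = max(width, len(adj[v]))
        eliminate_vertex(adj, v)
    return width

# A 4-cycle has treewidth 2: any elimination order gives width >= 2.
adj = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
print(elimination_width(adj, [1, 2, 3, 4]))   # -> 2
```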

Another Example: a 3x3 grid with vertices x1, …, x9. Order 1, "corners first": the largest clique formed has size 4, so the width is 3. Order 2, eliminating x2, x5, … first: the largest clique formed has size 5, so the width is 4.

Results about treewidth. Theorem: finding an elimination sequence of minimum width is NP-hard; more precisely, even deciding whether the treewidth equals a given value c is NP-hard. Simple greedy heuristic: at each step, eliminate a vertex v that produces the smallest clique, namely one that minimizes |N_G(v)|.

Finding a Good Elimination Order. Repeat until the graph becomes empty:
1. Compute |N_G(v)| for each variable v.
2. Choose a vertex v at random from among the k vertices with the lowest |N_G(v)| (weighting the random choice by |N_G(v)|).
3. Eliminate vertex v from the graph (make its neighbors a clique).
Run repeated random restarts of this procedure until they have used about 5% of the estimated time needed to solve the inference problem with the best elimination order found so far (estimated by the sum of the state-space sizes of all cliques formed).
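
A hedged sketch of one greedy pass, reusing eliminate_vertex from above. Note two simplifications: it picks uniformly among the k lowest-degree vertices, whereas the slide weights the random choice by |N_G(v)|, and the default k is a made-up value.

```python
import random

def greedy_order(adj, k=3):
    """One randomized greedy pass: repeatedly eliminate one of the k vertices
    with the fewest current neighbors, chosen at random."""
    order = []
    while adj:
        candidates = sorted(adj, key=lambda v: len(adj[v]))[:k]
        v = random.choice(candidates)   # uniform here; the slide weights by |N(v)|
        order.append(v)
        eliminate_vertex(adj, v)
    return order
```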

Results about treewidth. Sequence of theorems: there are several algorithms that approximate the treewidth tw to within a small constant factor α in time Poly(n) c^tw, where c is a constant and n is the number of vertices. Main idea: find a minimum vertex (A,B)-cutset S (up to some approximation factor), make S a clique, and solve G[A,S] and G[B,S] recursively. Observation: these results are "practical" if the constants α and c are low enough, because computing a posterior belief itself requires at most Poly(n) k^tw time, where k is the size of the largest variable domain.

Elimination Sequence with Weights. Cost functions for optimizing time complexity need to take the number of states of each variable into account. An elimination sequence of a weighted graph G is an ordering of its vertices, written X_α = (X_α(1), …, X_α(n)), where α is a permutation on {1,…,n}. The cost of eliminating vertex v from a graph G_i is the product of the weights of the vertices in N_Gi(v).

Elimination Sequence with Weights. The residual graph G_i is the graph obtained from G_(i-1) by eliminating vertex X_α(i-1) (with G_1 ≡ G). The cost of an elimination sequence X_α is the sum of the costs of eliminating X_α(i) from G_i, over all i.
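
Combining the two definitions, here is a sketch that evaluates the cost of a given weighted elimination sequence (again reusing eliminate_vertex from above; `weights` maps each vertex to its number of states):

```python
def sequence_cost(adj, weights, order):
    """Total cost of a weighted elimination sequence: for each vertex, the
    product of its neighbors' weights in the current residual graph, summed
    over the whole sequence."""
    total = 0
    for v in order:
        cost = 1
        for u in adj[v]:
            cost *= weights[u]      # product of neighbor weights
        total += cost
        eliminate_vertex(adj, v)    # advance to the next residual graph
    return total
```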

Example. The original Bayes network over V, S, T, L, A, B, D, X and its undirected representation, called the moral graph, which is an I-map of the original. Weights of vertices (number of states): yellow nodes w = 2, blue nodes w = 4.

Example. Suppose the elimination sequence is X_α = (V, B, S, …). Starting from the moral graph G_1, eliminating V yields G_2 over {S, T, L, A, B, D, X}, and eliminating B then yields G_3 over {S, T, L, A, D, X}.

Finding a Good Elimination Order. An optimal elimination sequence is one with minimal cost; finding it is NP-complete. The same greedy procedure applies with weighted costs (a weighted variant of the earlier sketch appears below). Repeat until the graph becomes empty:
1. Compute the elimination cost of each variable in the current graph.
2. Choose a vertex v at random from among the k lowest-cost vertices (weighting the random choice by their current elimination costs).
3. Eliminate vertex v from the graph (make its neighbors a clique).
Run repeated random restarts of this procedure until they have used about 5% of the estimated time needed to solve the inference problem with the best elimination order found so far (estimated by the sum of the state-space sizes of all cliques formed).
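
A weighted variant of the greedy_order sketch, with the same caveats (uniform rather than cost-weighted random choice; reuses eliminate_vertex and the random import from above):

```python
def greedy_weighted_order(adj, weights, k=3):
    """Weighted variant of greedy_order: rank vertices by the product of
    their neighbors' weights rather than by degree."""
    def cost(v):
        c = 1
        for u in adj[v]:
            c *= weights[u]
        return c
    order = []
    while adj:
        candidates = sorted(adj, key=cost)[:k]
        v = random.choice(candidates)   # uniform; the slide weights by cost
        order.append(v)
        eliminate_vertex(adj, v)
    return order
```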

Global Conditioning. Fixing the values of A and B transforms the network; the result is an I-map of P(a, b, C, D, …) for the fixed values of A and B. Fixing values at the beginning of the summation can decrease the size of the tables formed by variable elimination; in this way, space is traded for time. Special case: choose to fix a set of nodes that "breaks all loops". This method is called cutset conditioning.
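
Schematically, conditioning enumerates assignments to the fixed set and combines the resulting cheap inference runs. In this sketch `infer_given` is a hypothetical callback that runs tree inference with the cutset clamped to one assignment and returns an unnormalized marginal:

```python
from itertools import product

def conditioned_sum(cutset_domains, infer_given):
    """Sum a marginal over all joint assignments to the cutset variables.
    `cutset_domains` maps each cutset variable to its domain of values."""
    names = list(cutset_domains)
    total = None
    for values in product(*(cutset_domains[n] for n in names)):
        partial = infer_given(dict(zip(names, values)))  # tree inference, clamped
        total = partial if total is None else total + partial
    return total
```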

Cutset Conditioning. Fixing the values of A, B, and L breaks all loops, and we are left with solving a tree. But can we choose fewer variables to break all loops? Are some variables better choices than others? This optimization question translates to the well-known weighted vertex feedback set (WVFS) problem: choose a set of variables of least weight that lies on every cycle of a given weighted undirected graph G.

Optimization. The weight w of a node v is defined by w(v) = log(|Dom(v)|). The problem is to minimize the sum of w(v) over all v in the selected cutset. Solution idea (a factor-2 approximation): remove a vertex v with minimum w(v)/d(v); update each neighboring weight to w(u) - w(u)/d(u); repeat until all cycles are gone; then make the chosen set minimal.
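
A rough sketch of this heuristic, under two loudly-flagged assumptions: the weight-update rule is taken verbatim from the slide (it may differ from the published factor-2 algorithm), and the final "make the set minimal" pass is omitted.

```python
def has_cycle(adj):
    """True iff a cycle remains: repeatedly strip vertices of degree <= 1;
    any vertex left over lies on a cycle."""
    g = {v: set(ns) for v, ns in adj.items()}
    stack = [v for v in g if len(g[v]) <= 1]
    while stack:
        v = stack.pop()
        if v not in g:
            continue
        for u in g.pop(v):
            g[u].discard(v)
            if len(g[u]) <= 1:
                stack.append(u)
    return bool(g)

def greedy_wvfs(adj, w):
    """Greedy WVFS sketch: repeatedly move the vertex minimizing w(v)/d(v)
    into the cutset and delete it (plain removal, no clique step)."""
    w = dict(w)
    cutset = []
    while has_cycle(adj):
        live = [v for v in adj if adj[v]]
        v = min(live, key=lambda x: w[x] / len(adj[x]))
        for u in adj[v]:
            w[u] -= w[u] / len(adj[u])     # the slide's weight-update rule
        for u in adj.pop(v):
            adj[u].discard(v)
        cutset.append(v)
    return cutset
```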

Summary. Variable elimination: find an order that minimizes the width; the optimal width is the treewidth. The complexity of inference grows exponentially in tw. Treewidth is smallest in trees and largest in cliques. Cutset conditioning: find a set of variables that minimizes the cutset size/weight. The complexity of inference grows exponentially in the cutset size. The cutset is smallest in trees and largest in cliques. Example: small loops connected in a chain. Inference is exponential using the second method but polynomial using the first.

Extra Slides (if time allows)

Local Conditional Table: Noisy Or-Gate Model

Belief Update in Poly-Trees

Approximate Inference: loopy belief propagation, Gibbs sampling, bounded conditioning, likelihood weighting, variational methods.

Loopy Belief Propagation in DAGs: iterate the messages as if the graph were a tree. This is WRONG (the graph is not a tree).