Tree Decomposition methods Chapter 9 ICS-275 Spring 2009.

Tree Decomposition methods Chapter 9 ICS-275 Spring 2009

Graph concept review: hypergraphs and dual graphs. A hypergraph is H = (V,S), where V = {v_1,...,v_n} is a set of vertices and S = {S_1,...,S_l} is a set of subsets of V called hyperedges. The dual graph of a hypergraph: the nodes are the hyperedges, and a pair of nodes is connected if the corresponding hyperedges share vertices in V; the arc is labeled by the shared vertices. The primal graph of a hypergraph H = (V,S) has V as its nodes, and any two nodes are connected by an arc if they appear together in some hyperedge. If all the constraints of a network R are binary, then its hypergraph is identical to its primal graph.

Acyclic networks. The running intersection property (connectedness): an arc can be removed from the dual graph if the variables labeling it are shared along an alternative path between its two endpoints. Join graph: an arc subgraph of the dual graph that satisfies the connectedness property. Join-tree: a join graph with no cycles. Hypertree: a hypergraph whose dual graph has a join-tree. Acyclic network: one whose hypergraph is a hypertree.

Solving acyclic networks. Algorithm acyclic-solving applies a tree algorithm to the join-tree: it applies (a little more than) directional relational arc-consistency from leaves to root. Complexity: acyclic-solving takes O(r l log l) steps, where r is the number of constraints and l bounds the number of tuples in each constraint relation (this assumes that the join of two relations, when one's scope is a subset of the other's, can be done in linear time).
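The core operation here is the join-then-project step pi_parent(parent join child). A minimal sketch in Python, assuming relations are stored as (scope, tuples) pairs; the helper names project and semi_join are illustrative, not from the slides:

    def project(scope, tuples, target):
        # Project a set of tuples over `scope` onto the variables in `target`.
        idx = [scope.index(v) for v in target]
        return {tuple(t[i] for i in idx) for t in tuples}

    def semi_join(parent, child):
        # Keep only the parent tuples that agree with some child tuple on
        # the shared variables: this realizes pi_parent(parent join child).
        p_scope, p_tuples = parent
        c_scope, c_tuples = child
        shared = [v for v in p_scope if v in c_scope]
        child_proj = project(c_scope, c_tuples, shared)
        keep = {t for t in p_tuples
                if tuple(t[p_scope.index(v)] for v in shared) in child_proj}
        return (p_scope, keep)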

Example. The constraints are R_{ABC} = R_{AEF} = R_{CDE} = {(0,0,1), (0,1,0), (1,0,0)} and R_{ACE} = {(1,1,0), (0,1,1), (1,0,1)}, with ordering d = (R_{ACE}, R_{CDE}, R_{AEF}, R_{ABC}). When processing R_{ABC}, its parent relation is R_{ACE}; we generate \pi_{ACE}(R_{ACE} ⋈ R_{ABC}), yielding R_{ACE} = {(0,1,1), (1,0,1)}. Processing R_{AEF}, we generate R_{ACE} = \pi_{ACE}(R_{ACE} ⋈ R_{AEF}) = {(0,1,1)}. Processing R_{CDE}, we generate R_{ACE} = \pi_{ACE}(R_{ACE} ⋈ R_{CDE}) = {(0,1,1)}. A solution is then generated by picking the only allowed tuple for R_{ACE}, A=0, C=1, E=1, extending it with a value for D that satisfies R_{CDE} (only D=0), and then similarly extending the assignment with F=0 and B=0 to satisfy R_{AEF} and R_{ABC}.
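The first two steps of this trace can be replayed with the sketch above (the remaining step follows the same pattern):

    R_ABC = (("A", "B", "C"), {(0, 0, 1), (0, 1, 0), (1, 0, 0)})
    R_AEF = (("A", "E", "F"), {(0, 0, 1), (0, 1, 0), (1, 0, 0)})
    R_ACE = (("A", "C", "E"), {(1, 1, 0), (0, 1, 1), (1, 0, 1)})

    R_ACE = semi_join(R_ACE, R_ABC)   # tuples become {(0,1,1), (1,0,1)}
    R_ACE = semi_join(R_ACE, R_AEF)   # tuples become {(0,1,1)}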

Recognizing acyclic networks. Dual-based recognition: compute a maximal spanning tree of the dual graph (weighting each arc by the number of shared variables) and check connectedness of the resulting tree; dual-acyclicity recognition is O(e^3). Primal-based recognition: Theorem (Maier 83): a hypergraph has a join-tree iff its primal graph is chordal and conformal. A chordal primal graph is conformal relative to a constraint hypergraph iff there is a one-to-one mapping between maximal cliques and scopes of constraints.

Primal-based recognition: check chordality using a max-cardinality ordering; test conformality; create a join-tree by connecting every clique to an earlier clique sharing a maximal number of variables.
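A minimal sketch of the chordality step, assuming the primal graph is given as a dict mapping each vertex to its set of neighbors; the helper names are illustrative:

    def max_cardinality_order(adj):
        # Pick, at each step, the vertex with the most already-placed neighbors.
        order, placed = [], set()
        while len(order) < len(adj):
            v = max((u for u in adj if u not in placed),
                    key=lambda u: len(adj[u] & placed))
            order.append(v)
            placed.add(v)
        return order

    def is_chordal(adj):
        # The graph is chordal iff, along a max-cardinality ordering, the
        # earlier neighbors (parents) of every vertex form a clique.
        order = max_cardinality_order(adj)
        pos = {v: i for i, v in enumerate(order)}
        for v in order:
            earlier = [u for u in adj[v] if pos[u] < pos[v]]
            if any(b not in adj[a] for a in earlier for b in earlier if a != b):
                return False
        return True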

Tree-based clustering. Convert a constraint problem to an acyclic one: group subsets of constraints into clusters until we get an acyclic problem. A hypertree embedding of a hypergraph H = (X,H) is a hypertree S = (X,S) such that for every h in H there is an h_1 in S with h included in h_1. This yields the join-tree clustering algorithm. A hypertree partitioning of a hypergraph H = (X,H) is a hypertree S = (X,S) such that for every h in H there is an h_1 in S with h included in h_1, and X is the union of the scopes in h_1.

Join-tree clustering. Input: a constraint problem R = (X,D,C) and its primal graph G = (X,E). Output: an equivalent acyclic constraint problem and its join-tree T = (X,D,C').
1. Select an ordering d = (x_1,...,x_n).
2. Triangulation (create the induced graph along d and call it G*): for j = n to 1 by -1 do: E ← E ∪ {(i,k) | (i,j) ∈ E, (k,j) ∈ E}.
3. Create a join-tree of the induced graph G*: (a) identify all maximal cliques (each variable and its parents form a clique); let C_1,...,C_t be all such cliques; (b) create a tree structure T over the cliques: connect each C_i to a C_j (j < i) with which it shares the largest subset of variables.
4. Place each input constraint in one clique containing its scope, and let P_i be the constraint subproblem associated with C_i.
5. Solve each P_i and let R'_i be its set of solutions.
6. Return C' = {R'_1,...,R'_t}, the new set of constraints, and their join-tree T.
Theorem: join-tree clustering transforms a constraint network into an acyclic network.
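Step 2 is easy to state operationally. A minimal sketch, assuming `adj` maps each variable to its neighbor set and `order` is the chosen ordering d (`induced_graph` is an illustrative name):

    def induced_graph(adj, order):
        # Process variables from last to first, connecting all earlier
        # neighbors (parents) of each variable: the fill edges
        # E <- E U {(i,k) | (i,j) in E, (k,j) in E}.
        adj = {v: set(ns) for v, ns in adj.items()}   # copy; fill edges are added
        pos = {v: i for i, v in enumerate(order)}
        for j in reversed(order):
            parents = [u for u in adj[j] if pos[u] < pos[j]]
            for a in parents:
                for b in parents:
                    if a != b:
                        adj[a].add(b)
        return adj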

Example of tree-clustering

Complexity of JTC. Theorem 5 (complexity of JTC): join-tree clustering is O(r · k^(w*(d)+1)) time and O(n · k^(w*(d)+1)) space, where k is the maximum domain size and w*(d) is the induced width of the ordered graph. The complexity of acyclic-solving is O(n · w*(d) · (log k) · k^(w*(d)+1)).

Unifying tree-decompositions. Let R = (X,D,C) be a CSP problem. A tree decomposition for R is a triple <T, χ, ψ>, where T = (V,E) is a tree, χ associates a set of variables χ(v) ⊆ X with each node v, and ψ associates a set of functions ψ(v) ⊆ C with each node v, such that: 1. For each R_i ∈ C, there is exactly one v such that R_i ∈ ψ(v) and scope(R_i) ⊆ χ(v). 2. For each x ∈ X, the set {v ∈ V | x ∈ χ(v)} induces a connected subtree.

Hypertree decomposition. Let R = (X,D,C) be a CSP problem. A tree decomposition is a triple <T, χ, ψ>, where T = (V,E) is a tree, χ associates a set of variables χ(v) ⊆ X with each node, and ψ associates a set of functions ψ(v) ⊆ C with each node, such that: 1. For each R_i ∈ C, there is exactly one v such that R_i ∈ ψ(v) and scope(R_i) ⊆ χ(v). 1a. For each v, χ(v) ⊆ scope(ψ(v)). 2. For each x ∈ X, the set {v ∈ V | x ∈ χ(v)} induces a connected subtree. Parameters: w (tree-width) = max_{v∈V} |χ(v)|; hw (hypertree width) = max_{v∈V} |ψ(v)|; sep (maximum separator size) = max_{(u,v)∈E} |sep(u,v)|.
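These three parameters are straightforward to compute from a given decomposition. A minimal sketch, assuming `chi` and `psi` map each node to its variable set and function set, and `edges` lists the tree edges (names are illustrative; note the slide's convention w = max |χ(v)|, without the "-1" used elsewhere):

    def widths(chi, psi, edges):
        # w, hw, and sep as defined on this slide; sep(u,v) = chi(u) & chi(v).
        w = max(len(chi[v]) for v in chi)
        hw = max(len(psi[v]) for v in psi)
        sep = max(len(chi[u] & chi[v]) for (u, v) in edges)
        return w, hw, sep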

Example of the two join-trees again: a hypertree decomposition versus a tree decomposition.

Cluster Tree Elimination. Cluster Tree Elimination (CTE) works by passing messages along a tree-decomposition. Basic idea: each node sends one message to each of its neighbors; node u sends a message to its neighbor v only when u has received messages from all its other neighbors.
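This schedule can be expressed directly. A minimal sketch, assuming `tree` maps each node to its set of neighbors and `compute_message` stands in for the join-and-project step defined on the following slides:

    def cte_schedule(tree, compute_message):
        # messages[(u, v)] holds the message sent from u to v.
        messages = {}

        def ready(u, v):
            # u may send to v once every other neighbor of u has sent to u.
            return all((w, u) in messages for w in tree[u] if w != v)

        pending = {(u, v) for u in tree for v in tree[u]}
        while pending:
            u, v = next(e for e in pending if ready(*e))  # leaves fire first
            incoming = [messages[(w, u)] for w in tree[u] if w != v]
            messages[(u, v)] = compute_message(u, v, incoming)
            pending.remove((u, v))
        return messages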

Constraint propagation (figure): node u sends the message m(u,v) to its neighbor v after receiving messages from its other neighbors x_1, x_2, ..., x_n.

Example of CTE message propagation

Join-Tree Decomposition (Dechter and Pearl 1989). Each function is placed in a cluster, and the clusters satisfy the running intersection property. Tree-width = (number of variables in a cluster) - 1, which equals the induced width. Example clusters for the graph over A,...,G: {A,B,C} with R(a), R(b,a), R(c,a,b); {B,C,D,F} with R(d,b), R(f,c,d); {B,E,F} with R(e,b,f); {E,F,G} with R(g,e,f); separators BC, BF, EF.

Cluster Tree Elimination: join, then project. In the example, cluster 2 = {B,C,D,F} holds R(d,b), R(f,c,d) and the incoming message h_(1,2)(b,c); with sep(2,3) = {B,F} and elim(2,3) = {C,D}, it sends h_(2,3)(b,f) = π_{BF}(R(d,b) ⋈ R(f,c,d) ⋈ h_(1,2)(b,c)) to cluster 3 = {B,E,F}, which holds R(e,b,f); cluster 4 = {E,F,G} holds R(g,e,f).
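A minimal sketch of this join-then-project message, reusing `project` from the acyclic-solving sketch; `natural_join` and `message` are illustrative names:

    def natural_join(r1, r2):
        # Natural join of two relations given as (scope, tuples) pairs.
        s1, t1 = r1
        s2, t2 = r2
        shared = [v for v in s1 if v in s2]
        extra = [v for v in s2 if v not in s1]
        out = set()
        for a in t1:
            for b in t2:
                if all(a[s1.index(v)] == b[s2.index(v)] for v in shared):
                    out.add(a + tuple(b[s2.index(v)] for v in extra))
        return (tuple(s1) + tuple(extra), out)

    def message(relations, separator):
        # h = pi_separator( join of all cluster relations and incoming messages ).
        joined = relations[0]
        for r in relations[1:]:
            joined = natural_join(joined, r)
        scope, tuples = joined
        return (tuple(separator), project(scope, tuples, separator))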

CTE: Cluster Tree Elimination. Time: O(exp(w*+1)); space: O(exp(sep)). With a hypertree decomposition the time bound is O(exp(hw)) (Gottlob et al., 2000). (Figure: the join-tree ABC - BCDF - BEF - EFG with separators BC, BF, EF.)

Decomposable subproblems. DEFINITION 8 (decomposable subproblem): given a constraint problem R = (X,D,C) and a subset of variables Y ⊆ X, a subproblem over Y, R_Y = (Y, D_Y, C_Y), is decomposable relative to R iff sol(R_Y) = π_Y(sol(R)), where sol(R) is the set of all solutions of the network R.

Implementation tradeoffs. The particular implementation of equation (1) in CTE can vary. One option is to generate the combined relation ⋈_{R_i ∈ cluster_v(u)} R_i before sending messages to neighbor v. The other option, which we assume here, is that the message sent to each neighbor is created without recording the relation ⋈_{R_i ∈ cluster_v(u)} R_i; rather, each tuple in the join is projected on the separator immediately after being created. This yields better memory utilization. Furthermore, when u sends a message to v, its cluster may contain the message it received from v. Thus, in synchronized message passing we can allow a single enumeration of the tuples in cluster(u) when the messages are sent back towards the leaves, each of which is projected in parallel on the separators of the outgoing messages. The output of CTE is the original tree-decomposition where each cluster is augmented with the messages it received.
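A minimal sketch of the memory-saving option described above, assuming a helper `enumerate_join_tuples` that yields the tuples of the full join one at a time (an assumed interface, not from the slides):

    def message_streaming(enumerate_join_tuples, scope, separator):
        # Project each joined tuple on the separator as soon as it is
        # created, so the full combined relation is never stored.
        idx = [scope.index(v) for v in separator]
        out = set()
        for t in enumerate_join_tuples():
            out.add(tuple(t[i] for i in idx))
        return out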

Cluster Tree Elimination - properties. Correctness and completeness: algorithm CTE is correct, i.e., it computes the exact joint probability of every single variable and the evidence. Time complexity: O(deg · (n+N) · d^(w*+1)). Space complexity: O(N · d^sep), where deg = the maximum degree of a node, n = number of variables (= number of CPTs), N = number of nodes in the tree decomposition, d = the maximum domain size of a variable, w* = the induced width, and sep = the separator size. Time and space can be bounded by the hypertree width only for a hypertree decomposition: O(N · t^hw) time and space, where t bounds the relation size.

Cluster Tree Elimination - properties. Correctness and completeness: algorithm CTE is sound and complete for generating minimal subproblems over χ(v) for every v, i.e., the solution set of each subproblem is minimal. Time complexity: O(deg · (r+N) · k^(w*+1)). Space complexity: O(N · k^sep), where deg = the maximum degree of a node, r = number of constraint relations, N = number of nodes in the tree decomposition, k = the maximum domain size of a variable, w* = the induced width, and sep = the separator size. For comparison, JTC is O(r · k^(w*+1)) time and space.

Cluster-Tree Elimination (CTE). Node u, with variables χ(u) and functions ψ(u), receives the messages m(v_1,u), m(v_2,u), ..., m(v_i,u) from its other neighbors and sends to neighbor w: m(u,w) = π_{sep(u,w)} [ (⋈_j m(v_j,u)) ⋈ (⋈ ψ(u)) ].

Distributed relational arc-consistency example (dual join-graph with nodes A, AB, AC, ABD, BCF, DFG over the constraint graph on A, B, C, D, F, G). The message that R2 sends to R1 is shown in the figure; R1 updates its relation and domains and sends messages to its neighbors.

Distributed Arc-Consistency. DR-AC can be applied to the dual problem of any constraint network. (Figure b: the constraint network and its dual graph with nodes A, AB, AC, ABD, BCF, DFG.)

DR-AC on a dual join-graph: iterations 1 through 5 (animation frames; at each iteration messages are exchanged and the labeled domains and relations shrink, e.g., the domain of A is reduced from {1,3} to {1}).

Join-tree clustering is a restricted tree-decomposition: example.

Adaptive-consistency as tree-decomposition. Adaptive consistency is message-passing along a bucket-tree. Bucket trees: each bucket is a node, and it is connected to the bucket to which its message is sent. The variables of a node are a clique of the triangulated graph; the functions are those placed in the bucket by the initial partition.
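A minimal sketch of that initial partition, assuming `order` lists the variables and each constraint is a (scope, relation) pair; each constraint goes into the bucket of its latest variable in the ordering (`partition_into_buckets` is an illustrative name):

    def partition_into_buckets(constraints, order):
        # Standard bucket-elimination partition: a constraint lands in the
        # bucket of the variable of its scope that appears last in `order`.
        pos = {v: i for i, v in enumerate(order)}
        buckets = {v: [] for v in order}
        for scope, rel in constraints:
            latest = max(scope, key=lambda v: pos[v])
            buckets[latest].append((scope, rel))
        return buckets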

Bucket Elimination / Adaptive Consistency (Dechter and Pearl, 1987). (Figure: the buckets along the ordering, each holding its original relations, e.g., R_{DCB} and R_{ACB}, plus the recorded messages passed down the ordering, e.g., R^D_{BE}.)

From bucket-elimination to bucket-tree propagation

The bottom-up messages

Adaptive-Tree-Consistency as tree-decomposition. Adaptive consistency is message-passing along a bucket-tree. Bucket trees: each bucket is a node, and it is connected to the bucket to which its message is sent. Theorem: a bucket-tree is a tree-decomposition. Therefore, CTE adds a bottom-up message pass to bucket-elimination. The complexity of ATC is O(r · deg · k^(w*+1)).

Join-Graphs and Iterative Join-Graph Propagation (IJGP). (Figure: a sequence of join-graphs for the same problem, ranging from the dual join-graph to a join-tree; moving toward the join-tree gives more accuracy, moving toward the dual graph gives less complexity.)