1
Iterative Join Graph Propagation
Rina Dechter, Kalev Kask, Robert Mateescu
UAI 2002, University of California, Irvine

I am going to talk about Iterative Join Graph Propagation, which is work done by Rina Dechter, Kalev Kask and myself.
2
What is IJGP?
IJGP is an approximate algorithm for belief updating in Bayesian networks
IJGP is a version of join-tree clustering which is both anytime and iterative
IJGP applies message passing along a join-graph, rather than a join-tree
Empirical evaluation shows that IJGP is almost always superior to other approximate schemes (IBP, MC)

Probabilistic reasoning using Bayesian networks is known to be a hard problem. Although exact algorithms are available for solving this task, they are most of the time infeasible in practice, due to the large size and connectivity of real-life networks. Approximate schemes are therefore of crucial importance. We propose a new algorithm, called IJGP (Iterative Join Graph Propagation), which applies message passing along a join-graph. It is both anytime and iterative, providing the user adjustable tradeoffs between accuracy and time. The empirical evaluation we conducted was extremely encouraging, showing that IJGP was better than other existing approximate algorithms on several classes of problems.
3
Outline
IBP - Iterative Belief Propagation
MC - Mini-Clustering
IJGP - Iterative Join Graph Propagation
Empirical evaluation
Conclusion

Our new algorithm, IJGP, builds on two existing approximate schemes: IBP, which is an iterative algorithm, and Mini-Clustering, a recently introduced anytime algorithm. We will quickly review the main characteristics of these two algorithms and then present IJGP.
4
Iterative Belief Propagation - IBP
Belief propagation is exact for poly-trees
IBP - applying BP iteratively to cyclic networks
No guarantees for convergence
Works well for many coding networks
[Figure: a two-layer network with parents U1, U2, U3 and children X1, X2; one step updates BEL(U1)]
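To make the one-step update concrete, here is a minimal sketch of how a child's lambda message refreshes BEL(U1) in Pearl's belief propagation, following the shape of the slide's figure; the CPT and message values are hypothetical placeholders, and only X1's contribution is shown.

```python
import numpy as np

# Minimal sketch: one-step update of BEL(U1) from child X1 (parents U1, U2).
# All tables below are hypothetical placeholders.
p_u1 = np.array([0.6, 0.4])                 # prior P(U1)
p_x1_given_u1u2 = np.random.dirichlet(      # CPT P(X1 | U1, U2),
    np.ones(2), size=(2, 2))                # shape (u1, u2, x1)
pi_u2_to_x1 = np.array([0.5, 0.5])          # causal (pi) message U2 -> X1
lam_from_x1 = np.array([0.7, 0.3])          # diagnostic evidence lambda(X1)

# lambda message X1 -> U1: sum out X1 and the other parent U2
lam_x1_to_u1 = np.einsum('x,uvx,v->u', lam_from_x1, p_x1_given_u1u2, pi_u2_to_x1)

# one-step belief update: BEL(U1) proportional to P(U1) * lambda_{X1->U1}(U1)
bel_u1 = p_u1 * lam_x1_to_u1
bel_u1 /= bel_u1.sum()
print(bel_u1)
```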
5
Mini-Clustering - MC
[Figure: Cluster Tree Elimination vs. Mini-Clustering with i=3 on a join-tree with clusters 1: ABC {p(a), p(b|a), p(c|a,b)}, 2: BCDF {p(d|b), h(1,2)(b,c), p(f|c,d)}, 3: BEF {p(e|b,f)}, 4: EFG {p(g|e,f)} and separators BC, BF, EF. With sep(2,3) = {B,F} and elim(2,3) = {C,D}, cluster 2 is partitioned into mini-clusters {p(d|b), h(1,2)(b,c)} and {p(f|c,d)}]
6
CTE vs. MC
[Figure: the same join-tree (clusters 1: ABC, 2: BCDF, 3: BEF, 4: EFG, separators BC, BF, EF) processed by Cluster Tree Elimination and by Mini-Clustering with i=3]
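The heart of MC is the partitioning step: a cluster's functions are split into mini-clusters whose combined scopes respect the i-bound, and each mini-cluster is processed separately. Below is a minimal greedy sketch of that step (the helper name and its first-fit heuristic are assumptions for illustration, not the paper's exact procedure); on cluster 2 from the figure above it reproduces the split shown.

```python
# Greedy first-fit partitioning of a cluster's functions into mini-clusters
# whose combined scope has at most i_bound variables (illustrative sketch).
def partition_into_miniclusters(functions, i_bound):
    """functions: list of (name, scope) pairs, scope a set of variables."""
    miniclusters = []                 # each entry: [combined scope, [names]]
    for name, scope in functions:
        for mc in miniclusters:
            if len(mc[0] | scope) <= i_bound:   # fits within the i-bound
                mc[0] |= scope
                mc[1].append(name)
                break
        else:
            miniclusters.append([set(scope), [name]])
    return miniclusters

# cluster 2 from the figure, with i = 3
funcs = [('p(d|b)', {'B', 'D'}),
         ('h(1,2)(b,c)', {'B', 'C'}),
         ('p(f|c,d)', {'C', 'D', 'F'})]
for scope, names in partition_into_miniclusters(funcs, i_bound=3):
    print(sorted(scope), names)   # {B,C,D}: p(d|b), h(1,2)(b,c); {C,D,F}: p(f|c,d)
```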
7
IJGP - Motivation
IBP is applied to a loopy network iteratively
 not an anytime algorithm
 when it converges, it converges very fast
MC applies bounded inference along a tree decomposition
 MC is an anytime algorithm controlled by i-bound
IJGP combines:
 the iterative feature of IBP
 the anytime feature of MC
8
IJGP - The basic idea
Apply Cluster Tree Elimination to any join-graph
We commit to graphs that are minimal I-maps
Avoid cycles as long as I-mapness is not violated
Result: use minimal arc-labeled join-graphs
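To make "CTE applied to a join-graph" concrete, here is a small self-contained sketch of the message-passing loop, an illustrative re-implementation rather than the authors' code, under simplifying assumptions: all variables are binary and variable names are single letters so they can serve directly as einsum subscripts.

```python
import numpy as np

def send_message(factors, label):
    # multiply the (vars, table) factors, then sum out everything not in label
    lhs = ','.join(''.join(vs) for vs, _ in factors)
    rhs = ''.join(label)
    return label, np.einsum(lhs + '->' + rhs, *[t for _, t in factors])

def ijgp(clusters, labels, n_iter=10):
    """clusters: {node: [(vars, table), ...]}; labels: {(u, v): arc label},
    one entry per direction. Returns the directed messages."""
    msgs = {}
    for _ in range(n_iter):
        for (u, v), sep in labels.items():
            # combine u's own functions with messages from all neighbors
            # except v, then marginalize onto the arc label (separator)
            incoming = [m for (w, x), m in msgs.items() if x == u and w != v]
            msgs[(u, v)] = send_message(clusters[u] + incoming, sep)
    return msgs

def belief(clusters, msgs, node, var):
    incoming = [m for (w, x), m in msgs.items() if x == node]
    _, t = send_message(clusters[node] + incoming, (var,))
    return t / t.sum()

# tiny two-cluster example (here the join-graph is in fact a tree,
# so the answer is exact): clusters {A,B} and {B,C}, separator {B}
rng = np.random.default_rng(0)
pA   = (('a',), np.array([0.6, 0.4]))                    # P(A)
pBgA = (('a', 'b'), rng.dirichlet(np.ones(2), size=2))   # P(B|A)
pCgB = (('b', 'c'), rng.dirichlet(np.ones(2), size=2))   # P(C|B)
clusters = {1: [pA, pBgA], 2: [pCgB]}
labels = {(1, 2): ('b',), (2, 1): ('b',)}
msgs = ijgp(clusters, labels, n_iter=2)
print(belief(clusters, msgs, 2, 'c'))                    # marginal P(C)
```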
9
IJGP - Example
[Figure: a) a belief network over variables A-J; b) the join-graph IBP works on, with clusters ABC, ABDE, BCE, CDEF, FGH, FGI, GHIJ and arc labels such as AB, BC, BE, C, DE, CE, F, FG, GH, GI, H]
10
Arc-minimal join-graph
[Figure: the join-graph from the previous slide before and after removing redundant arcs to make it arc-minimal; clusters ABC, ABDE, BCE, CDEF, FGH, FGI, GHIJ]
11
Minimal arc-labeled join-graph
[Figure: the arc-minimal join-graph before and after shrinking arc labels to obtain a minimal arc-labeled join-graph]
12
Join-graph decompositions
[Figure: a) minimal arc-labeled join-graph; b) join-graph obtained by collapsing nodes of graph a), with the larger cluster ABCDE; c) minimal arc-labeled version of graph b)]
13
Tree decomposition
[Figure: a) minimal arc-labeled join-graph; b) tree decomposition with clusters ABCDE, CDEF, FGHI, GHIJ]
14
Join-graphs
[Figure: a sequence of join-graphs for the same network, from the join-graph IBP works on (less complexity) toward a tree decomposition (more accuracy)]
15
Minimal arc-labeled decomposition
[Figure: a) fragment of an arc-labeled join-graph; b) shrinking labels to make it a minimal arc-labeled join-graph]
Use a DFS algorithm to eliminate cycles relative to each variable
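A sketch of that label-shrinking step; it uses union-find in place of the DFS named on the slide (the helper is hypothetical, but the effect is the same): for each variable, the variable is kept only on arcs that do not close a cycle among the arcs already carrying it.

```python
def minimize_arc_labels(arcs, labels):
    """arcs: list of (u, v); labels: {(u, v): set of variables}, mutated
    in place so the arcs carrying each variable form a forest."""
    variables = set().union(*labels.values())
    for var in sorted(variables):
        parent = {}
        def find(x):                         # union-find with path halving
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        for (u, v) in arcs:
            if var in labels[(u, v)]:
                ru, rv = find(u), find(v)
                if ru == rv:
                    labels[(u, v)].discard(var)   # would close a cycle: drop var
                else:
                    parent[ru] = rv
    return labels

# triangle of clusters all carrying variable C: one label loses C
arcs = [(1, 2), (2, 3), (1, 3)]
labels = {(1, 2): {'C', 'D'}, (2, 3): {'C'}, (1, 3): {'C', 'E'}}
minimize_arc_labels(arcs, labels)
print(labels)   # C removed from exactly one arc of the cycle
```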
16
Message propagation
[Figure: cluster 1 (ABCDE), holding p(a), p(c), p(b|a,c), p(d|a,b,e), p(e|b,c) and the incoming message h(3,1)(b,c), sends message h(1,2) to cluster 2 (CDEF)]
Minimal arc-labeled: sep(1,2) = {D,E}, elim(1,2) = {A,B,C}
Non-minimal arc-labeled: sep(1,2) = {C,D,E}, elim(1,2) = {A,B}
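The arc label directly fixes what a message keeps (sep) and what it eliminates (elim); a tiny sketch using the cluster scope from this example:

```python
# sep/elim sets for the message from cluster 1 (scope ABCDE) to cluster 2,
# under the minimal and the non-minimal arc label from the slide
cluster_1 = {'A', 'B', 'C', 'D', 'E'}
for sep in ({'D', 'E'}, {'C', 'D', 'E'}):
    elim = cluster_1 - sep          # variables summed out of the message
    print(sorted(sep), '-> eliminate', sorted(elim))
```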
17
Bounded decompositions
We want arc-labeled decompositions such that:
 the cluster size (internal width) is bounded by i (the accuracy parameter)
 the width of the decomposition as a graph (external width) is as small as possible
Possible approaches to build decompositions:
 partition-based algorithms - inspired by the mini-bucket decomposition
 grouping-based algorithms
18
Partition-based algorithms
[Figure: a) schematic mini-bucket(i), i=3, with buckets processed in the order G, E, F, D, C, B, A: G: (GFE); E: (EBF) (EF); F: (FCD) (BF); D: (DB) (CD); C: (CAB) (CB); B: (BA) (AB) (B); A: (A); b) the resulting arc-labeled join-graph decomposition with clusters GFE {P(G|F,E)}, EBF {P(E|B,F)}, FCD {P(F|C,D)}, BF, CDB {P(D|B)}, CAB {P(C|A,B)}, BA {P(B|A)}, A {P(A)} and arc labels such as EF, F, BF, CD, CB, AB, A]
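Here is a small sketch of the mini-bucket(i) pass behind panel a), reconstructed from the slide's example and simplified (the function name and the greedy first-fit partitioning are assumptions): each function is placed in the bucket of its earliest variable in the elimination order, each bucket is partitioned into mini-buckets of at most i variables, each mini-bucket becomes a cluster, and its scope minus the bucket variable is passed to a later bucket.

```python
def mini_bucket_schematic(functions, elim_order, i_bound):
    """functions: iterable of variable-scope sets; returns the clusters
    (bucket variable, cluster scope) of the join-graph decomposition."""
    pos = {v: k for k, v in enumerate(elim_order)}
    buckets = {v: [] for v in elim_order}
    for scope in functions:                   # bucket of earliest variable
        buckets[min(scope, key=pos.get)].append(set(scope))
    clusters = []
    for v in elim_order:
        minis = []
        for scope in buckets[v]:              # greedy i-bounded partition
            for m in minis:
                if len(m | scope) <= i_bound:
                    m |= scope
                    break
            else:
                minis.append(set(scope))
        for m in minis:
            clusters.append((v, sorted(m)))
            rest = m - {v}                    # scope passed to a later bucket
            if rest:
                buckets[min(rest, key=pos.get)].append(rest)
    return clusters

# the slide's example: the network's CPT scopes, elimination order G,E,F,D,C,B,A
cpts = [{'G', 'F', 'E'}, {'E', 'B', 'F'}, {'F', 'C', 'D'}, {'D', 'B'},
        {'C', 'A', 'B'}, {'B', 'A'}, {'A'}]
for v, scope in mini_bucket_schematic(cpts, list('GEFDCBA'), i_bound=3):
    print(v, scope)   # GFE, EBF, FCD, BF, CDB, CAB, BA, A, as in panel b)
```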
19
IJGP properties
IJGP(i) applies BP to a minimal arc-labeled join-graph whose cluster size is bounded by i
On join-trees IJGP finds exact beliefs
IJGP is a Generalized Belief Propagation algorithm (Yedidia, Freeman, Weiss 2001)
Complexity of one iteration:
 time: O(deg·(n+N)·d^(i+1))
 space: O(N·d)
20
Empirical evaluation
Algorithms:
 Exact
 IBP
 MC
 IJGP
Measures:
 Absolute error
 Relative error
 Kullback-Leibler (KL) distance
 Bit Error Rate (BER)
 Time
Networks (all variables are binary):
 Random networks
 Grid networks (MxM)
 CPCS 54, 360, 422
 Coding networks
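For reference, a small sketch of two of these measures computed between the exact and approximate marginals of a single variable; these are the standard definitions, and how the paper averages over variables and instances is not shown here.

```python
import numpy as np

def kl_distance(exact, approx):
    """Kullback-Leibler distance KL(exact || approx) for one marginal."""
    exact, approx = np.asarray(exact, float), np.asarray(approx, float)
    return float(np.sum(exact * np.log(exact / approx)))

def absolute_error(exact, approx):
    """Mean absolute difference between the two marginals."""
    return float(np.mean(np.abs(np.asarray(exact) - np.asarray(approx))))

# hypothetical marginals of one binary variable
print(kl_distance([0.8, 0.2], [0.7, 0.3]))     # ~0.026
print(absolute_error([0.8, 0.2], [0.7, 0.3]))  # 0.1
```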
21
Random networks - KL at convergence
[Figure: KL distance at convergence, panels for evidence=0 and evidence=5]
22
Random networks - KL vs. iterations
[Figure: KL distance vs. iterations, panels for evidence=0 and evidence=5]
23
Random networks - Time
24
Grid 81 – KL distance at convergence
25
Grid 81 – KL distance vs. iterations
26
Grid 81 – Time vs. i-bound
27
Coding networks
N=400, P=4, 500 instances, 30 iterations, w*=43
28
Coding networks - BER
[Figure: Bit Error Rate, panels for sigma = 0.22, 0.32, 0.51, 0.65]
29
Coding networks - Time
30
CPCS 422 – KL distance
[Figure: panels for evidence=0 and evidence=30]
31
CPCS 422 – KL vs. iterations
[Figure: panels for evidence=0 and evidence=30]
32
CPCS 422 – Relative error
[Figure: evidence=30]
33
CPCS 422 – Time vs. i-bound
[Figure: evidence=0]
34
Conclusion
IJGP borrows the iterative feature from IBP and the anytime virtues of bounded inference from MC
Empirical evaluation showed the potential of IJGP, which improves with iteration and most of the time with i-bound, and scales up to large networks
IJGP is almost always superior, often by a high margin, to IBP and MC
Based on all our experiments, we think that IJGP provides a practical breakthrough for the task of belief updating