On the Power of Belief Propagation: A Constraint Propagation Perspective
Rina Dechter, Bozhena Bidyuk, Robert Mateescu, Emma Rollon
Distributed Belief Propagation
How many people?
Distributed Belief Propagation: causal support and diagnostic support
Belief Propagation in Polytrees
BP on Loopy Graphs
- Pearl (1988): use of BP on loopy networks
- McEliece et al. (1998): IBP's success on coding networks
- Lots of research into convergence ... and accuracy (?), but:
  - Why does IBP work well for coding networks?
  - Can we characterize other good problem classes?
  - Can we have any guarantees on accuracy (even if it converges)?
Arc-consistency
- Sound
- Incomplete
- Always converges (polynomial time)
Example over variables A, B, C, D: A < B, A < D, D < C, B = C (see the sketch below).
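As a concrete illustration, here is a minimal Python sketch of an arc-consistency (AC-1-style) revise loop on the example constraints above; the domains {1, 2, 3} and the loop structure are assumptions made only for this illustration.

```python
# Minimal arc-consistency sketch for the example A < B, A < D, D < C, B = C.
# The domains {1, 2, 3} are an assumption for illustration only.
domains = {v: {1, 2, 3} for v in "ABCD"}
constraints = [
    ("A", "B", lambda a, b: a < b),
    ("A", "D", lambda a, d: a < d),
    ("D", "C", lambda d, c: d < c),
    ("B", "C", lambda b, c: b == c),
]

def revise(x, y, rel):
    """Remove values of x with no support in y under rel; report whether anything changed."""
    removed = {vx for vx in domains[x] if not any(rel(vx, vy) for vy in domains[y])}
    domains[x] -= removed
    return bool(removed)

# AC-1-style loop: revise both directions of every arc until a fixed point is reached.
changed = True
while changed:
    changed = False
    for x, y, rel in constraints:
        changed |= revise(x, y, rel)
        changed |= revise(y, x, lambda vy, vx, r=rel: r(vx, vy))

print(domains)  # arc-consistent domains: sound, incomplete, reached in polynomial time
```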
Flattening the Bayesian Network
[Figure: a belief network with CPTs P(A), P(B|A), P(C|A), P(D|A,B), P(F|B,C), P(G|D,F), and the corresponding flat constraint network over clusters A, AB, AC, ABD, BCF, DFG. Each CPT is flattened into the relation containing exactly its nonzero-probability tuples.]
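To make the flattening step concrete, here is a minimal Python sketch that flattens a single CPT into its flat relation (the set of tuples with nonzero probability); the particular CPT numbers are invented for illustration and are not the ones in the slide's tables.

```python
from itertools import product

def flatten_cpt(cpt):
    """Flatten a CPT into a constraint relation: keep exactly the tuples with nonzero probability."""
    return {assignment for assignment, prob in cpt.items() if prob > 0}

# Hypothetical CPT P(B | A) over A, B in {1, 2, 3}; the zeros are what define the flat relation.
domain = (1, 2, 3)
p_b_given_a = {(a, b): (0.5 if b != a else 0.0) for a, b in product(domain, repeat=2)}

flat_R_AB = flatten_cpt(p_b_given_a)
print(sorted(flat_R_AB))  # R_AB = {(a, b) : P(b | a) > 0}
```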
Belief Zero Propagation = Arc-Consistency
[Figure: updating the belief Bel(A,B) with incoming messages h(A), h(B) on the original network, next to updating the relation R(A,B) on the flat network. The entries that IBP drives to zero in the updated belief correspond exactly to the tuples that arc-consistency removes from the updated relation; see the sketch below.]
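A minimal numerical sketch of the correspondence stated above: multiplying a CPT by the incoming messages zeroes out exactly the pairs that joining the flat relation with the flattened messages would remove. The CPT and message values below are invented for illustration.

```python
import numpy as np

# Hypothetical 3-valued variables A and B with CPT P(B | A); all numbers invented for illustration.
P_BA = np.array([[0.0, 0.7, 0.3],
                 [0.5, 0.0, 0.5],
                 [0.4, 0.6, 0.0]])   # rows indexed by A, columns by B

h_A = np.array([1.0, 0.0, 1.0])      # incoming message on A (the zero rules out A = 2)
h_B = np.array([1.0, 1.0, 0.0])      # incoming message on B (the zero rules out B = 3)

# Updated (unnormalized) belief over (A, B): an entry is zero exactly when the corresponding
# tuple is absent from the join of the flat relation with the flattened messages.
bel_AB = P_BA * h_A[:, None] * h_B[None, :]

flat_relation = P_BA > 0
flat_messages = (h_A > 0)[:, None] & (h_B > 0)[None, :]
assert np.array_equal(bel_AB > 0, flat_relation & flat_messages)
print(bel_AB)
```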
Flat Network - Example
[Figure: the flat relations obtained from P(A), P(B|A), P(C|A), P(D|A,B), P(F|B,C), P(G|D,F), arranged on the dual join-graph with clusters A, AB, AC, ABD, BCF, DFG.]
IBP Example – Iterations 1 through 5
[Figures: five successive iterations of IBP on the flat network; each iteration prunes further tuples from the relations, and the final iteration also reports the resulting beliefs over A, B, C, D, F, G.]
IBP – Inference Power for Zero Beliefs
Theorem: Iterative BP performs arc-consistency on the flat network.
- Soundness: inference of zero beliefs by IBP converges, and all inferred zero beliefs are correct.
- Incompleteness: Iterative BP is as weak and as strong as arc-consistency.
Continuity hypothesis: since IBP is sound for zero beliefs, is it also accurate for extreme (near-zero) beliefs? Tested empirically.
Experimental Results
We investigated empirically whether the results for zero beliefs extend to ε-small beliefs (ε > 0).
- Algorithms: IBP, IJGP
- Measures: exact/IJGP histograms, recall absolute error, precision absolute error
- Network types with determinism: coding, linkage analysis*, grids*
- Network types without determinism: two-layer noisy-OR*, CPCS54, CPCS360
* Instances from the UAI08 competition
Networks with Determinism: Coding (N=200, 1000 instances, w*=15)
Nets w/o Determinism: bn2o (w* = 24, 27, 26)
Nets with Determinism: Linkage
[Figures: percentage vs. absolute-error plots for pedigree1 (w* = 21) and pedigree37 (w* = 30), showing the exact histogram, the IJGP histogram, recall absolute error, and precision absolute error, for i-bound = 3 and i-bound = 7.]
Nets with Determinism: Grids (size, % determinism), w* = 22
Nets w/o Determinism: CPCS (CPCS360: 5 instances, w*=20; CPCS54: 100 instances, w*=15)
The Cutset Phenomenon & Irrelevant Nodes
- Observed variables break the flow of inference: IBP is exact when the evidence variables form a cycle-cutset.
- Unobserved variables with no observed descendants send zero information to their parents; such nodes are irrelevant (see the sketch below).
- In a network without evidence, IBP converges in one top-down iteration.
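A one-screen sketch of the "zero information" point above: the diagnostic (λ) message an unobserved leaf sends to its parent is a vector of ones, so it cannot change the parent's belief. The CPT numbers are invented for illustration.

```python
import numpy as np

# Hypothetical CPT P(Y | X) for an unobserved leaf Y with no observed descendants; numbers invented.
P_Y_given_X = np.array([[0.2, 0.8],
                        [0.6, 0.4]])   # rows indexed by X, columns by Y

# Diagnostic (lambda) message from the unobserved leaf Y to its parent X:
# lambda_Y(x) = sum_y P(y | x) = 1 for every x, i.e. it carries no information.
lambda_msg = P_Y_given_X.sum(axis=1)
print(lambda_msg)   # [1. 1.]  ->  Y is irrelevant to X's belief
```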
Nodes with extreme support
Observed variables with extreme priors or extreme support can nearly cut the flow of information, making them behave almost like evidence (cutset) nodes; a small numerical sketch follows.
[Figure: a small network with nodes A, B1, B2, C1, C2, D and evidence nodes E_B, E_C, E_D; plot of average error vs. priors.]
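A tiny numerical sketch of the "nearly-cut" effect on a hypothetical chain X -> Y: with an extreme prior on X, the downstream belief on Y is almost identical to the belief obtained by treating X as observed. All numbers are invented for illustration.

```python
import numpy as np

eps = 1e-4
P_X = np.array([1 - eps, eps])         # extreme prior on X (invented)
P_YX = np.array([[0.9, 0.1],
                 [0.2, 0.8]])           # hypothetical CPT P(Y | X)

bel_Y_extreme  = P_X @ P_YX             # P(Y) under the extreme prior
bel_Y_observed = P_YX[0]                # P(Y | X = 0), i.e. X treated as evidence

# The gap shrinks with eps: an extremely supported node nearly cuts the flow of information.
print(np.abs(bel_Y_extreme - bel_Y_observed).max())
```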
Conclusion: For Networks with Determinism
- IBP converges and is sound for zero beliefs.
- IBP's power to infer zeros is exactly as weak or as strong as arc-consistency.
- However, inference of extreme (non-zero) beliefs can be wrong.
- Cutset property (Bidyuk and Dechter, 2000): evidence and inferred singletons act like a cutset; if the zeros form a cycle-cutset, all beliefs are exact.
- Extensions to epsilon-cutsets were supported empirically.
Inference on Trees is Easy and Distributed
- Belief updating (sum-product)
- MPE (max-product)
- CSP consistency (projection-join)
- #CSP (sum-product)
Example tree factorization: P(X,Y,Z,T,R,L,M) = P(X) P(Y|X) P(Z|X) P(T|Y) P(R|Y) P(L|Z) P(M|Z) (see the sketch below).
Trees are processed in linear time and memory; the same holds for acyclic graphical models.
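Below is a minimal sum-product sketch on the tree factorization above, with binary variables and invented CPTs, computing P(X | T = 1) by local messages and checking it against a brute-force marginalization; unobserved leaves contribute all-ones λ messages and are omitted.

```python
import numpy as np

# Tree from the slide: root X with children Y, Z; T, R are children of Y; L, M are children of Z.
# Joint: P(X,Y,Z,T,R,L,M) = P(X) P(Y|X) P(Z|X) P(T|Y) P(R|Y) P(L|Z) P(M|Z).
# All variables are binary and all numbers below are invented for illustration.
rng = np.random.default_rng(0)

def random_cpt():
    """A random 2x2 CPT: rows indexed by the parent value, each row sums to one."""
    t = rng.random((2, 2))
    return t / t.sum(axis=1, keepdims=True)

P_X = np.array([0.3, 0.7])
P_YX, P_ZX, P_TY, P_RY, P_LZ, P_MZ = (random_cpt() for _ in range(6))

# Suppose T = 1 is observed.  Unobserved leaves (R, L, M) and the unobserved subtree under Z
# send all-ones lambda messages, so only the T -> Y -> X chain carries diagnostic support.
lam_T  = np.array([0.0, 1.0])          # evidence indicator on T
lam_TY = P_TY @ lam_T                  # message T -> Y:  sum_t P(t | y) * lam_T(t)
lam_YX = P_YX @ lam_TY                 # message Y -> X:  sum_y P(y | x) * lam_TY(y)

bel_X = P_X * lam_YX                   # combine causal support (prior) and diagnostic support
bel_X /= bel_X.sum()

# Brute-force check of P(X | T = 1): marginalize the relevant part of the joint directly.
joint_XT = np.einsum("x,xy,yt->xt", P_X, P_YX, P_TY)
assert np.allclose(bel_X, joint_XT[:, 1] / joint_XT[:, 1].sum())
print(bel_X)
```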
Inference on Poly-Trees is Easy and Distributed