Compiling Bayesian Networks Using Variable Elimination
Mark Chavira, Adnan Darwiche
UCLA
Outline
  Motivation
  Review
  ADD Compilation
  Experimental Results
  Conclusion
Probabilistic Inference
[Figure: Bayesian network over the variables Battery Age, Alternator, Fan Belt, Charge Delivered, Battery, Fuel Pump, Fuel Line, Starter, Gas, Distributor, Battery Power, Spark Plugs, Gas Gauge, Radio, Lights, Engine Turn Over, Engine Start]
Multiple Queries
  MAP
  Sensitivity Analysis
  State Estimation
  Diagnosis
  …
Multiple Queries
Diabetes: 413 variables with average cardinality 11.3
  4255 queries
  1.8 seconds using jointree
  6.7 minutes using VE
  3.9 hours using ADDs
Munin4: 1041 variables with average cardinality 5.4
  4580 queries
  2.7 seconds using jointree
  2.2 days using VE
  3.3 weeks using ADDs
Mildew: 35 variables with average cardinality 17.6
  581 queries
  0.6 seconds using jointree
  4.2 seconds using tables
  1.5 hours using ADDs
Local Structure: CSI and Determinism
[Figure: the same Bayesian network as on the Probabilistic Inference slide, annotated with Context Specific Independence (CSI) and Determinism]

CPT for Lights given Battery Power:
  Battery Power   ON    OFF
  OK              .99   .01
  WEAK            .20   .80
  DEAD            0     1

Determinism: if Battery Power = DEAD, then Lights = OFF.
Previous Work: VE and Local Structure
  Bahar et al., 1993: ADDs
  Sanner and McAllester, 2005: Affine ADDs
  Poole and Zhang, 2003: Confactors
  Larkin and Dechter, 2003: Sparse Representations
  …
Overhead
Network      Tabular (ms)   ADD (ms)   Times Worse
barley                307     14,049          45.8
bm-5-3              4,892        658           0.1
diabetes              949     33,220          35.0
hailfinder             48        515          10.7
link                1,688      2,658           1.6
mm-3-8-3            2,166        843           0.4
mildew                 72     92,602        1286.1
munin1                155      1,255           8.1
munin2                204      3,170          15.5
munin3                350      5,049          14.4
munin4                406      4,361
pathfinder             51      5,213         102.2
pigs                   69        597           8.7
st-3-2                186        362           1.9
water                  76      1,015          13.4
Variable Elimination
Elimination Order: A, B, C
  Initial factors: T1(A,B), T2(A,C), T3(B), T4(C)
  Eliminate A: T5(A,B,C), then T6(B,C)
  Eliminate B: T7(B,C), then T8(C)
  Eliminate C: T9(C), then T10()
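The elimination sequence on this slide can be sketched with plain tabular factors. This is a minimal illustration, not the authors' implementation: it assumes binary variables, and the numbers in T1 to T4 are hypothetical, since the slide gives only the factor scopes.

```python
from itertools import product

def multiply(f, g):
    """Pointwise product of two factors over binary variables."""
    (fv, ft), (gv, gt) = f, g
    out_vars = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for assignment in product((0, 1), repeat=len(out_vars)):
        row = dict(zip(out_vars, assignment))
        table[assignment] = (ft[tuple(row[v] for v in fv)]
                             * gt[tuple(row[v] for v in gv)])
    return out_vars, table

def sum_out(f, var):
    """Sum a variable out of a factor."""
    fv, ft = f
    keep = tuple(v for v in fv if v != var)
    table = {}
    for assignment, value in ft.items():
        key = tuple(a for v, a in zip(fv, assignment) if v != var)
        table[key] = table.get(key, 0.0) + value
    return keep, table

def variable_elimination(factors, order):
    """Eliminate all variables (order must cover every variable)."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        factors = [f for f in factors if var not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f)          # e.g. T5 = T1 * T2
        factors.append(sum_out(prod, var))    # e.g. T6 = sum over A of T5
    result = 1.0
    for _, table in factors:                  # only empty-scope factors remain
        result *= table[()]
    return result

# Hypothetical numbers for the four initial factors named on the slide.
T1 = (("A", "B"), {(0, 0): 0.3, (0, 1): 0.7, (1, 0): 0.6, (1, 1): 0.4})
T2 = (("A", "C"), {(0, 0): 0.2, (0, 1): 0.8, (1, 0): 0.5, (1, 1): 0.5})
T3 = (("B",), {(0,): 0.1, (1,): 0.9})
T4 = (("C",), {(0,): 1.0, (1,): 0.0})

result = variable_elimination([T1, T2, T3, T4], ["A", "B", "C"])
print(result)
```

Each pass through the loop answers one query; a different query or different evidence means running the whole elimination again.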
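Algebraic Decision Diagrams (ADDs)
[Table: a function f(X, Y, Z) with f(x1, y1, z1) = 0.9, f(x1, y1, z2) = 0.1, further entries for y2, and 0.5 for x2]
[Figure: the ADD for f, with internal nodes X, Y, Z, Z and leaves .1, .9, .5]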
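A small sketch may help show why an ADD can be smaller than a table. It assumes binary variables, a fixed variable order, and a hypothetical function `f` chosen to resemble the slide (the slide's full table is not recoverable here). The two reduction rules used, skipping redundant tests and sharing identical nodes through a unique table, are the standard ones for decision diagrams.

```python
def build_add(variables, f):
    """Build a reduced ADD for f over binary variables in the given order."""
    unique = {}  # unique table: one entry per distinct internal node

    def mk(var, low, high):
        if low == high:                  # redundant test: skip this node
            return low
        key = (var, low, high)
        return unique.setdefault(key, key)   # share identical nodes

    def build(i, assignment):
        if i == len(variables):
            return ("leaf", f(*assignment))
        low = build(i + 1, assignment + (0,))
        high = build(i + 1, assignment + (1,))
        return mk(variables[i], low, high)

    return build(0, ()), unique

def evaluate(node, assignment):
    """Follow one path from the root to a leaf."""
    while node[0] != "leaf":
        var, low, high = node
        node = high if assignment[var] else low
    return node[1]

# Hypothetical function: value 0 stands for x1/y1/z1, value 1 for x2/y2/z2.
def f(x, y, z):
    if x == 1:
        return 0.5
    if y == 0:
        return 0.9 if z == 0 else 0.1
    return 0.1 if z == 0 else 0.9

root, unique = build_add(("X", "Y", "Z"), f)
print(len(unique))  # number of internal nodes; the table has 8 rows
```

This builder enumerates the full table first, so it only illustrates the reduced form; practical ADD packages build diagrams through operations on smaller diagrams instead.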
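Network MLF
[Figure: a network over variables A and B and its multilinear function (MLF)]
To evaluate the MLF under evidence e:
  Set θx|u to Pr(x|u)
  Set λx to 1 if x is compatible with e, and to 0 if it is not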
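As a rough sketch of these two rules, the snippet below evaluates the MLF of a two-variable network by brute force. It assumes the network is A -> B with binary variables; that structure and the parameter values are illustrative assumptions, and the names `indicators` and `mlf` are hypothetical.

```python
# Assumed parameters for a network A -> B (illustrative values).
theta_a = {"a1": 0.3, "a2": 0.7}
theta_b = {("b1", "a1"): 0.1, ("b2", "a1"): 0.9,
           ("b1", "a2"): 0.8, ("b2", "a2"): 0.2}
VALUES = {"A": ("a1", "a2"), "B": ("b1", "b2")}

def indicators(evidence):
    """lambda_x is 1 if x is compatible with the evidence, else 0."""
    lam = {}
    for var, values in VALUES.items():
        for value in values:
            # A variable without evidence is compatible with all its values.
            lam[value] = 1.0 if evidence.get(var, value) == value else 0.0
    return lam

def mlf(lam):
    """Sum over all instantiations of lambda_a * lambda_b * theta_a * theta_b|a."""
    return sum(lam[a] * lam[b] * theta_a[a] * theta_b[(b, a)]
               for a in VALUES["A"] for b in VALUES["B"])

p_no_evidence = mlf(indicators({}))
p_b1 = mlf(indicators({"B": "b1"}))
p_a1_b1 = mlf(indicators({"A": "a1", "B": "b1"}))
print(p_no_evidence, p_b1, p_a1_b1)
```

The sum has one term per network instantiation, so it grows exponentially with the number of variables; the point of compilation is to represent the same function compactly as a circuit.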
Circuit Evaluation and Differentiation
[Figure: arithmetic circuit of + and * nodes over indicators and parameters, annotated with the value of each node (evaluation) and the derivative at each node (differentiation)]
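The two passes can be sketched as follows. The circuit below is a hand-built one for an assumed network A -> B with illustrative parameters, so its layout need not match the figure. The upward pass computes the probability of evidence; the downward pass computes the partial derivative of the output with respect to every node, and the derivative with respect to an indicator λx gives the probability of x together with the evidence on the other variables.

```python
import math

# Each node is (kind, arg); children always appear earlier in the list.
CIRCUIT = [
    ("leaf", "la1"), ("leaf", "la2"),          # 0, 1: indicators for A
    ("leaf", "lb1"), ("leaf", "lb2"),          # 2, 3: indicators for B
    ("leaf", "ta1"), ("leaf", "ta2"),          # 4, 5: Pr(a)
    ("leaf", "tb1|a1"), ("leaf", "tb2|a1"),    # 6, 7: Pr(b | a1)
    ("leaf", "tb1|a2"), ("leaf", "tb2|a2"),    # 8, 9: Pr(b | a2)
    ("*", (2, 6)), ("*", (3, 7)),              # 10, 11
    ("*", (2, 8)), ("*", (3, 9)),              # 12, 13
    ("+", (10, 11)), ("+", (12, 13)),          # 14, 15
    ("*", (0, 4, 14)), ("*", (1, 5, 15)),      # 16, 17
    ("+", (16, 17)),                           # 18: root
]

def evaluate(circuit, leaf_values):
    """Upward pass: value of every node, children before parents."""
    val = []
    for kind, arg in circuit:
        if kind == "leaf":
            val.append(leaf_values[arg])
        elif kind == "+":
            val.append(sum(val[c] for c in arg))
        else:
            val.append(math.prod(val[c] for c in arg))
    return val

def differentiate(circuit, val):
    """Downward pass: derivative of the root with respect to every node."""
    dr = [0.0] * len(circuit)
    dr[-1] = 1.0
    for i in reversed(range(len(circuit))):
        kind, arg = circuit[i]
        if kind == "+":
            for c in arg:
                dr[c] += dr[i]
        elif kind == "*":
            for pos, c in enumerate(arg):
                others = math.prod(val[k] for j, k in enumerate(arg) if j != pos)
                dr[c] += dr[i] * others
    return dr

# Evidence A = a1: la1 = 1, la2 = 0; B is unobserved, so both lb are 1.
leaves = {"la1": 1.0, "la2": 0.0, "lb1": 1.0, "lb2": 1.0,
          "ta1": 0.3, "ta2": 0.7,
          "tb1|a1": 0.1, "tb2|a1": 0.9, "tb1|a2": 0.8, "tb2|a2": 0.2}
val = evaluate(CIRCUIT, leaves)
dr = differentiate(CIRCUIT, val)
print(val[-1], dr[2], dr[3])  # Pr(e), then derivatives for lb1 and lb2
```

Both passes visit each edge a constant number of times, which is why one compiled circuit can answer many queries cheaply.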
Goals
  Answer multiple queries
    Use VE as a basis for compiling a BN
  Exploit local structure (w/o overhead)
    Use structured factors rather than tables
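Symbolic ADDs
[Figure: two side-by-side comparisons of a Normal ADD and a Symbolic ADD. The normal ADDs have numeric leaves such as .4, .6, .9, .1; the symbolic ADDs have leaves that point to * nodes over symbols such as y1, y2 and constants such as .5]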
CPT Symbolic ADD
[Table: the function f(X, Y, Z) from the Algebraic Decision Diagrams slide]
[Figure: the CPT as an ADD over X, Y, Z with leaves .1, .9, .5, next to the corresponding symbolic ADD]
Indicator ADDs
[Figure: indicator ADD for X, a single X node with leaves x1 and x2]
Symbolic ADD Operations
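Changing a Single Line
[Figure: multiplying two leaves. Normal ADD: .1 * .9 = .09. Symbolic ADD: the product of two symbolic leaves is a new * node over them]
Multiplication Example
[Figure: the product of two symbolic ADDs over Y, with leaves built from y1, y2, 1, and .5]
Local Structure
[Figure: symbolic ADDs over X and Y whose leaves are * nodes over y1, y2, 1, and .5]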
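Local Structure
Simplifications when combining leaves (c1, c2 are constants; α is any expression):
  c1 * c2 becomes the constant c1*c2
  α * 1 becomes α
  c1 + c2 becomes the constant c1+c2
  α + 0 becomes α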
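These leaf simplifications can be sketched as two small constructors that fold constants before creating a new circuit node. The representation is an assumption made for illustration: numbers are constants, strings are indicator names, and tuples are circuit nodes. The rule that anything times 0 is 0 is also an addition here, not read off the slide.

```python
NUM = (int, float)

def times(a, b):
    """Symbolic product with constant folding."""
    if isinstance(a, NUM) and isinstance(b, NUM):
        return a * b                   # c1 * c2 -> constant
    for x, y in ((a, b), (b, a)):
        if x == 0:
            return 0.0                 # assumed rule: alpha * 0 -> 0
        if x == 1:
            return y                   # alpha * 1 -> alpha
    return ("*", a, b)                 # otherwise create a new * node

def plus(a, b):
    """Symbolic sum with constant folding."""
    if isinstance(a, NUM) and isinstance(b, NUM):
        return a + b                   # c1 + c2 -> constant
    if a == 0:
        return b                       # alpha + 0 -> alpha
    if b == 0:
        return a
    return ("+", a, b)                 # otherwise create a new + node

print(times(0.1, 0.9))       # two constants fold into one constant
print(times("y2", 1.0))      # multiplying by 1 creates no node
print(times(0.5, "y1"))      # a genuine * node
print(plus(times(0.0, "y1"), times(1.0, "y2")))  # collapses to "y2"
```

Determinism (0 and 1 parameters) is what triggers most of these rules, which is how local structure ends up shrinking the compiled circuit.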
How Does it Work?
  Generate CPT and Indicator ADDs
  Perform VE as normal
  Result is ADD sink labeled with pointer to compilation!
Other Details
  ADD Variable Order
  Unique Table for ADD nodes
  Construction of CPT ADDs
  Dealing with multi-valued variables
Example: Convert to Symbolic
Order: Y, X

  X    Pr(x)
  x1   0.1
  x2   0.9

  X    Y    Pr(y|x)
  x1   y1   0.0
  x1   y2   1.0
  x2   y1   0.5
  x2   y2   0.5

[Figure: CPT ADD for X, Indicator ADD for X, Indicator ADD for Y, CPT ADD for Y]
Example: Multiply Y
Order: Y, X
[Figure: CPT ADD for X, Indicator ADD for X, Indicator ADD for Y, CPT ADD for Y; the ADDs that mention Y are multiplied]
Example: Sum Out Y
Order: Y, X
[Figure: CPT ADD for X, Indicator ADD for X, and the product ADD with Y summed out; its leaves are * nodes over y1, y2, and .5]
Example: Multiply X
Order: Y, X
[Figure: CPT ADD for X and Indicator ADD for X are multiplied first. And then these two: the result and the ADD left after summing out Y]
Example: Sum Out X
Order: Y, X
[Figure: a single ADD over X whose leaves are * nodes over .1, x1, .9, x2, and the + node over y1, .5, y2]
Example: Compilation Complete!
Order: Y, X
[Figure: the compiled arithmetic circuit, a + root over * nodes that combine .1 with x1 and .9 with x2, the latter with a + node over * nodes on y1, .5, and y2]
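The example can be mimicked in a few lines by running VE over factors whose entries are symbolic expressions. This is a simplified sketch, not the method from the talk: it uses plain tables instead of ADDs, so it shows how VE leaves behind an arithmetic circuit but not the space savings of structured factors. The CPT numbers are those of the example; the helper names are hypothetical.

```python
from itertools import product

NUM = (int, float)
DOMAIN = {"X": ("x1", "x2"), "Y": ("y1", "y2")}

def times(a, b):
    if isinstance(a, NUM) and isinstance(b, NUM):
        return a * b
    for x, y in ((a, b), (b, a)):
        if x == 0:
            return 0.0
        if x == 1:
            return y
    return ("*", a, b)

def plus(a, b):
    if isinstance(a, NUM) and isinstance(b, NUM):
        return a + b
    if a == 0:
        return b
    if b == 0:
        return a
    return ("+", a, b)

def multiply(f, g):
    (fv, ft), (gv, gt) = f, g
    out_vars = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for assignment in product(*(DOMAIN[v] for v in out_vars)):
        row = dict(zip(out_vars, assignment))
        table[assignment] = times(ft[tuple(row[v] for v in fv)],
                                  gt[tuple(row[v] for v in gv)])
    return out_vars, table

def sum_out(f, var):
    fv, ft = f
    keep = tuple(v for v in fv if v != var)
    table = {}
    for assignment, entry in ft.items():
        key = tuple(a for v, a in zip(fv, assignment) if v != var)
        table[key] = plus(table.get(key, 0.0), entry)
    return keep, table

def compile_network(factors, order):
    """Run VE symbolically; the final entry is the compiled circuit."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f)
        factors.append(sum_out(prod, var))
    circuit = 1.0
    for _, table in factors:
        circuit = times(circuit, table[()])
    return circuit

def evaluate(expr, lam):
    """Evaluate a circuit; strings are indicator names looked up in lam."""
    if isinstance(expr, NUM):
        return expr
    if isinstance(expr, str):
        return lam[expr]
    op, a, b = expr
    x, y = evaluate(a, lam), evaluate(b, lam)
    return x * y if op == "*" else x + y

cpt_x = (("X",), {("x1",): 0.1, ("x2",): 0.9})
ind_x = (("X",), {("x1",): "x1", ("x2",): "x2"})
cpt_y = (("X", "Y"), {("x1", "y1"): 0.0, ("x1", "y2"): 1.0,
                      ("x2", "y1"): 0.5, ("x2", "y2"): 0.5})
ind_y = (("Y",), {("y1",): "y1", ("y2",): "y2"})

circuit = compile_network([cpt_x, ind_x, cpt_y, ind_y], ["Y", "X"])
print(circuit)
print(evaluate(circuit, {"x1": 1, "x2": 1, "y1": 1, "y2": 0}))  # Pr(Y = y1)
```

Compilation runs once; each later query only changes the indicator values passed to `evaluate`.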
Offline Inference
Network      Ace (s)   ADD (s)   Times Better
barley        8190.2     122.8           66.7
bm-5-3           0.8       6.0            0.1
diabetes      1710.0     110.3           15.5
hailfinder       0.7       1.2            0.5
link               –     699.7
mm-3-8-3         1.5      11.9
mildew        3125.2     218.9           14.3
munin1        1005.1     316.7            3.2
munin2         198.4      31.7            6.3
munin3         188.4      17.6           10.7
munin4         205.0      37.8            5.4
pathfinder       4.9       5.8            0.9
pigs            23.1      10.0            2.3
st-3-2           2.4       0.2
water            3.0      20.7
Online Inference
Network      JT (ms)   ADD (ms)   Times Better
barley        65,226     35,209            1.9
bm-5-3        89,593         83         1079.4
diabetes      29,316     20,421            1.4
hailfinder       245         70            3.5
link         223,542    175,769            1.3
mm-3-8-3      34,001        198          171.7
mildew        10,077      4,522            2.2
munin1       669,915     37,451           17.9
munin2        17,857      7,180            2.5
munin3        13,351      4,945            2.7
munin4        42,754      8,683            4.9
pathfinder     1,332        102           13.1
pigs           3,020      2,814            1.1
st-3-2        17,536         82          213.9
water         16,676        251           66.4
Conclusion
VE limitations:
  Multiple queries
  Local structure
VE as a basis for compilation:
  Solves the multiple query problem
  Solves the local structure (overhead) problem
  Up to orders of magnitude more efficient than jointree for online inference (1.1 to 1079.4 times better on the networks tested)
  Makes body of research more effective
  Implications beyond VE
Previous Work: VE and Multiple Queries
  Cozman, 2000
  Darwiche, 2000
There has been a limited amount of work to enhance VE to answer multiple queries simultaneously, as the jointree algorithm does. For example, we might wish to compute the probability of evidence and a posterior marginal on each variable simultaneously. But this work hasn't improved performance over jointree. We'll see over the course of this talk that we can solve the multiple query problem while at the same time dramatically outperforming jointree when local structure exists in the network.
Global Structure
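Compiling Bayesian Networks
\[
\begin{aligned}
P(e) ={} & \lambda_{a_1}\lambda_{b_1}\lambda_{c_1}\,\theta_{a_1}\theta_{b_1}\theta_{c_1|a_1 b_1} \\
{}+{} & \lambda_{a_1}\lambda_{b_1}\lambda_{c_2}\,\theta_{a_1}\theta_{b_1}\theta_{c_2|a_1 b_1} \\
{}+{} & \lambda_{a_1}\lambda_{b_1}\lambda_{c_3}\,\theta_{a_1}\theta_{b_1}\theta_{c_3|a_1 b_1} \\
{}+{} & \lambda_{a_1}\lambda_{b_2}\lambda_{c_1}\,\theta_{a_1}\theta_{b_2}\theta_{c_1|a_1 b_2} \\
{}+{} & \cdots \\
{}+{} & \lambda_{a_2}\lambda_{b_2}\lambda_{c_3}\,\theta_{a_2}\theta_{b_2}\theta_{c_3|a_2 b_2}
\end{aligned}
\]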