Compiling Bayesian Networks Using Variable Elimination

Presentation transcript:

Compiling Bayesian Networks Using Variable Elimination
Mark Chavira, Adnan Darwiche
UCLA

Outline Motivation Review ADD Compilation Experimental Results Conclusion

Probabilistic Inference
[Figure: Bayesian network with nodes Battery Age, Alternator, Fan Belt, Charge Delivered, Battery, Fuel Pump, Fuel Line, Starter, Gas, Distributor, Battery Power, Spark Plugs, Gas Gauge, Radio, Lights, Engine Turn Over, Engine Start]

Multiple Queries: MAP, Sensitivity Analysis, State Estimation, Diagnosis, …

Multiple Queries
Diabetes
Munin4: 4255 queries, 1.8 seconds using jointree, 6.7 minutes using VE
Munin4: 4580 queries, 2.7 seconds using jointree, 2.2 days using VE
Munin4: 1041 variables with average cardinality 5.4; 3.3 weeks using ADDs
Diabetes: 413 variables with average cardinality 11.3; 3.9 hours using ADDs
Mildew: 35 variables with average cardinality 17.6; 581 queries, 0.6 seconds using jointree, 4.2 seconds using tables, 1.5 hours using ADDs

Local Structure: CSI and Determinism
[Figure: the network from the Probabilistic Inference slide, annotated "Context Specific Independence (CSI)"]

Local Structure: CSI and Determinism
[Figure: the network from the Probabilistic Inference slide, with the CPT of Lights given Battery Power]
| Battery Power | Lights = ON | Lights = OFF |
|---|---|---|
| OK | .99 | .01 |
| WEAK | .20 | .80 |
| DEAD | 0 | 1 |
Determinism: if Battery Power = Dead, then Lights = OFF

Previous Work: VE and Local Structure
Bahar et al, 1993: ADDs
Sanner and McAllester, 2005: Affine ADDs
Poole and Zhang, 2003: Confactors
Larkin and Dechter, 2003: Sparse Representations
…

Overhead
| Network | Tabular (ms) | ADD (ms) | Times Worse |
|---|---|---|---|
| barley | 307 | 14,049 | 45.8 |
| bm-5-3 | 4,892 | 658 | 0.1 |
| diabetes | 949 | 33,220 | 35.0 |
| hailfinder | 48 | 515 | 10.7 |
| link | 1,688 | 2,658 | 1.6 |
| mm-3-8-3 | 2,166 | 843 | 0.4 |
| mildew | 72 | 92,602 | 1286.1 |
| munin1 | 155 | 1,255 | 8.1 |
| munin2 | 204 | 3,170 | 15.5 |
| munin3 | 350 | 5,049 | 14.4 |
| munin4 | 406 | 4,361 | |
| pathfinder | 51 | 5,213 | 102.2 |
| pigs | 69 | 597 | 8.7 |
| st-3-2 | 186 | 362 | 1.9 |
| water | 76 | 1,015 | 13.4 |

Outline Motivation Review ADD Compilation Experimental Results Conclusion

Variable Elimination
Elimination order: A, B, C
Factors: T1(A,B), T2(A,C), T3(B), T4(C), T5(A,B,C), T6(B,C), T7(B,C), T8(C), T9(C), T10()
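The elimination steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming binary variables and factors stored as dictionaries. The names `Factor`, `multiply`, `sum_out` and `eliminate` are hypothetical, and the numeric entries are placeholders rather than values from the talk. Only the factor scopes T1(A,B), T2(A,C), T3(B), T4(C) and the order A, B, C come from the slide.

```python
from itertools import product

class Factor:
    """A table over a tuple of binary variables (values 0 and 1)."""
    def __init__(self, variables, table):
        self.variables = tuple(variables)
        self.table = dict(table)  # assignment tuple -> number

def multiply(f, g):
    """Pointwise product over the union of the two scopes."""
    variables = f.variables + tuple(v for v in g.variables if v not in f.variables)
    table = {}
    for assignment in product((0, 1), repeat=len(variables)):
        env = dict(zip(variables, assignment))
        fv = f.table[tuple(env[v] for v in f.variables)]
        gv = g.table[tuple(env[v] for v in g.variables)]
        table[assignment] = fv * gv
    return Factor(variables, table)

def sum_out(f, var):
    """Sum a variable out of a factor."""
    idx = f.variables.index(var)
    variables = f.variables[:idx] + f.variables[idx + 1:]
    table = {}
    for assignment, value in f.table.items():
        key = assignment[:idx] + assignment[idx + 1:]
        table[key] = table.get(key, 0.0) + value
    return Factor(variables, table)

def eliminate(factors, order):
    """Eliminate variables in the given order; return the final factor."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f.variables]
        factors = [f for f in factors if var not in f.variables]
        if not touching:
            continue
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f)
        factors.append(sum_out(prod, var))
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# Placeholder tables with the scopes shown on the slide.
T1 = Factor(("A", "B"), {k: 0.25 for k in product((0, 1), repeat=2)})
T2 = Factor(("A", "C"), {k: 1.0 for k in product((0, 1), repeat=2)})
T3 = Factor(("B",), {(0,): 1.0, (1,): 1.0})
T4 = Factor(("C",), {(0,): 0.5, (1,): 0.5})

T10 = eliminate([T1, T2, T3, T4], ["A", "B", "C"])
print(T10.variables, T10.table)
```

With this order, the first step multiplies T1 and T2 into a factor over A, B, C and sums out A, which matches the scopes T5(A,B,C) and T6(B,C) listed on the slide. The final factor has an empty scope, like T10().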

Algebraic Decision Diagrams (ADDs)
[Figure: a tabular function f(X, Y, Z) taking the values 0.9, 0.1 and 0.5, shown next to an equivalent ADD over X, Y and Z with sinks .1, .9 and .5]

Network MLF (multilinear function)
[Figure: network over A and B]
Set θ_{x|u} to Pr(x|u)
Set λ_x to 1 if x is compatible with e, and to 0 if it is not

Circuit Evaluation and Differentiation
[Figure: arithmetic circuit of + and * nodes, annotated with numeric values at each node]
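Evaluation and differentiation of an arithmetic circuit can be sketched as one upward pass followed by one downward pass. The code below is an illustration only: the `Node` class, the function names, and the tiny one-variable circuit are assumptions made for this sketch, not the circuit drawn on the slide.

```python
class Node:
    def __init__(self, op, children=(), value=0.0):
        self.op = op                  # "leaf", "+", or "*"
        self.children = list(children)
        self.value = value            # set by hand for leaves
        self.deriv = 0.0              # d(root)/d(this node)

def topological(root):
    """Return nodes with children listed before their parents."""
    order, seen = [], set()
    def visit(n):
        if id(n) in seen:
            return
        seen.add(id(n))
        for c in n.children:
            visit(c)
        order.append(n)
    visit(root)
    return order

def evaluate(root):
    """Upward pass: compute the value of every node."""
    for n in topological(root):
        if n.op == "+":
            n.value = sum(c.value for c in n.children)
        elif n.op == "*":
            v = 1.0
            for c in n.children:
                v *= c.value
            n.value = v
    return root.value

def differentiate(root):
    """Downward pass: call after evaluate(); fills in n.deriv."""
    order = topological(root)
    for n in order:
        n.deriv = 0.0
    root.deriv = 1.0
    for n in reversed(order):         # parents before children
        for i, c in enumerate(n.children):
            if n.op == "+":
                c.deriv += n.deriv
            elif n.op == "*":
                others = 1.0
                for j, d in enumerate(n.children):
                    if j != i:
                        others *= d.value
                c.deriv += n.deriv * others

# Hypothetical circuit for f = lam_a1 * theta_a1 + lam_a2 * theta_a2.
lam_a1, lam_a2 = Node("leaf", value=1.0), Node("leaf", value=1.0)
theta_a1, theta_a2 = Node("leaf", value=0.3), Node("leaf", value=0.7)
root = Node("+", [Node("*", [lam_a1, theta_a1]), Node("*", [lam_a2, theta_a2])])

print(evaluate(root))
differentiate(root)
print(lam_a1.deriv, lam_a2.deriv)
```

Both passes visit each edge a constant number of times in this small example, so one evaluation plus one differentiation yields a derivative for every leaf. The derivative with respect to an indicator λ_x is the quantity used to read off a marginal for x.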

Outline Motivation Review ADD Compilation Experimental Results Conclusion

Goals
Answer multiple queries: use VE as a basis for compiling a BN
Exploit local structure (w/o overhead): use structured factors rather than tables

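Symbolic ADDs
[Figure: pairs of diagrams labeled "Normal ADD" and "Symbolic ADD" over X and Y; the normal ADDs have numeric sinks (.4, .6, .9, .1), the symbolic ADDs have sinks built from * nodes over y1, y2 and .5]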

CPT to Symbolic ADD
[Figure: the tabular function f(X, Y, Z) with values 0.9, 0.1 and 0.5, converted into ADDs over X, Y and Z with sinks .1, .9 and .5]

Indicator ADDs
[Figure: indicator ADD for X, a single X node with sinks x1 and x2]

Symbolic ADD Operations

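Changing a Single Line
[Figure: multiplying two sinks. With numeric sinks, .1 * .9 = .09; with symbolic sinks, the result is a new * node over the operands (built from y1, y2 and .5)]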

Multiplication Example
[Figure: two symbolic ADDs over Y multiplied into one ADD over Y; the sinks involve y1, y2, 1 and .5, combined by * nodes]

Local Structure
[Figure: symbolic ADDs over X and Y whose sinks are * nodes over y1, y2, 1 and .5]

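Local Structure
[Figure: simplification rules for new circuit nodes. Multiplying two constants c1 and c2 gives the constant c1*c2; adding them gives the constant c1+c2; multiplying a subcircuit α by 1 gives α; a matching rule leaves α unchanged under addition]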
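A sketch of how such simplifications might be applied when a new circuit node is created. The representation is an assumption made for this example: constants are Python numbers, symbolic leaves are strings, and interior nodes are tuples. The helper names `make_mul` and `make_add` are hypothetical. The rules for 0 are also an assumption; the slide text only clearly shows constant folding and multiplication by 1.

```python
def is_const(node):
    """Constants are plain numbers; everything else is symbolic."""
    return isinstance(node, (int, float))

def make_mul(a, b):
    """Build a * b, simplifying where possible."""
    if is_const(a) and is_const(b):
        return a * b                      # c1 * c2 -> one constant
    if is_const(a) and a == 1:
        return b                          # 1 * alpha -> alpha
    if is_const(b) and b == 1:
        return a                          # alpha * 1 -> alpha
    if (is_const(a) and a == 0) or (is_const(b) and b == 0):
        return 0.0                        # assumed rule: alpha * 0 -> 0
    return ("*", a, b)

def make_add(a, b):
    """Build a + b, simplifying where possible."""
    if is_const(a) and is_const(b):
        return a + b                      # c1 + c2 -> one constant
    if is_const(a) and a == 0:
        return b                          # assumed rule: 0 + alpha -> alpha
    if is_const(b) and b == 0:
        return a                          # assumed rule: alpha + 0 -> alpha
    return ("+", a, b)

# A deterministic CPT row (0.0 and 1.0) collapses to a single indicator.
collapsed = make_add(make_mul("y1", 0.0), make_mul("y2", 1.0))
# A non-deterministic row keeps its structure.
kept = make_add(make_mul("y1", 0.5), make_mul("y2", 0.5))
print(collapsed)
print(kept)
```

The point of the example is that determinism in a CPT removes nodes from the compiled circuit at construction time, so no separate pruning pass is needed in this sketch.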

How Does it Work?
1. Generate CPT and Indicator ADDs
2. Perform VE as normal
3. Result is an ADD sink labeled with a pointer to the compilation!

Other Details
ADD Variable Order
Unique Table for ADD nodes
Construction of CPT ADDs
Dealing with multi-valued variables
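The role of a unique table can be illustrated with a small sketch. It assumes binary variables and a fixed variable order, and represents ADD nodes as interned tuples. The `ADDManager` class, `build` and `evaluate` are hypothetical names, and the example function is invented to show context-specific independence (its values echo the .9/.1/.5 used elsewhere in the talk, but the function itself is an assumption).

```python
class ADDManager:
    """Creates reduced ADD nodes, sharing equal nodes via a unique table."""
    def __init__(self):
        self.unique = {}

    def _intern(self, key):
        # Return the canonical object for this key, storing it if new.
        return self.unique.setdefault(key, key)

    def sink(self, value):
        return self._intern(("sink", value))

    def node(self, var, low, high):
        if low is high:
            return low                    # redundant test: skip the node
        return self._intern((var, low, high))

def build(mgr, variables, f, assignment=()):
    """Build the ADD of f by expanding variables in the given order."""
    if len(assignment) == len(variables):
        return mgr.sink(f(*assignment))
    low = build(mgr, variables, f, assignment + (0,))
    high = build(mgr, variables, f, assignment + (1,))
    return mgr.node(variables[len(assignment)], low, high)

def evaluate(node, env):
    """Follow one path from the root to a sink."""
    while node[0] != "sink":
        var, low, high = node
        node = high if env[var] else low
    return node[1]

def f(x, y, z):
    # Invented function: Z matters unless x = 1 and y = 1.
    if x == 1 and y == 1:
        return 0.5
    return 0.9 if z == 0 else 0.1

mgr = ADDManager()
root = build(mgr, ("X", "Y", "Z"), f)
print(len(mgr.unique), "nodes for a function with 8 table rows")
```

Because every node is looked up before it is created, equal subgraphs are represented once, and the `low is high` check can rely on object identity.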

Example: Convert to Symbolic
Order: Y,X
Pr(x): x1 0.1, x2 0.9
Pr(y|x): x1 y1 0.0, x1 y2 1.0, x2 0.5
[Figure: CPT ADD for X, Indicator ADD for X, Indicator ADD for Y, CPT ADD for Y]

Example: Multiply Y
Order: Y,X
[Figure: CPT ADD for X, Indicator ADD for X, Indicator ADD for Y, CPT ADD for Y; the two ADDs involving Y are multiplied]

Example: Sum Out Y
Order: Y,X
[Figure: CPT ADD for X, Indicator ADD for X, and the product over Y, whose sinks are * nodes over y1, y2 and .5]

Example: Multiply X
Order: Y,X
[Figure: CPT ADD for X, Indicator ADD for X, and an ADD over X with a + node over * nodes on y1, y2 and .5]

Example: Multiply X
Order: Y,X
[Figure: CPT ADD for X multiplied with the ADD over X whose sinks carry x1 and x2; caption: "And then these two."]

Example: Sum Out X
Order: Y,X
[Figure: a single ADD over X; its sinks are * nodes over .1, .9, x1, x2 and the subcircuit on y1, y2 and .5]

Example: Compilation Complete!
Order: Y,X
[Figure: final arithmetic circuit with two + nodes, six * nodes, and leaves .1, .9, .5, x1, x2, y1, y2]
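As a sanity check on the worked example, the sketch below compares one plausible reading of the final circuit against the network polynomial computed by brute force. The circuit expression is an assumption reconstructed from the node counts and leaves of the slide, and the function names are hypothetical. The CPT values follow the example, reading the "x2 0.5" entry as 0.5 for both values of Y.

```python
from itertools import product

PR_X = {"x1": 0.1, "x2": 0.9}
# Assumed reading of Pr(y|x): the x2 row is 0.5 for both y1 and y2.
PR_Y_GIVEN_X = {
    ("y1", "x1"): 0.0, ("y2", "x1"): 1.0,
    ("y1", "x2"): 0.5, ("y2", "x2"): 0.5,
}

def compiled_circuit(lam):
    """Assumed shape: two + nodes, six * nodes, shared leaves y2 and .5."""
    branch_x1 = (lam["x1"] * 0.1) * lam["y2"]
    inner_sum = (lam["y1"] * 0.5) + (lam["y2"] * 0.5)
    branch_x2 = (lam["x2"] * 0.9) * inner_sum
    return branch_x1 + branch_x2

def network_polynomial(lam):
    """Brute-force sum over all instantiations of X and Y."""
    total = 0.0
    for x, y in product(("x1", "x2"), ("y1", "y2")):
        total += lam[x] * lam[y] * PR_X[x] * PR_Y_GIVEN_X[(y, x)]
    return total

# Compare the two on every 0/1 setting of the four indicators.
names = ("x1", "x2", "y1", "y2")
max_gap = 0.0
for bits in product((0.0, 1.0), repeat=4):
    lam = dict(zip(names, bits))
    gap = abs(compiled_circuit(lam) - network_polynomial(lam))
    max_gap = max(max_gap, gap)
print(max_gap)
```

Note that the x1 branch contains no term for y1: the deterministic entry Pr(y1|x1) = 0.0 has been simplified away, which is the kind of saving the local-structure slides describe.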

Outline Motivation Review ADD Compilation Experimental Results Conclusion

Offline Inference
| Network | Ace (s) | ADD (s) | Times Better |
|---|---|---|---|
| barley | 8190.2 | 122.8 | 66.7 |
| bm-5-3 | 0.8 | 6.0 | 0.1 |
| diabetes | 1710.0 | 110.3 | 15.5 |
| hailfinder | 0.7 | 1.2 | 0.5 |
| link | – | 699.7 | |
| mm-3-8-3 | 1.5 | 11.9 | |
| mildew | 3125.2 | 218.9 | 14.3 |
| munin1 | 1005.1 | 316.7 | 3.2 |
| munin2 | 198.4 | 31.7 | 6.3 |
| munin3 | 188.4 | 17.6 | 10.7 |
| munin4 | 205.0 | 37.8 | 5.4 |
| pathfinder | 4.9 | 5.8 | 0.9 |
| pigs | 23.1 | 10.0 | 2.3 |
| st-3-2 | 2.4 | 0.2 | |
| water | 3.0 | 20.7 | |

Online Inference
| Network | JT (ms) | ADD (ms) | Times Better |
|---|---|---|---|
| barley | 65,226 | 35,209 | 1.9 |
| bm-5-3 | 89,593 | 83 | 1079.4 |
| diabetes | 29,316 | 20,421 | 1.4 |
| hailfinder | 245 | 70 | 3.5 |
| link | 223,542 | 175,769 | 1.3 |
| mm-3-8-3 | 34,001 | 198 | 171.7 |
| mildew | 10,077 | 4,522 | 2.2 |
| munin1 | 669,915 | 37,451 | 17.9 |
| munin2 | 17,857 | 7,180 | 2.5 |
| munin3 | 13,351 | 4,945 | 2.7 |
| munin4 | 42,754 | 8,683 | 4.9 |
| pathfinder | 1,332 | 102 | 13.1 |
| pigs | 3,020 | 2,814 | 1.1 |
| st-3-2 | 17,536 | 82 | 213.9 |
| water | 16,676 | 251 | 66.4 |

Conclusion
VE limitations:
Multiple queries
Local structure
VE as a basis for compilation:
Solves the multiple query problem
Solves the local structure (overhead) problem
Orders of magnitude more efficient than jointree
Makes body of research more effective
Implications beyond VE

Previous Work: VE and Multiple Queries
Cozman, 2000
Darwiche, 2000
There has been a limited amount of work to enhance VE to answer multiple queries simultaneously, as the jointree algorithm does. For example, we might wish to compute the probability of evidence and a posterior marginal on each variable simultaneously. But this work has not improved performance over jointree. We will see over the course of this talk that we can solve the multiple query problem while at the same time dramatically outperforming jointree when local structure exists in the network.

Global Structure

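Compiling Bayesian Networks
$$
\begin{aligned}
P(e) ={} & \lambda_{a_1}\lambda_{b_1}\lambda_{c_1}\,\theta_{a_1}\theta_{b_1}\theta_{c_1|a_1 b_1} \\
{}+{} & \lambda_{a_1}\lambda_{b_1}\lambda_{c_2}\,\theta_{a_1}\theta_{b_1}\theta_{c_2|a_1 b_1} \\
{}+{} & \lambda_{a_1}\lambda_{b_1}\lambda_{c_3}\,\theta_{a_1}\theta_{b_1}\theta_{c_3|a_1 b_1} \\
{}+{} & \lambda_{a_1}\lambda_{b_2}\lambda_{c_1}\,\theta_{a_1}\theta_{b_2}\theta_{c_1|a_1 b_2} \\
{}+{} & \cdots \\
{}+{} & \lambda_{a_2}\lambda_{b_2}\lambda_{c_3}\,\theta_{a_2}\theta_{b_2}\theta_{c_3|a_2 b_2}
\end{aligned}
$$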
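To make the polynomial concrete, the sketch below evaluates it term by term for a network with roots A and B and child C, matching the variables and value counts in the formula. All parameter values are invented for illustration, and the names `indicators` and `network_polynomial` are hypothetical. Indicators are set to 1 when a value is compatible with the evidence and to 0 otherwise.

```python
from itertools import product

VALUES = {"A": ("a1", "a2"), "B": ("b1", "b2"), "C": ("c1", "c2", "c3")}

# Invented parameters, for illustration only.
THETA_A = {"a1": 0.6, "a2": 0.4}
THETA_B = {"b1": 0.3, "b2": 0.7}
THETA_C = {
    ("a1", "b1"): {"c1": 0.2, "c2": 0.5, "c3": 0.3},
    ("a1", "b2"): {"c1": 0.1, "c2": 0.1, "c3": 0.8},
    ("a2", "b1"): {"c1": 0.7, "c2": 0.2, "c3": 0.1},
    ("a2", "b2"): {"c1": 0.4, "c2": 0.4, "c3": 0.2},
}

def indicators(evidence):
    """lambda_x = 1 if x is compatible with the evidence, else 0."""
    lam = {}
    for var, vals in VALUES.items():
        for v in vals:
            lam[v] = 1.0 if evidence.get(var, v) == v else 0.0
    return lam

def network_polynomial(lam):
    """Sum one term per joint instantiation of A, B and C."""
    total = 0.0
    for a, b, c in product(VALUES["A"], VALUES["B"], VALUES["C"]):
        total += (lam[a] * lam[b] * lam[c]
                  * THETA_A[a] * THETA_B[b] * THETA_C[(a, b)][c])
    return total

p_no_evidence = network_polynomial(indicators({}))
p_c1 = network_polynomial(indicators({"C": "c1"}))
print(p_no_evidence, p_c1)
```

This direct evaluation touches every term, so its cost grows exponentially with the number of variables. That is the motivation for compiling the polynomial into a compact circuit instead of expanding it.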