
S3-SEMINAR ON DATA MINING - BAYESIAN NETWORKS - B. INFERENCE
Master Universitario en Inteligencia Artificial
Concha Bielza, Pedro Larrañaga
Computational Intelligence Group, Departamento de Inteligencia Artificial, Universidad Politécnica de Madrid

Slide 2: Outline
Basic concepts and types of queries.
Inference in Bayesian networks.
Exact inference: brute-force computation, variable elimination algorithm, message passing algorithm.
Approximate inference: probabilistic logic sampling.

Slide 3: Queries: posterior probabilities
Given some evidence e (observations), compute the posterior probability of one or more target variables X:
P(X \mid e) = P(X, e) / P(e)
The answer is a vector of posterior probabilities, one entry per state of X. Other names: probability propagation, belief updating or revision.
[Figure: the alarm network (Burglary, Earthquake, Alarm, WatsonCalls, News), used to answer queries about P.]

Slide 4: Semantically, for any kind of reasoning
Predictive or deductive reasoning (causal inference): predict effects, e.g. P(Symptoms | Disease). The target variable is usually a descendant of the evidence.
Diagnostic reasoning (diagnostic inference): diagnose the causes, e.g. P(Disease | Symptoms). The target variable is usually an ancestor of the evidence.
[Figure: the alarm network queried in both directions.]

Slide 5: More queries: maximum a posteriori (MAP)
Most likely configurations (abductive inference): the event that best explains the evidence.
Total abduction: search for the most likely configuration of all the unobserved variables.
Partial abduction: search for the most likely configuration of a subset of the unobserved variables (the explanation set).
Both can be generalized to the K most likely explanations.
In general, the MAP configuration cannot be computed component-wise with max P(x_i | e); see the sketch below.
[Figure: the alarm network with one or several unobserved query nodes.]
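To make the component-wise caveat concrete, here is a minimal Python sketch with hypothetical numbers: the tuple of individually most likely values differs from the jointly most likely configuration.

```python
# Minimal sketch with invented numbers: the tuple of individually most
# likely values differs from the jointly most likely (MAP) configuration.
joint = {(0, 0): 0.15, (0, 1): 0.32, (1, 0): 0.28, (1, 1): 0.25}  # P(x, y | e)

# Component-wise: maximize each posterior marginal separately.
p_x = {x: sum(joint[(x, y)] for y in (0, 1)) for x in (0, 1)}
p_y = {y: sum(joint[(x, y)] for x in (0, 1)) for y in (0, 1)}
componentwise = (max(p_x, key=p_x.get), max(p_y, key=p_y.get))

# MAP: maximize the joint posterior directly.
map_config = max(joint, key=joint.get)

print(componentwise, joint[componentwise])  # (1, 1) 0.25
print(map_config, joint[map_config])        # (0, 1) 0.32
```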

Slide 6: More queries: maximum a posteriori (MAP)
Use MAP for:
Classification: find the most likely label, given the evidence.
Explanation: find the most likely scenario, given the evidence.

Slide 7: More queries: decision-making
Optimal decisions (of maximum expected utility), computed with influence diagrams.

Slide 8: Brute-force computation of P(X|e)
First, consider P(X_i), without observed evidence e. For a BN with n variables, each with its CPT P(X_j | Pa(X_j)):
P(x_i) = \sum_{x_1,\dots,x_n \setminus x_i} \prod_{j=1}^{n} P(x_j \mid pa(x_j))
Conceptually simple but computationally complex: this amounts to computing the joint probability distribution (JPD), which is often very inefficient and even computationally intractable.
CHALLENGE: without computing the JPD, exploit the factorization encoded by the BN and the distributive law (local computations).
Exact inference [Pearl'88; Lauritzen & Spiegelhalter'88]. A toy sketch of the brute-force approach follows below.
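As a concrete illustration, the sketch below sums the full joint of a hypothetical three-node chain A -> B -> C to answer P(C | A=1); the CPT numbers are invented for the example.

```python
# Toy brute-force sketch (hypothetical chain A -> B -> C with invented CPTs):
# answer P(C | A=1) by summing the full joint over all configurations.
p_a = {0: 0.6, 1: 0.4}                               # P(A)
p_b_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # P(B | A), keyed [a][b]
p_c_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}   # P(C | B), keyed [b][c]

def joint(a, b, c):
    # BN factorization of the JPD: P(a, b, c) = P(a) P(b | a) P(c | b).
    return p_a[a] * p_b_a[a][b] * p_c_b[b][c]

# Sum the joint over every configuration consistent with the evidence A = 1.
unnorm = {c: sum(joint(1, b, c) for b in (0, 1)) for c in (0, 1)}  # P(C, A=1)
z = sum(unnorm.values())                             # normalizer: P(A=1)
print({c: p / z for c, p in unnorm.items()})         # P(C | A=1)
```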

Slide 9: Improving brute-force
Use the JPD factorization and the distributive law. Computing the marginal directly in a network with five binary variables means building the JPD, a table with 32 entries.

Slide 10: Improving brute-force
Arrange the computations effectively by moving some additions inward: first the sums over X_5 and X_3, then the sum over X_4. The biggest intermediate table now has 8 entries, like the largest CPT in the BN.
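The slides' five-variable network is not reproduced in this transcript; as a generic worked example of moving the additions inward, take a chain X_1 -> X_2 -> X_3 and the query P(x_3):

```latex
P(x_3) = \sum_{x_1}\sum_{x_2} P(x_1)\,P(x_2 \mid x_1)\,P(x_3 \mid x_2)
       = \sum_{x_2} P(x_3 \mid x_2)\;\underbrace{\sum_{x_1} P(x_1)\,P(x_2 \mid x_1)}_{f(x_2)}
```

The naive sum runs over a three-variable table; after pushing the sum over x_1 inward, no intermediate table involves more than two variables, which is the same effect the slide describes for the five-variable example.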

Slide 11: Variable elimination algorithm
Computes the marginal of ONE variable X_i.
Input: a list with all the functions (CPTs) of the problem.
Select an elimination order σ of all variables except X_i.
For each X_k in σ, let F be the set of functions in the list that involve X_k. Eliminate X_k: combine (multiply) all the functions in F and marginalize X_k out, obtaining a new function f'. Delete F from the list and add f' to the list.
Output: the combination (multiplication) of all functions in the current list. A sketch follows below.
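Below is a minimal, self-contained Python sketch of variable elimination for binary variables; the factor representation and the helper names multiply, marginalize, and eliminate are ours, not the authors' implementation.

```python
# Minimal variable-elimination sketch (binary variables, hypothetical toy
# network). A factor is (variables, table), with the table keyed by tuples
# of values in the same order as `variables`.
from itertools import product

def multiply(f, g):
    fv, ft = f
    gv, gt = g
    variables = fv + [v for v in gv if v not in fv]
    table = {}
    for vals in product((0, 1), repeat=len(variables)):  # binary only
        assign = dict(zip(variables, vals))
        table[vals] = (ft[tuple(assign[v] for v in fv)]
                       * gt[tuple(assign[v] for v in gv)])
    return variables, table

def marginalize(f, var):
    fv, ft = f
    keep = [v for v in fv if v != var]
    table = {}
    for vals, p in ft.items():
        key = tuple(val for val, name in zip(vals, fv) if name != var)
        table[key] = table.get(key, 0.0) + p
    return keep, table

def eliminate(factors, order):
    # Assumes every variable in `order` appears in at least one factor.
    for var in order:
        involved = [f for f in factors if var in f[0]]
        rest = [f for f in factors if var not in f[0]]
        prod = involved[0]
        for f in involved[1:]:
            prod = multiply(prod, f)
        factors = rest + [marginalize(prod, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# Chain A -> B -> C with invented CPTs; query P(C) by eliminating A, then B.
fa = (['A'], {(0,): 0.6, (1,): 0.4})
fb = (['A', 'B'], {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})
fc = (['B', 'C'], {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6})
print(eliminate([fa, fb, fc], ['A', 'B']))  # (['C'], {(0,): 0.65, (1,): 0.35})
```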

Slide 12: Variable elimination algorithm
To obtain the posterior of several variables, repeat the algorithm for each target variable.

Slide 13: Example with the Asia network
Nodes: Visit to Asia (A), Smoking (S), Tuberculosis (T), Lung Cancer (L), Tuberculosis or Lung Cancer (E), Bronchitis (B), X-Ray (X), Dyspnea (D).

Slide 14: Brute-force approach
Compute P(D) by brute force, using the Asia factorization:
P(d) = \sum_{a,s,t,l,e,b,x} P(a)\,P(s)\,P(t \mid a)\,P(l \mid s)\,P(b \mid s)\,P(e \mid t,l)\,P(x \mid e)\,P(d \mid e,b)
Complexity is exponential in the number of variables (with the base given by the number of states of each variable).

Slide 15: [VE steps worked on the Asia network. Note: an intermediate factor created when summing out a variable is not necessarily a probability term.]

Slide 17: Variable elimination algorithm
The biggest factor has size 8; the local computations come from moving the additions inward.
The elimination ordering matters: complexity is exponential in the maximum number of variables appearing in a factor of the summation. Finding an optimal (minimum cost) ordering is NP-hard [Arnborg et al.'87], so heuristics are used to find good sequences.

Slide 18: Message passing algorithm
Operates by passing messages among the nodes of the network. Nodes act as processors that receive, compute, and send information. Such methods are called propagation algorithms.
Clique tree propagation is based on the same principle as VE but adds a sophisticated caching strategy that:
computes the posterior distribution of all variables in twice the time it takes to compute that of one single variable;
works in an intuitively appealing fashion, namely message propagation.

Slide 19: Basic operations for a node
Ask-info(i, j): the target node i asks node j for information; it does so for all its neighbors j, and they do the same until there are no more nodes to ask.
Send-message(i, j): each node sends a message back to the node that asked it for information, until the target node is reached. A message is defined over the intersection of the domains of f_i and f_j and is computed as
\mu_{j \to i} = \sum_{dom(f_j) \setminus dom(f_i)} f_j \prod_{k \in ne(j) \setminus \{i\}} \mu_{k \to j}
Finally, we calculate locally at each node i: the target node combines all received information with its own function and marginalizes onto the target variable:
P(X_i) \propto \sum_{dom(f_i) \setminus \{X_i\}} f_i \prod_{j \in ne(i)} \mu_{j \to i}

Slide 20: Procedure for X_2
[Figure: CollectEvidence, the sequence of Ask-info calls issued when computing the marginal of X_2.]

Slide 21: P(X_2) as a message passing algorithm
[Figure: the messages sent when computing P(X_2).]

Slide 22: VE as a message passing algorithm
There is a direct correspondence between the factors created by VE and the messages sent during propagation. [Figure: the two computations side by side.]

Slide 23: Computing P(X_i|e) for all (unobserved) variables at a time
We could perform the previous process for each node, but many messages would be repeated. Instead, we can use two rounds of messages, as follows (see the sketch after this list):
Select a node as the root (or pivot).
Collect evidence: pass messages from the leaves toward the root, as in VE.
Distribute evidence: pass messages from the root back toward the leaves.
Calculate the marginal distribution at each node by local computation, i.e., using its incoming messages.
This algorithm never constructs tables larger than those in the BN.
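A minimal sketch of the two-round scheme on the toy chain A -> B -> C with hypothetical CPTs and no evidence: one collect sweep toward the root C and one distribute sweep back yield every marginal.

```python
# Two-pass (collect/distribute) sketch on a hypothetical chain A -> B -> C.
p_a = {0: 0.6, 1: 0.4}                                         # P(A)
p_b_a = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}   # P(B | A)
p_c_b = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}   # P(C | B)

# Collect toward the root C: forward messages along the chain.
msg_a_b = {b: sum(p_a[a] * p_b_a[(a, b)] for a in (0, 1)) for b in (0, 1)}
msg_b_c = {c: sum(msg_a_b[b] * p_c_b[(b, c)] for b in (0, 1)) for c in (0, 1)}

# Distribute from the root back: backward messages (with no evidence they
# just sum CPT rows, so they equal 1 here).
msg_c_b = {b: sum(p_c_b[(b, c)] for c in (0, 1)) for b in (0, 1)}
msg_b_a = {a: sum(p_b_a[(a, b)] * msg_c_b[b] for b in (0, 1)) for a in (0, 1)}

# Each marginal combines the local function with its incoming messages.
p_c = msg_b_c                                                  # root marginal
p_b = {b: msg_a_b[b] * msg_c_b[b] for b in (0, 1)}
p_a_marg = {a: p_a[a] * msg_b_a[a] for a in (0, 1)}
print(p_a_marg, p_b, p_c)  # all three marginals from two sweeps
```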

Slide 24: Message passing algorithm
[Figure: first sweep, CollectEvidence toward the root node; second sweep, DistributeEvidence from the root back to the leaves.]

Slide 25: Networks with loops
If the network is not a polytree, the algorithm does not work:
The independence assumptions applied in the algorithm no longer hold (it is no longer true that any node separates the graph into two unconnected parts, i.e., polytrees).
Requests and messages go around a cycle indefinitely: information travels through two paths and is counted twice.
Alternatives?

Slide 26: Complexity
The complexity of propagation algorithms in polytrees (networks without loops, i.e., without cycles in the underlying undirected graph) is linear in the size (nodes + arcs) of the network, whereas brute force is exponential.
Exact inference in multiply connected BNs is an NP-complete problem [Cooper 1990].

Slide 27: Alternative: clustering methods [Lauritzen & Spiegelhalter'88]
This is the method implemented in the main BN software packages: transform the BN into a probabilistically equivalent polytree by merging nodes, removing the multiple paths between two nodes.
Example: metastatic cancer (M) is a possible cause of brain tumors (B) and an explanation for increased total serum calcium (S). In turn, either of these could explain a patient falling into a coma (C). Severe headache (H) is also associated with brain tumors.
Create a new node Z that combines S and B. The states of Z are {tt, tf, ft, ff}, and the new parameters follow from conditional independence (see the sketch below):
P(Z | M) = P(S | M) P(B | M), since S and B are conditionally independent given M;
P(H | Z) = P(H | B), since H is conditionally independent of S given B.
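A small Python sketch of how the merged node's CPT arises; the CPT numbers are hypothetical, chosen only to illustrate P(Z | M) = P(S | M) P(B | M).

```python
# Merged node Z = (S, B): its CPT is the product P(S | M) P(B | M), because
# S and B are conditionally independent given M. Numbers are invented.
from itertools import product

p_s_m = {True: 0.8, False: 0.2}   # P(S = true | M)
p_b_m = {True: 0.2, False: 0.05}  # P(B = true | M)

def bern(p, value):
    # Probability of `value` under a Bernoulli with success probability p.
    return p if value else 1 - p

p_z_m = {m: {(s, b): bern(p_s_m[m], s) * bern(p_b_m[m], b)
             for s, b in product((True, False), repeat=2)}
         for m in (True, False)}
print(p_z_m[True])  # distribution over the four states of Z given M = true
```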

Slide 28: Alternative: clustering methods
Steps of the JUNCTION TREE CLUSTERING ALGORITHM (a small sketch of step 1 follows below):
1. Moralize the BN.
2. Triangulate the moral graph and obtain the cliques.
3. Create the junction tree and its separators.
4. Compute the new parameters.
5. Run the message passing algorithm.
Steps 1-4 form the COMPILATION phase: transforming the BN into a polytree (slow, and memory-hungry if the graph is dense, but performed only once). Step 5 is belief updating (fast).
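As a sketch of step 1, moralization takes only a few lines of plain Python: connect ("marry") every pair of parents of each node and drop edge directions. The DAG below is the metastatic-cancer example from the previous slide.

```python
# Moralization sketch: marry co-parents, then drop arc directions.
parents = {  # the metastatic-cancer DAG from the previous slide
    'M': [], 'S': ['M'], 'B': ['M'], 'C': ['S', 'B'], 'H': ['B'],
}

moral_edges = set()
for child, pars in parents.items():
    for p in pars:                      # original arcs, made undirected
        moral_edges.add(frozenset((p, child)))
    for i, p in enumerate(pars):        # marry each pair of co-parents
        for q in pars[i + 1:]:
            moral_edges.add(frozenset((p, q)))

print(sorted(tuple(sorted(e)) for e in moral_edges))
# [('B', 'C'), ('B', 'H'), ('B', 'M'), ('B', 'S'), ('C', 'S'), ('M', 'S')]
```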

Slide 29: Approximate inference
Why? Because exact inference is intractable (NP-complete) with large (40+ variables) and densely connected BNs: the cliques associated with the junction tree algorithm, or the intermediate factors in the VE algorithm, grow in size, generating an exponential blowup in the number of computations performed.
Both deterministic and stochastic simulation methods are used to find approximate answers.

Slide 30: Stochastic simulation
Uses the network to generate a large number of cases (full instantiations) from the network distribution. P(X_i | e) is then estimated by counting observed frequencies in these samples; by the Law of Large Numbers, the estimate converges to the exact probability as more cases are generated.
Approximate propagation in BNs to within an arbitrary tolerance or accuracy is itself an NP-complete problem. In practice, if e is not too unlikely, convergence is quick.

Slide 31: Probabilistic logic sampling [Henrion'88]
A forward sampling algorithm: given an ancestral ordering of the nodes (parents before children), sample each X once its parents have been sampled, i.e., from the root nodes down to the leaves, using the conditional probabilities given the sampled values of the parents. When all the nodes have been visited, we have a case: an instantiation of all the nodes in the BN.
Repeat, and use the observed frequencies to estimate P(X_i | e), as in the sketch below.
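A minimal sketch of probabilistic logic sampling with rejection on the toy chain used earlier (hypothetical CPTs): samples inconsistent with the evidence are discarded, and P(X_i | e) is estimated from the rest.

```python
# Probabilistic logic sampling sketch on the hypothetical chain A -> B -> C:
# forward-sample in ancestral order, keep samples matching the evidence,
# and estimate P(C = 1 | A = 1) from observed frequencies.
import random

random.seed(0)
p_a = 0.4                    # P(A = 1)
p_b_a = {0: 0.3, 1: 0.8}     # P(B = 1 | A)
p_c_b = {0: 0.1, 1: 0.6}     # P(C = 1 | B)

def sample_case():
    # Ancestral order: each node is sampled after its parents.
    a = int(random.random() < p_a)
    b = int(random.random() < p_b_a[a])
    c = int(random.random() < p_c_b[b])
    return a, b, c

kept = [c for a, b, c in (sample_case() for _ in range(100_000)) if a == 1]
print(sum(kept) / len(kept))  # estimate of P(C = 1 | A = 1); true value 0.5
```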

Slides 32-37: Software
[Screenshots of BN software packages; the links recoverable from the transcript are:]
GeNIe: genie.sis.pitt.edu
Kevin Murphy's Bayes Net Toolbox page: http.cs.berkeley.edu/~murphyk/
Elvira: leo.ugr.es/elvira
