Bayesian Networks Chapter 2 (Duda et al.) – Section 2.11 CS479/679 Pattern Recognition Dr. George Bebis

Statistical Dependence Between Variables – Representing high-dimensional densities is very challenging since we need to estimate many parameters (e.g., $k^n$ for $n$ variables with $k$ states each). Often, the only knowledge we have about a distribution is which variables are (or are not) dependent. Such dependencies can be represented efficiently using Bayesian Networks (or Belief Networks).
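To make the savings concrete, here is a quick worked comparison (an added illustration, not from the slides) for $n = 20$ binary variables ($k = 2$):

```latex
% Full joint distribution: one parameter per configuration,
% minus one for normalization.
2^{20} - 1 = 1{,}048{,}575 \ \text{parameters}

% Bayesian network in which each node has at most two parents:
% each binary node x_i needs 2^{|\pi_i|} conditional probabilities.
\sum_{i=1}^{20} 2^{|\pi_i|} \;\le\; 20 \cdot 2^2 = 80 \ \text{parameters}
```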

Example of Dependencies Represent the state of an automobile: – Engine temperature – Brake fluid pressure – Tire air pressure – Wire voltages Causally related variables – Engine temperature – Coolant temperature NOT causally related variables – Engine oil pressure – Tire air pressure

Bayesian Net Applications Microsoft: Answer Wizard, Print Troubleshooter US Army: SAIP (Battalion Detection from SAR, IR etc.) NASA: Vista (DSS for Space Shuttle) GE: Gems (real-time monitoring of utility generators)

Definitions and Notation A Bayesian net is a Directed Acyclic Graph (DAG). Each node represents a variable. Each variable assumes certain states (i.e., values).

Relationships Between Nodes A link joining two nodes is directional and represents a causal influence (e.g., A influences X or X depends on A) Influences could be direct or indirect (e.g., A influences X directly and A influences C indirectly through X).

Prior / Conditional Probabilities Each variable is associated with prior or conditional probabilities (discrete or continuous).

Markov Property “Each node is conditionally independent of its ancestors given its parents.” Example: the parents $\pi_1$ of node $x_1$ (shown in the omitted figure).

Computing Joint Probabilities Using the Markov Property Using the chain rule, the joint probability of a set of variables $x_1, x_2, \dots, x_n$ is given as:

$P(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} P(x_i \mid x_{i-1}, \dots, x_1)$

Using the Markov property (i.e., node $x_i$ is conditionally independent of its ancestors given its parents $\pi_i$), we have:

$P(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} P(x_i \mid \pi_i)$

which is much simpler!

Example We can compute the probability of any configuration of states in the joint density, e.g.:

$P(a_3, b_1, x_2, c_3, d_2) = P(a_3)\,P(b_1)\,P(x_2 \mid a_3, b_1)\,P(c_3 \mid x_2)\,P(d_2 \mid x_2) = 0.25 \times 0.6 \times 0.4 \times 0.5 \times 0.4 = 0.012$
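As a minimal sketch (not from the slides), the same computation in Python. Only the CPT entries quoted above are filled in; the full tables from the omitted network figure are assumed but not shown:

```python
# Priors and conditionals for the a-b-x-c-d network, keyed by state(s).
# Only the entries needed for this one query are included.
P_a = {"a3": 0.25}                            # P(a)
P_b = {"b1": 0.60}                            # P(b)
P_x_given_ab = {("x2", "a3", "b1"): 0.40}     # P(x | a, b)
P_c_given_x = {("c3", "x2"): 0.50}            # P(c | x)
P_d_given_x = {("d2", "x2"): 0.40}            # P(d | x)

def joint(a, b, x, c, d):
    """Joint probability via the Markov factorization of the network."""
    return (P_a[a] * P_b[b]
            * P_x_given_ab[(x, a, b)]
            * P_c_given_x[(c, x)]
            * P_d_given_x[(d, x)])

print(joint("a3", "b1", "x2", "c3", "d2"))    # 0.012
```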

Fundamental Problems in Bayesian Nets Evaluation (inference): Given the values of the observed variables (evidence), estimate the values of the non-observed variables. Learning: Given training data and prior information (e.g., expert knowledge, causal relationships), estimate the network structure, or the parameters (probabilities), or both.

Inference Example: Medical Diagnosis Uppermost nodes: biological agents (bacteria, virus) Intermediate nodes: diseases Lowermost nodes: symptoms Goal: given some evidence (biological agents, symptoms), find the most likely disease. (In the omitted figure, causes are at the top and effects at the bottom.)

Evaluation (Inference) Problem In general, if X denotes the query variables and e denotes the evidence, then

$P(X \mid e) = \alpha P(X, e) = \alpha \sum_{y} P(X, e, y)$

where $\alpha = 1/P(e)$ is a constant of proportionality and y ranges over the unobserved (hidden) variables.
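A minimal inference-by-enumeration sketch (not from the slides). The three-node chain Cloudy → Rain → WetGrass and all CPT values below are hypothetical, chosen only to illustrate $P(X \mid e) = \alpha \sum_y P(X, e, y)$:

```python
# Hypothetical CPTs for the chain Cloudy -> Rain -> WetGrass.
P_cloudy = {True: 0.5, False: 0.5}
P_rain_given_cloudy = {(True, True): 0.8, (True, False): 0.2,
                       (False, True): 0.2, (False, False): 0.8}
P_wet_given_rain = {(True, True): 0.9, (True, False): 0.1,
                    (False, True): 0.1, (False, False): 0.9}

def joint(cloudy, rain, wet):
    """Markov factorization of the three-node chain."""
    return (P_cloudy[cloudy]
            * P_rain_given_cloudy[(rain, cloudy)]
            * P_wet_given_rain[(wet, rain)])

def query_rain(wet_evidence):
    """P(Rain | WetGrass = wet_evidence), summing out the hidden Cloudy."""
    unnormalized = {r: sum(joint(c, r, wet_evidence) for c in (True, False))
                    for r in (True, False)}
    alpha = 1.0 / sum(unnormalized.values())   # alpha = 1 / P(e)
    return {r: alpha * p for r, p in unnormalized.items()}

print(query_rain(True))   # {True: 0.9, False: 0.1} for these made-up CPTs
```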

Example Classify a fish given that the fish is light ($c_1$) and was caught in the south Atlantic ($b_2$); there is no evidence about the time of year the fish was caught or about its thickness.

Example (cont’d)

Similarly, $P(x_2 \mid c_1, b_2)$ is computed up to the same constant $\alpha$ (the arithmetic appears in the omitted slide). Normalizing the two probabilities (not strictly necessary): $P(x_1 \mid c_1, b_2) + P(x_2 \mid c_1, b_2) = 1$ gives $\alpha = 1/0.18$, so $P(x_1 \mid c_1, b_2) = 0.73$ and $P(x_2 \mid c_1, b_2) = 0.27$; the fish is most likely a salmon.

Evaluation (Inference) Problem (cont’d) Exact inference is an NP-hard problem because the number of terms in the summations (or integrals) for discrete (or continuous) variables grows exponentially with the number of variables. For some restricted classes of networks (e.g., singly connected networks, in which there is at most one path between any two nodes), exact inference can be performed in time linear in the number of nodes.

Evaluation (Inference) Problem (cont’d) For networks that are not singly connected, exact inference quickly becomes intractable, so approximate inference methods are typically used in practice (a small sampling sketch follows the list): – Sampling (Monte Carlo) methods – Variational methods – Loopy belief propagation
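A minimal rejection-sampling sketch (not from the slides), reusing the hypothetical Cloudy → Rain → WetGrass chain from the earlier enumeration example. It estimates $P(\text{Rain} \mid \text{WetGrass})$ by sampling from the prior and discarding samples that contradict the evidence:

```python
import random

def sample_chain():
    """Draw one full sample (Cloudy, Rain, WetGrass) from the prior."""
    cloudy = random.random() < 0.5
    rain = random.random() < (0.8 if cloudy else 0.2)
    wet = random.random() < (0.9 if rain else 0.1)
    return cloudy, rain, wet

def estimate_rain_given_wet(n_samples=100_000):
    """Rejection sampling: keep only samples consistent with WetGrass=true."""
    kept = [rain for _, rain, wet in (sample_chain() for _ in range(n_samples))
            if wet]
    return sum(kept) / len(kept)        # fraction of kept samples with Rain=true

print(estimate_rain_given_wet())        # approx 0.90 (the exact answer above)
```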

Another Example You have a new burglar alarm installed at home. It is fairly reliable at detecting burglary, but also sometimes responds to minor earthquakes. You have two neighbors, Ali and Veli, who promised to call you at work when they hear the alarm.

Another Example (cont’d) Ali always calls when he hears the alarm, but sometimes confuses the telephone ringing with the alarm and calls then too. Veli likes loud music and sometimes misses the alarm. Design a Bayesian network to estimate the probability of a burglary given some evidence.

Another Example (cont’d) What are the system variables? – Alarm – Causes: Burglary, Earthquake – Effects: Ali calls, Veli calls

Another Example (cont’d) What are the conditional dependencies among them? – Burglary (B) and earthquake (E) directly affect the probability of the alarm (A) going off – Whether or not Ali calls (AC) or Veli calls (VC) depends on the alarm.

Another Example (cont’d) (The slide shows the resulting network, with B and E as parents of A, and A as the parent of AC and VC, together with its conditional probability tables; figure omitted.)

What is the probability that the alarm has sounded but neither a burglary nor an earthquake has occurred, and both Ali and Veli call?
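Using the Markov factorization of the network, this query is a single product. The numbers below assume the CPT values from the standard textbook version of this alarm network (which the omitted figure presumably matches); they are an illustration, not values quoted on the slide:

```latex
% Assumed textbook CPTs: P(B)=0.001, P(E)=0.002,
% P(A | ~B, ~E)=0.001, P(AC | A)=0.90, P(VC | A)=0.70.
P(AC, VC, A, \neg B, \neg E)
  = P(AC \mid A)\, P(VC \mid A)\, P(A \mid \neg B, \neg E)\, P(\neg B)\, P(\neg E)
  = 0.90 \times 0.70 \times 0.001 \times 0.999 \times 0.998
  \approx 0.00063
```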

Another Example (cont’d) What is the probability that there is a burglary given that Ali calls? What about if both Veli and Ali call?
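Both queries are answered by enumeration, summing out the hidden variables E and A. As an illustration under the same assumed textbook CPTs, the two-caller query works out to the well-known result:

```latex
% alpha = 1 / P(AC, VC); E and A are summed out.
P(B \mid AC, VC)
  = \alpha \sum_{e} \sum_{a}
      P(B)\, P(e)\, P(a \mid B, e)\, P(AC \mid a)\, P(VC \mid a)
  \approx 0.284
```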

Naïve Bayesian Network Assuming that the features are conditionally independent given the class, the class-conditional density simplifies as follows:

$P(x_1, x_2, \dots, x_n \mid \omega) = \prod_{i=1}^{n} P(x_i \mid \omega)$

This sometimes works well in practice despite the strong assumption behind it. (The omitted figure shows the corresponding network: a single class node $\omega$ with an arrow to each feature node $x_i$.)
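As a minimal sketch (not from the slides), a discrete naïve Bayes classifier over made-up fish features; the class priors and per-feature likelihood tables below are hypothetical, chosen only to show the factored form:

```python
import math

# Hypothetical priors P(class) and likelihoods P(feature | class).
priors = {"salmon": 0.6, "sea_bass": 0.4}
likelihoods = {
    "salmon":   {"light": 0.7, "thin": 0.6},
    "sea_bass": {"light": 0.3, "thin": 0.2},
}

def classify(features):
    """Posterior P(class | features) under the naive Bayes factorization."""
    scores = {c: priors[c] * math.prod(likelihoods[c][f] for f in features)
              for c in priors}
    alpha = 1.0 / sum(scores.values())      # normalize: alpha = 1 / P(e)
    return {c: alpha * s for c, s in scores.items()}

print(classify(["light", "thin"]))  # salmon wins under these made-up numbers
```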