Bayesian Networks
What is the likelihood of X given evidence E, i.e. P(X|E) = ?

Issues
Representational power
– allows for unknown, uncertain information
Inference
– Question: what is the probability of X if E is true?
– Processing: in general, exponential
Acquisition or learning
– network structure: human input
– probabilities: data + learning

Bayesian Network
A directed acyclic graph (DAG):
– Nodes are random variables (RVs)
– Edges denote dependencies
– Root nodes (nodes without predecessors) have a prior probability table
– Non-root nodes have conditional probabilities given all their predecessors (parents)

Bayes Net Example: Structure
Burglary and Earthquake are parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls.
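
A minimal sketch of this structure in Python (the dictionary name and encoding are illustrative choices, not from the slides): each variable maps to the list of its parents in the DAG.

# Hypothetical encoding of the example network's structure:
# each variable maps to the list of its parents in the DAG.
parents = {
    "Burglary":   [],
    "Earthquake": [],
    "Alarm":      ["Burglary", "Earthquake"],
    "JohnCalls":  ["Alarm"],
    "MaryCalls":  ["Alarm"],
}

# Root nodes are exactly those without predecessors.
roots = [v for v, ps in parents.items() if not ps]
print(roots)  # ['Burglary', 'Earthquake']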

Probabilities
The structure dictates which probabilities are needed:
P(B) = .001, P(-B) = .999; P(E) = .002, P(-E) = .998
P(A|B&E) = .95, P(A|B&-E) = .94, P(A|-B&E) = .29, P(A|-B&-E) = .001
P(JC|A) = .90, P(JC|-A) = .05
P(MC|A) = .70, P(MC|-A) = .01
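
Continuing the sketch, the tables above can be written down directly. Since all variables are boolean, only P(X = true | parents) needs to be stored; the false case is its complement. The `cpt` dictionary below is an illustrative encoding, not from the slides.

# Hypothetical CPT encoding of the slide's numbers: for each variable,
# map an assignment of its parents (a tuple of booleans, in the order
# listed in the `parents` dictionary) to P(variable = True | parents).
cpt = {
    "Burglary":   {(): 0.001},
    "Earthquake": {(): 0.002},
    "Alarm": {  # parents: (Burglary, Earthquake)
        (True,  True):  0.95,
        (True,  False): 0.94,
        (False, True):  0.29,
        (False, False): 0.001,
    },
    "JohnCalls": {(True,): 0.90, (False,): 0.05},  # parent: Alarm
    "MaryCalls": {(True,): 0.70, (False,): 0.01},  # parent: Alarm
}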

Joint Probability Yields All
An event = fully specified values for all RVs.
Probability of an event:
P(x1, x2, ..., xn) = P(x1|Parents(X1)) * ... * P(xn|Parents(Xn))
E.g. P(j&m&a&-b&-e) = P(j|a) * P(m|a) * P(a|-b&-e) * P(-b) * P(-e)
= .9 * .7 * .001 * .999 * .998 ≈ .000628
Do this for all events and then sum as needed. Yields the exact probability (assuming the tables are right).
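
A small helper, reusing the hypothetical `parents` and `cpt` dictionaries from the sketches above, that evaluates this product for one fully specified event:

def event_probability(assignment, parents, cpt):
    """P(x1, ..., xn) = product over i of P(xi | Parents(Xi)).

    `assignment` maps every variable name to True/False."""
    p = 1.0
    for var, value in assignment.items():
        parent_values = tuple(assignment[pa] for pa in parents[var])
        p_true = cpt[var][parent_values]
        p *= p_true if value else (1.0 - p_true)
    return p

# P(j & m & a & -b & -e) from the slide:
event = {"JohnCalls": True, "MaryCalls": True, "Alarm": True,
         "Burglary": False, "Earthquake": False}
print(event_probability(event, parents, cpt))  # ~0.000628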

Many Questions
With 5 boolean variables, the joint probability has 2^5 = 32 entries, one for each event.
A query corresponds to the sum of a subset of these entries, hence 2^(2^5) = 2^32 possible queries: about 4 billion.

Probability Calculation Cost
With 5 boolean variables the full joint needs 2^5 entries; in general, 2^n entries for n booleans.
For a Bayes net, we only need tables for the conditional probabilities and priors.
If each node has at most k inputs and there are n RVs, then at most n * 2^k table entries are needed.
Both data and computation are reduced.
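
A quick arithmetic check of the two counts (illustrative only; k is the maximum number of parents per node):

def joint_entries(n):
    # Full joint distribution over n boolean variables.
    return 2 ** n

def bayes_net_entries(n, k):
    # Upper bound on table entries when every node has at most k parents.
    return n * 2 ** k

print(joint_entries(5), bayes_net_entries(5, 2))    # 32 vs 20
print(joint_entries(30), bayes_net_entries(30, 2))  # 1073741824 vs 120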

Example Computation
Method: transform the query until every factor matches a table entry.
P(Burglary|Alarm) = P(B|A) = P(A|B) * P(B) / P(A)
P(A|B) = P(A|B,E) * P(E) + P(A|B,~E) * P(~E)
P(B), P(E), P(A|B,E), and P(A|B,~E) are table entries; P(A) is obtained the same way, by also summing out B. Done: plug and chug.
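
The same plug-and-chug, written out with the hypothetical `cpt` dictionary from the earlier sketch; P(A) is obtained by additionally summing out Burglary:

def p_alarm_given_burglary(b, cpt):
    # P(A | B=b) = sum over E of P(A | B=b, E=e) * P(E=e)
    pE = cpt["Earthquake"][()]
    return (cpt["Alarm"][(b, True)] * pE
            + cpt["Alarm"][(b, False)] * (1 - pE))

pB = cpt["Burglary"][()]
pA_given_B  = p_alarm_given_burglary(True, cpt)
pA_given_nB = p_alarm_given_burglary(False, cpt)

# P(A), by summing out Burglary as well.
pA = pA_given_B * pB + pA_given_nB * (1 - pB)

# Bayes' rule: P(B | A) = P(A | B) * P(B) / P(A)
print(pA_given_B * pB / pA)  # ~0.374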

Query Types
Diagnostic: from effects to causes
– P(Burglary | JohnCalls)
Causal: from causes to effects
– P(JohnCalls | Burglary)
Explaining away: multiple causes for one effect
– P(Burglary | Alarm and Earthquake)
Everything else

Approximate Inference
Simple sampling (logic sampling):
– Use the Bayes network as a generative model.
– E.g. generate a million or more samples, assigning variables in topological order.
– The samples are generated with the appropriate distribution.
– Then use the samples to estimate probabilities.

Logic Sampling: Simulation
Query: P(j&m&a&-b&-e)
Topologically sort the variables, i.e.
– any order that preserves the partial order
– e.g. B, E, A, MC, JC
Use the probability tables, in that order, to set values:
– e.g. P(B = t) = .001 => create a world with B true once in a thousand samples.
– Use the values of B and E to set A, then MC and JC.
Yields an estimate (a count out of the 1 million samples) rather than the exact probability.
Generally a huge number of simulations is needed for small probabilities.
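
A sketch of logic (prior) sampling under the same assumptions, again reusing the hypothetical `parents` and `cpt` dictionaries; a million samples may take a while in pure Python:

import random

def prior_sample(parents, cpt, order):
    """Draw one complete event, assigning variables in topological order."""
    sample = {}
    for var in order:
        parent_values = tuple(sample[pa] for pa in parents[var])
        sample[var] = random.random() < cpt[var][parent_values]
    return sample

order = ["Burglary", "Earthquake", "Alarm", "MaryCalls", "JohnCalls"]
query = {"JohnCalls": True, "MaryCalls": True, "Alarm": True,
         "Burglary": False, "Earthquake": False}

n = 1_000_000
hits = sum(prior_sample(parents, cpt, order) == query for _ in range(n))
print(hits / n)  # roughly 0.0006, close to the exact 0.000628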

Sampling -> Probabilities
Generate examples with the proper probability distribution.
Use the topological ordering of the nodes to construct events.
Finally, count to yield an estimate of the exact probability.

Sensitivity Analysis: Confidence of the Estimate
Given N samples of which k are "heads", how many samples are needed to be 99% certain that k/N is within .01 of the true p?
From statistics: for a binomial, mean = Np and variance = Npq, so the standard error of k/N is sqrt(pq/N).
For confidence .99, t = 3.25 (from a table), so we require t * sqrt(pq/N) <= .01.
With the worst case pq = 1/4 this gives N > 26,000.
But correct probabilities are not needed, just the correct ordering.
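
A one-line version of that bound, under the worst-case assumption pq = 1/4:

from math import ceil

def samples_needed(t, eps, pq=0.25):
    # Require t * sqrt(p*q / N) <= eps  =>  N >= (t / eps)**2 * p*q.
    # pq = 0.25 is the worst case (p = q = 0.5).
    return ceil((t / eps) ** 2 * pq)

print(samples_needed(t=3.25, eps=0.01))  # 26407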

Lymphoma Diagnosis: the Pathfinder Systems
60 diseases, 130 features
– Pathfinder I: rule-based; performance OK
– Pathfinder II: used MYCIN-style confidence factors; better
– Pathfinder III: Bayes net; best so far
– Pathfinder IV: better Bayes net plus utility theory; outperformed experts and solved the combination-of-expertise problem

Summary
Bayes nets are easier to construct than rule-based expert systems
– years for rules, days for the random variables and structure
Probability theory provides a sound basis for decisions
– correct probabilities are still a problem
Many diagnostic applications
Explanation is less clear: use the strongest influences