CSE 415 -- (c) S. Tanimoto, 2008
Bayes Nets: Probabilistic Reasoning With Bayes’ Rule

Slide 1: Probabilistic Reasoning With Bayes’ Rule
Outline:
- Motivation
- Generalizing Modus Ponens
- Bayes’ Rule
- Applying Bayes’ Rule
- Odds
- Odds-Likelihood Formulation of Bayes’ Rule
- Combining independent items of evidence
- General combination of evidence
- Benefits of Bayes nets for expert systems

Slide 2: Motivation
Logical reasoning has limitations:
- It requires that assumptions be considered “certain”.
- It typically uses general rules, and general rules that are reliable may be difficult to come by.
- It can be awkward for certain structured domains such as time and space.

Slide 3: Generalizing Modus Ponens
Modus Ponens:
  P -> Q
  P
  ------
  Q

Bayes’ Rule (general idea):
  If P then sometimes Q
  P
  ------
  Maybe Q
(Bayes’ rule lets us calculate the probability of Q, taking P into account.)

Slide 4: Bayes’ Rule
E: some evidence exists, i.e., a particular condition is true.
H: some hypothesis is true.
P(E|H) = probability of E given H.
P(E|~H) = probability of E given not H.
P(H) = probability of H, independent of E.

P(H|E) = P(E|H) P(H) / P(E)
where P(E) = P(E|H) P(H) + P(E|~H) (1 - P(H))
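
The rule translates directly into a few lines of code. Below is a minimal sketch in Python (the function and argument names are my own, not from the slides):

```python
def posterior(p_e_given_h: float, p_e_given_not_h: float, p_h: float) -> float:
    """Return P(H|E) via Bayes' rule, expanding P(E) over H and ~H."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1.0 - p_h)
    return p_e_given_h * p_h / p_e
```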

Slide 5: Applying Bayes’ Rule
E: The patient’s white blood cell count exceeds 110% of average.
H: The patient is infected with tetanus.
P(E|H) = 0.8   (class-conditional probability)
P(E|~H) = 0.3  (class-conditional probability)
P(H) = 0.01    (prior probability)

Posterior probability:
P(H|E) = P(E|H) P(H) / P(E)
= (0.8)(0.01) / [(0.8)(0.01) + (0.3)(0.99)]
= 0.008 / 0.305 ≈ 0.0262
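
As a quick check of the arithmetic (a sketch, using the numbers from the slide):

```python
# P(H|E) for P(E|H)=0.8, P(E|~H)=0.3, P(H)=0.01
p = (0.8 * 0.01) / (0.8 * 0.01 + 0.3 * 0.99)
print(round(p, 4))   # 0.0262: the evidence raises P(H) from 0.01 to about 0.026
```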

Slide 6: Odds
“Odds are 10 to 1 it will rain tomorrow”: P(rain) = 10/11 ≈ 0.909.
Suppose P(A) = 1/4. Then O(A) = (1/4) / (3/4) = 1/3.
In general: O(A) = P(A) / P(~A) = P(A) / (1 - P(A))
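
The probability-odds conversions are simple enough to verify in code (a sketch; the function names are my own):

```python
def odds(p: float) -> float:
    """O(A) = P(A) / (1 - P(A))"""
    return p / (1.0 - p)

def prob(o: float) -> float:
    """Invert the odds: P(A) = O(A) / (1 + O(A))"""
    return o / (1.0 + o)

print(odds(0.25))          # 0.333... = 1/3, as on the slide
print(round(prob(10), 3))  # 10-to-1 odds -> P(rain) = 10/11, about 0.909
```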

Slide 7: Bayes’ Rule Reformulated
P(H|E) = P(E|H) P(H) / P(E)
P(~H|E) = P(E|~H) P(~H) / P(E)
Dividing the first equation by the second (the P(E) terms cancel):
O(H|E) = [P(E|H) / P(E|~H)] O(H)

Slide 8: Odds-Likelihood Form of Bayes’ Rule
E: The patient’s white blood cell count exceeds 110% of average.
H: The patient is infected with tetanus.
O(H) = 0.01/0.99
O(H|E) = λ O(H), where λ = P(E|H) / P(E|~H) is called the sufficiency factor.
O(H|~E) = λ’ O(H), where λ’ = P(~E|H) / P(~E|~H) is called the necessity factor.
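
A short sketch applying the odds-likelihood form to the tetanus numbers from Slide 5, confirming it gives the same posterior as the direct form:

```python
lam = 0.8 / 0.3        # sufficiency factor λ = P(E|H) / P(E|~H)
o_h = 0.01 / 0.99      # prior odds O(H)
o_h_e = lam * o_h      # posterior odds O(H|E)
print(round(o_h_e / (1 + o_h_e), 4))   # 0.0262, matching Slide 5
```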

Slide 9: The Monty Hall Problem (from Wikipedia)

Slide 10: The Monty Hall Problem
There are three doors: a red door, a green door, and a blue door. Behind one is a car, and behind the other two are goats. You get to keep whatever is behind the door you choose. You choose a door (say, red). The host opens one of the other doors (say, green), which reveals a goat. The host says, “Would you like to select the OTHER door?” Should you switch?

Slide 11: Discussion
A: car is behind red door
B: car is behind green door
C: car is behind blue door
P(A) = P(B) = P(C) = 1/3
Suppose D: you choose the red door, and the host opens the green door, revealing a goat.
Is P(C|D) = 1/2? (No.) Why not? What is P(C|D)?
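
The standard answer is P(C|D) = 2/3: the host’s choice is constrained, since he never opens the door hiding the car. A quick Monte Carlo sketch makes this concrete (an assumed setup; fixing the initial pick to red loses no generality by symmetry):

```python
import random

def monty_trial(switch: bool) -> bool:
    """Play one round; return True if the final choice wins the car."""
    doors = ["red", "green", "blue"]
    car = random.choice(doors)
    pick = "red"  # you always pick red first
    # The host opens a door that is neither your pick nor the car
    opened = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

trials = 100_000
for switch in (False, True):
    wins = sum(monty_trial(switch) for _ in range(trials))
    print(f"switch={switch}: win rate {wins / trials:.3f}")  # ~0.333 vs. ~0.667
```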

Slide 12: Bayes Nets
A practical way to manage probabilistic inference when multiple variables (perhaps many) are involved.

Slide 13: Why Bayes Networks?
Reasoning about events involving many parts or contingencies generally requires that a joint probability distribution be known. Such a distribution might require thousands of parameters. Modeling at this level of detail is typically not practical. Bayes nets require making assumptions about the relevance of some conditions to others. Once the assumptions are made, the joint distribution can be “factored” so that there are many fewer separate parameters that must be specified.

Slide 14: Review of Bayes’ Rule
E: some evidence exists, i.e., a particular condition is true.
H: some hypothesis is true.
P(E|H) = probability of E given H.
P(E|~H) = probability of E given not H.
P(H) = probability of H, independent of E.

P(H|E) = P(E|H) P(H) / P(E)
where P(E) = P(E|H) P(H) + P(E|~H) (1 - P(H))

Slide 15: Combining Independent Items of Evidence
E1: The patient’s white blood cell count exceeds 110% of average.
E2: The patient’s body temperature is above 101°F.
H: The patient is infected with tetanus.
O(H) = 0.01/0.99
O(H|E1) = λ1 O(H), where λ1 is the sufficiency factor for high white cell count.
O(H|E2) = λ2 O(H), where λ2 is the sufficiency factor for high body temperature.
Assuming E1 and E2 are independent:
O(H|E1 ∧ E2) = λ1 λ2 O(H)
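
A sketch combining the two factors. λ1 comes from Slide 5’s numbers; λ2 = 2.0 is an assumed value purely for illustration, since the slide does not give one:

```python
o_h = 0.01 / 0.99      # prior odds O(H)
lam1 = 0.8 / 0.3       # sufficiency factor for high white cell count (Slide 5)
lam2 = 2.0             # ASSUMED sufficiency factor for high body temperature
o = lam1 * lam2 * o_h  # O(H | E1 ^ E2), assuming independent evidence
print(round(o / (1 + o), 3))   # ~0.051
```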

Slide 16: Bayes Net Example
A: Accident (an accident blocked traffic on the highway)
B: Barb Late (Barbara is late for work)
C: Chris Late (Christopher is late for work)
Structure: A -> B, A -> C (node A is the common parent of B and C).
P(A) = 0.2
P(B|A) = 0.5    P(B|~A) = 0.15
P(C|A) = 0.3    P(C|~A) = 0.1
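
For use in the next few slides, these numbers can be written down directly in code (a minimal sketch; the variable names and dict layout are my own):

```python
# Bayes net from the slide: A -> B, A -> C
p_a = 0.2                        # P(A): accident blocks traffic
p_b = {True: 0.5, False: 0.15}   # P(B | A) and P(B | ~A): Barb late
p_c = {True: 0.3, False: 0.1}    # P(C | A) and P(C | ~A): Chris late
```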

Slide 17: Forward Propagation (from causes to effects)
(Structure and probabilities as in Slide 16.)
Suppose A (there is an accident). Then:
P(B|A) = 0.5
P(C|A) = 0.3
Suppose ~A (no accident). Then:
P(B|~A) = 0.15
P(C|~A) = 0.1
(These come directly from the given information.)

Slide 18: Marginal Probabilities (using forward propagation)
P(B) = probability Barb is late in any situation
= P(B|A) P(A) + P(B|~A) P(~A) = (0.5)(0.2) + (0.15)(0.8) = 0.22
Similarly, P(C) = probability Chris is late in any situation
= P(C|A) P(A) + P(C|~A) P(~A) = (0.3)(0.2) + (0.1)(0.8) = 0.14
Marginalizing means eliminating a contingency by summing the probabilities over its different cases (here A and ~A).
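
The same marginalization in code (a sketch using the slide’s numbers):

```python
p_a = 0.2
p_b = 0.5 * p_a + 0.15 * (1 - p_a)   # P(B) = P(B|A)P(A) + P(B|~A)P(~A)
p_c = 0.3 * p_a + 0.1 * (1 - p_a)    # P(C) likewise
print(round(p_b, 2), round(p_c, 2))  # 0.22 0.14
```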

Slide 19: Backward Propagation: “Diagnosis” (from effects to causes)
Suppose B (Barb is late). What is the probability of an accident on the highway?
Use Bayes’ rule:
P(A|B) = P(B|A) P(A) / P(B)
= (0.5)(0.2) / [(0.5)(0.2) + (0.15)(0.8)]
= 0.1 / 0.22 ≈ 0.4545
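
And the diagnostic step in code (a sketch; it reuses the marginal from Slide 18):

```python
p_a = 0.2
p_b = 0.5 * p_a + 0.15 * (1 - p_a)   # P(B) = 0.22
p_a_given_b = 0.5 * p_a / p_b        # Bayes' rule: P(A|B) = P(B|A)P(A)/P(B)
print(round(p_a_given_b, 4))         # 0.4545
```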

Slide 20: Revising Probabilities of Consequences
Suppose B (Barb is late). What is the probability that Chris is also late, given this information?
We already found that P(A|B) ≈ 0.4545, so
P(C|B) = P(C|A) P(A|B) + P(C|~A) P(~A|B)
= (0.3)(0.4545) + (0.1)(0.5455) ≈ 0.1909,
somewhat higher than P(C) = 0.14.
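
A sketch of the two-step revision: condition on A, then marginalize A out under the updated belief.

```python
p_a_given_b = 0.1 / 0.22                                   # ~0.4545, from Slide 19
p_c_given_b = 0.3 * p_a_given_b + 0.1 * (1 - p_a_given_b)  # mix P(C|A), P(C|~A)
print(round(p_c_given_b, 4))                               # 0.1909 > P(C) = 0.14
```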

Slide 21: Handling Multiple Causes
Add a second possible cause of B:
D: Disease (Barb has the flu), with prior P(D) = 1/9 ≈ 0.111.
Structure: A -> B, D -> B, A -> C.
P(B|A∧D) = 0.9
P(B|A∧~D) = 0.45
P(B|~A∧D) = 0.75
P(B|~A∧~D) = 0.1
(These values are consistent with P(B|A) = 0.5: solving 0.9 P(D) + 0.45 (1 - P(D)) = 0.5 gives P(D) = 1/9.)

Slide 22: Explaining Away
Suppose B (Barb is late). This raises the probability of each cause: P(A|B) rises above P(A), and P(D|B) = P(B|D) P(D) / P(B) rises above P(D), where
P(B|D) = P(B|A∧D) P(A) + P(B|~A∧D) P(~A) = (0.9)(0.2) + (0.75)(0.8) = 0.78.
Now suppose, in addition, C (Chris is late). C makes it more likely that A is true, and A explains B, so D is now less probable.
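
Brute-force enumeration over the four-variable net makes explaining away visible. This is a sketch: P(D) = 1/9 is the assumed prior derived on Slide 21, and here P(B) is computed from the expanded table itself, so it comes out near 0.24 rather than the earlier two-node figure of 0.22:

```python
from itertools import product

p_a, p_d = 0.2, 1 / 9                  # P(A); ASSUMED P(D) = 1/9 from Slide 21
p_b = {(True, True): 0.9, (True, False): 0.45,
       (False, True): 0.75, (False, False): 0.1}   # P(B | A, D)
p_c = {True: 0.3, False: 0.1}                      # P(C | A)

def joint(a, d, b, c):
    """P(A=a, D=d, B=b, C=c), factored along the net A->B, D->B, A->C."""
    pa = p_a if a else 1 - p_a
    pd = p_d if d else 1 - p_d
    pb = p_b[(a, d)] if b else 1 - p_b[(a, d)]
    pc = p_c[a] if c else 1 - p_c[a]
    return pa * pd * pb * pc

def posterior(query, evidence):
    """P(query = True | evidence) by summing the joint over all worlds."""
    num = den = 0.0
    for a, d, b, c in product([True, False], repeat=4):
        world = {"A": a, "D": d, "B": b, "C": c}
        if all(world[k] == v for k, v in evidence.items()):
            p = joint(a, d, b, c)
            den += p
            if world[query]:
                num += p
    return num / den

print(round(posterior("D", {"B": True}), 3))             # ~0.36: B raises P(D)
print(round(posterior("D", {"B": True, "C": True}), 3))  # ~0.29: C explains away D
print(round(posterior("A", {"B": True}), 3))             # ~0.42
print(round(posterior("A", {"B": True, "C": True}), 3))  # ~0.69: C raises P(A)
```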

Slide 23: Benefits of Bayes Nets
The full joint probability distribution over n binary variables normally requires 2^n - 1 independent parameters. With Bayes nets we only specify these parameters:
1. “Root” node probabilities, e.g., P(A=true) = 0.2, P(A=false) = 0.8.
2. For each non-root node, a table of 2^k values, where k is the number of parents of that node; typically k is small.
Propagating probabilities happens along the paths in the net. With a full joint probability distribution, many more computations may be needed.
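
A back-of-the-envelope comparison for the four-variable example above (a sketch):

```python
n = 4                      # variables A, D, B, C
full_joint = 2**n - 1      # 15 independent parameters for the full joint
net = 1 + 1 + 2**2 + 2**1  # roots A and D; B with 2 parents; C with 1 parent
print(full_joint, net)     # 15 vs. 8; the gap widens rapidly as n grows
```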