
CSCI 5582 Artificial Intelligence
Fall 2006, Lecture 14
Jim Martin

Today (10/17)
– Review basics
– More on independence
– Break
– Bayesian belief nets

Review
– Joint distributions
– Atomic events
– Independence assumptions

Review: Joint Distribution

                Toothache=True   Toothache=False
Cavity=True           …                …
Cavity=False          …                …

Each cell holds the probability of one conjunction of values for the variables in the model.

Atomic Events
The entries in the table represent the probabilities of atomic events
– Events in which the values of all the variables are specified

Independence
Two variables A and B are independent iff P(A|B) = P(A). In other words, knowing B gives you no information about A. Equivalently, P(A^B) = P(A|B)P(B) = P(A)P(B).
– E.g., two successive tosses of a fair coin

Mental Exercise
With a fair coin, which of the following two sequences is more likely?
– HHHHHTTTTT
– HTTHHHTHTT

Conditional Independence
Consider the dentist problem with 3 variables: Cavity, Toothache, Catch.
If I have a cavity, then the chance of a catch is independent of whether or not I also have a toothache. I.e.
– P(Catch|Cavity^Toothache) = P(Catch|Cavity)

Conditional Independence
Remember that having the joint distribution over N variables allows you to answer all the questions involving those variables. Exploiting conditional independence allows us to represent the complete joint distribution with fewer entries
– I.e., fewer than the 2^N entries normally needed

Conditional Independence
P(Cavity,Catch,Toothache)
= P(Cavity) P(Catch,Toothache|Cavity)
= P(Cavity) P(Catch|Cavity) P(Toothache|Cavity)
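To make the saving concrete, here is a minimal Python sketch of the factored representation for the dentist domain. The slides don't reproduce the tables, so the numbers below are the ones used in the standard textbook version of this example; treat them as illustrative.

```python
# Factored representation of P(Cavity, Catch, Toothache).
# Five numbers (one prior plus two 2-entry conditionals) replace
# the 2^3 - 1 = 7 independent entries of the full joint table.
p_cavity = 0.2
p_catch = {True: 0.9, False: 0.2}      # P(Catch=true | Cavity)
p_toothache = {True: 0.6, False: 0.1}  # P(Toothache=true | Cavity)

def joint(cavity, catch, toothache):
    """P(Cavity,Catch,Toothache) = P(Cavity) P(Catch|Cavity) P(Toothache|Cavity)."""
    p = p_cavity if cavity else 1 - p_cavity
    p *= p_catch[cavity] if catch else 1 - p_catch[cavity]
    p *= p_toothache[cavity] if toothache else 1 - p_toothache[cavity]
    return p

print(joint(True, True, True))   # 0.108: one full-joint entry recovered
```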

Conditional Independence
P(Cavity,Catch,Toothache) = P(Catch) P(Cavity,Toothache|Catch)… Huh?
(This factoring is also valid, but it doesn't simplify any further: Cavity and Toothache are not conditionally independent given Catch.)

Bayesian Belief Nets
A compact notation for representing conditional independence assumptions, and hence a compact way of representing a joint distribution.
Syntax:
– A directed acyclic graph, one node per variable
– Each node augmented with a local conditional probability table

Bayesian Belief Nets
Nodes with no incoming arcs (root nodes) simply have priors associated with them.
Nodes with incoming arcs have tables enumerating
– P(Node|Conjunction of parents)
– Where "parent" means the node at the other end of an incoming arc

Alarm Example
Variables: Burglary, MaryCalls, JohnCalls, Earthquake, Alarm
Network topology captures the domain causality (conditional independence assumptions).

Alarm Example
[figure: the alarm network. Burglary and Earthquake are parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls.]

Bayesian Belief Nets: Semantics
The full joint distribution for the N variables in a belief net can be recovered from the information in the tables.

Belief Net Semantics: Alarm Example
What is the probability that John calls, Mary calls, the alarm is going off, and there is no burglary and no earthquake?

Alarm Example
[figure: the alarm network with its conditional probability tables.]

Alarm Example
P(J^M^A^~B^~E) = P(J|A) * P(M|A) * P(A|~B^~E) * P(~B) * P(~E)
= 0.9 * 0.7 * 0.001 * 0.999 * 0.998 ≈ 0.00063
In other words, the probability of an atomic event can be read right off the network as the product of the relevant entry from each variable's table.
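Here is the same computation as a Python sketch. The slide only quotes P(J|A)=0.9, P(M|A)=0.7, P(A|~B^~E)=0.001, P(~B)=0.999, and P(~E)=0.998; the remaining table entries below are taken from the standard textbook version of the alarm network.

```python
# CPTs for the alarm network, keyed by parent values.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,    # P(A=true | B, E)
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}                    # P(J=true | A)
P_M = {True: 0.70, False: 0.01}                    # P(M=true | A)

def joint(b, e, a, j, m):
    """P(B=b,E=e,A=a,J=j,M=m): the product of one table entry per variable."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

print(joint(False, False, True, True, True))   # ~0.00063, as above
```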

Events
What about non-atomic events? Remember to partition: any event can be written as a disjunction of more fully specified events, e.g.
P(A) = P(A^B) + P(A^~B)
So what's the probability that Mary calls, out of the blue?

Events
P(M) = P(M^J^E^B^A) + P(M^J^E^B^~A) + P(M^J^E^~B^A) + …
(16 terms in all, one for each assignment to J, E, B, and A)
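A sketch of that summation, reusing the joint() function and tables from the alarm-network sketch above:

```python
from itertools import product

# P(M=true): sum the 16 atomic events that have M=true.
# Assumes joint() and the CPTs from the earlier sketch.
p_m = sum(joint(b, e, a, j, True)
          for b, e, a, j in product([True, False], repeat=4))
print(p_m)   # ~0.0117 with the standard textbook numbers
```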

Events
How about P(M|Alarm)?
– Trick question… that's something we already know: it's an entry in M's table.
How about P(M|Earthquake)?
– Not directly in the network; rewrite it as P(M^Earthquake)/P(Earthquake)

Simpler Examples
Let's say we have two variables A and B, and we know B influences A. What's P(A^B)?
[figure: B → A, with tables P(B), P(A|B), P(A|~B)]

Simple Example
Now I tell you that B has happened. What's your belief in A?
[figure: B → A, with tables P(B), P(A|B), P(A|~B)]

Simple Example
Suppose instead I say that A has happened. What's your belief in B?
[figure: B → A, with tables P(B), P(A|B), P(A|~B)]

Simple Example
P(B|A) = P(B^A)/P(A)
= P(B^A) / (P(A^B) + P(A^~B))
= P(B)P(A|B) / (P(B)P(A|B) + P(~B)P(A|~B))
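The same computation as a Python sketch; the two table values are made up, since the transcript doesn't reproduce the slide's numbers.

```python
# Two-node net B -> A with hypothetical table entries.
p_b = 0.3                             # P(B)
p_a_given = {True: 0.8, False: 0.1}   # P(A=true | B)

# P(A) by partitioning on B, then Bayes' rule for P(B|A).
p_a = p_b * p_a_given[True] + (1 - p_b) * p_a_given[False]
p_b_given_a = p_b * p_a_given[True] / p_a
print(p_a, p_b_given_a)   # 0.31 and ~0.774
```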

Chain Rule Basis
P(B,E,A,J,M)
= P(M|B,E,A,J) P(B,E,A,J)
= P(M|B,E,A,J) P(J|B,E,A) P(B,E,A)
= P(M|B,E,A,J) P(J|B,E,A) P(A|B,E) P(B,E)
= P(M|B,E,A,J) P(J|B,E,A) P(A|B,E) P(B|E) P(E)

Chain Rule Basis
P(B,E,A,J,M) = P(M|B,E,A,J) P(J|B,E,A) P(A|B,E) P(B|E) P(E)
Applying the network's conditional independence assumptions:
= P(M|A) P(J|A) P(A|B,E) P(B) P(E)

Alarm Example
[figure: the alarm network with its conditional probability tables.]

Details
Where do the graphs come from?
– Initially, the intuitions of domain experts
Where do the numbers come from?
– Hopefully, from hard data
– Sometimes from experts' intuitions
How can we compute things efficiently?
– Exactly, by not redoing things unnecessarily
– By approximating things

Break
Readings for probability:
– Chapter 13: all
– Chapter 14: …, 500, Sec 14.4

Noisy-Or
Even with the reduction in the number of probabilities needed, it's hard to accumulate all the numbers you need. That's especially true when some evidence variables are shared among many causes. The noisy-or hack is a useful shortcut.
P(A|C1^C2^C3)

Noisy-Or
[figure: Cold, Flu, and Malaria each with an arc into Fever.]

Noisy-Or
The relevant conditionals: P(Fever|Cold), P(Fever|Flu), P(Fever|Malaria), and their complements P(~Fever|Cold), P(~Fever|Flu), P(~Fever|Malaria).

Noisy-Or
What does it mean for a blocker to occur? It means the cause was true but the symptom didn't happen. What's the probability of that?
– P(~Fever|Cause), e.g. P(~Fever|Flu), etc.

Noisy-Or
If all three causes are true and you don't have a fever, then all three blockers are in effect. What's the probability of that?
– P(~Fever|flu,cold,malaria)
– = P(~Fever|flu) P(~Fever|cold) P(~Fever|malaria)
And 1 minus that gives P(Fever|all three causes).
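A sketch of the noisy-or computation in Python. The blocker probabilities are hypothetical; the slides leave the actual numbers unspecified.

```python
# P(~Fever | only this cause is active): the "blocker" for each cause.
blockers = {"cold": 0.6, "flu": 0.2, "malaria": 0.1}   # hypothetical values

def p_fever(active_causes):
    """P(Fever | the named causes true, all others false) under noisy-or."""
    q = 1.0
    for cause in active_causes:
        q *= blockers[cause]   # no fever only if every active cause's blocker fires
    return 1.0 - q

print(p_fever({"cold", "flu", "malaria"}))   # 1 - 0.6*0.2*0.1 = 0.988
print(p_fever({"flu"}))                      # 0.8
```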

Computing with BBNs
Normal scenario:
– You have a belief net consisting of a bunch of variables
  – Some of which you know to be true (evidence)
  – Some of which you're asking about (query)
  – Some you haven't specified (hidden)

Example
Probability that there's a burglary given that John and Mary are calling: P(B|J,M)
– B is the query variable
– J and M are evidence variables
– A and E are hidden variables

Example
Probability that there's a burglary given that John and Mary are calling:
P(B|J,M) = alpha * P(B,J,M)
= alpha * [P(B,J,M,A,E) + P(B,J,M,~A,E) + P(B,J,M,A,~E) + P(B,J,M,~A,~E)]
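Putting the pieces together, here is a self-contained Python sketch of inference by enumeration for this query, using the same standard textbook tables as the earlier alarm sketch:

```python
from itertools import product

# Same CPTs as the earlier alarm-network sketch.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def joint(b, e, a, j, m):
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

# For each value of the query B, sum out the hidden variables A and E
# with the evidence J=true, M=true fixed; then normalize (the alpha).
unnorm = {b: sum(joint(b, e, a, True, True)
                 for e, a in product([True, False], repeat=2))
          for b in (True, False)}
alpha = 1.0 / sum(unnorm.values())
print(alpha * unnorm[True])   # P(B=true | J, M) ~ 0.284
```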

From the Network
[figure]

Expression Tree
[figure]

Speedups
Don't recompute things
– Dynamic programming
Don't compute some things at all
– Ignore variables that can't affect the outcome

Example
John calls given burglary: P(J|B)

Variable Elimination
Every variable that is not an ancestor of a query variable or an evidence variable is irrelevant to the query.
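A minimal sketch of that pruning rule applied to the alarm network's graph:

```python
# Parents of each node in the alarm network.
parents = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}

def relevant(query, evidence):
    """The query/evidence variables plus all of their ancestors."""
    keep, stack = set(), [query, *evidence]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents[node])
    return keep

# For P(J|B): M is not an ancestor of J or B, so it is pruned;
# E stays because it is an ancestor of J (and must be summed out).
print(relevant("J", ["B"]))   # {'J', 'A', 'B', 'E'}
```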

Next Time
Finish Chapters 13 and 14