17th October
–Project 2 can be submitted until Thursday (in-class)
–Homework 3 due Thursday
–Midterm next Thursday (10/26)



What happens if there are multiple symptoms…? The patient walks in and complains of a toothache. You assess P(Cavity|Toothache). Now you probe the patient's mouth with that steel thingie, and it catches… How do we update our belief in Cavity?

P(Cavity|TA,Catch) = P(TA,Catch|Cavity) * P(Cavity) / P(TA,Catch) = α * P(TA,Catch|Cavity) * P(Cavity)

We need to know P(TA,Catch|Cavity)! With n evidence variables, we will need 2^n probabilities! Conditional independence to the rescue: suppose P(TA,Catch|Cavity) = P(TA|Cavity) * P(Catch|Cavity).
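To make the update concrete, here is a minimal Python sketch; the prior and per-symptom likelihood numbers are made up for illustration, and only the conditional-independence factorization and the α-normalization come from the slide:

```python
# Hypothetical numbers; with conditional independence we need just
# P(Cavity), P(TA|Cavity), and P(Catch|Cavity) -- not a joint over all evidence.
p_cavity = 0.2                        # prior P(Cavity) -- made up
p_ta     = {True: 0.9, False: 0.05}   # P(Toothache | Cavity=c)
p_catch  = {True: 0.8, False: 0.10}   # P(Catch | Cavity=c)

# Unnormalized posterior: P(TA|c) * P(Catch|c) * P(c) for each value c of Cavity
unnorm = {c: p_ta[c] * p_catch[c] * (p_cavity if c else 1 - p_cavity)
          for c in (True, False)}
alpha = 1.0 / sum(unnorm.values())            # the normalization constant
posterior = {c: alpha * p for c, p in unnorm.items()}
print(posterior[True])                        # ~0.973 with these numbers
```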

Generalized Bayes rule: P(A|B,e) = P(B|A,e) * P(A|e) / P(B|e)

Conditional Independence Assertions We write X || Y | Z to say that the set of variables X is conditionally independent of the set of variables Y given evidence on the set of variables Z (where X, Y, Z are subsets of the set of all random variables in the domain model). We saw that Bayes rule computations can exploit conditional independence assertions. Specifically, X || Y | Z implies:
–P(X & Y | Z) = P(X|Z) * P(Y|Z)
–P(X | Y, Z) = P(X|Z)
–P(Y | X, Z) = P(Y|Z)
Idea: Why not write down all conditional independence assertions that hold in a domain?

Cond. Indep. Assertions (Contd) Idea: Why not write down all conditional independence assertions (CIA) (X || Y | Z) that hold in a domain? Problem: There can be exponentially many conditional independence assertions that hold in a domain (recall that X, Y and Z are all subsets of the domain variables). Brilliant idea: Maybe we should implicitly specify the CIA by writing down the “dependencies” between variables using a graphical model.
–A Bayes network is a way of doing just this. The Bayes net is a directed acyclic graph whose nodes are random variables, and the immediate dependencies between variables are represented by directed arcs.
–The topology of a Bayes network shows the inter-variable dependencies. Given the topology, there is a way of checking whether any given conditional independence assertion holds in the network (the Bayes Ball algorithm and the D-Sep idea).

Cond. Indep. Assertions (contd) We said that a Bayes net implicitly represents a bunch of CIA. Qn. If I tell you exactly which CIA hold in a domain, can you give me a Bayes net that exactly models those and only those CIA?
–Unfortunately, NO. (See the X, Y, Z, W blog example.)
–This is why there is another type of graphical model, called “undirected graphical models.” In an undirected graphical model, also called a Markov random field, nodes correspond to random variables, and the immediate dependencies between variables are represented by undirected edges.
–The CIA modeled by an undirected graphical model are different: X || Y | Z in an undirected graph if every path from a node in X to a node in Y must pass through a node in Z (so if we remove the nodes in Z, then X and Y will be disconnected).
–Undirected models are good for representing “soft constraints” between random variables (e.g., the correlation between different pixels in an image), while directed models are good for representing causal influences between variables.

CIA implicit in Bayes Nets So, what conditional independence assumptions are implicit in Bayes nets?
–Local Markov Assumption: A node N is independent of its non-descendants (including ancestors) given its immediate parents. (So if P are the immediate parents of N, and A is the set of ancestors, then {N} || A | P.)
–(Equivalently) A node N is independent of all other nodes given its Markov blanket (parents, children, children's parents).
–Given this assumption, many other conditional independencies follow. For a full answer, we need to appeal to the D-Sep condition and/or Bayes Ball reachability.

Topological Semantics Independence from non-descendants holds given just the parents. Independence from every node holds given the Markov blanket. These two conditions are equivalent. Many other conditional independence assertions follow from these.
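As a small illustration of the Markov blanket notion, here is a sketch (using the Pearl alarm topology from this lecture) that reads the blanket straight off the parent lists; the function name and dict representation are ours, not from the slides:

```python
# Markov blanket = parents + children + children's other parents (co-parents).
parents = {
    "Burglary": [], "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"],
}

def markov_blanket(node):
    children = [n for n, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for c in children:               # add the children's other parents
        blanket |= set(parents[c])
    blanket.discard(node)
    return blanket

print(markov_blanket("Alarm"))
# {'Burglary', 'Earthquake', 'JohnCalls', 'MaryCalls'}
```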

Directly using the joint distribution:
–Takes O(2^n) for most natural queries of type P(D|Evidence)
–NEEDS O(2^n) probabilities as input
–Probabilities are of type P(w_k), where w_k is a world
Directly using Bayes rule:
–Can take much less than O(2^n) time for most natural queries of type P(D|Evidence)
–STILL NEEDS O(2^n) probabilities as input
–Probabilities are of type P(X_1..X_n|Y)
Using Bayes rule with Bayes nets:
–Can take much less than O(2^n) time for most natural queries of type P(D|Evidence)
–Can get by with anywhere between O(n) and O(2^n) probabilities, depending on the conditional independences that hold
–Probabilities are of type P(X_1..X_n|Y)
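The third row is worth seeing in code. Below is a minimal inference-by-enumeration sketch for the alarm network: the 10 CPT numbers are the standard textbook (R&N) values, and the query routine sums the chain-rule joint over the unobserved variables (exponential time, but only 10 input probabilities instead of 2^5 - 1 = 31):

```python
from itertools import product

VARS = ["B", "E", "A", "J", "M"]
# (parents, P(var=True | parent values)) -- 1+1+4+2+2 = 10 input numbers
CPT = {
    "B": ((), {(): 0.001}),
    "E": ((), {(): 0.002}),
    "A": (("B", "E"), {(True, True): 0.95, (True, False): 0.94,
                       (False, True): 0.29, (False, False): 0.001}),
    "J": (("A",), {(True,): 0.90, (False,): 0.05}),
    "M": (("A",), {(True,): 0.70, (False,): 0.01}),
}

def joint(world):
    """Chain rule: product over nodes of P(node | its parents)."""
    p = 1.0
    for v in VARS:
        ps, table = CPT[v]
        pt = table[tuple(world[u] for u in ps)]
        p *= pt if world[v] else 1 - pt
    return p

def query(var, evidence):
    """P(var | evidence) by summing the joint over all consistent worlds."""
    dist = {True: 0.0, False: 0.0}
    for vals in product([True, False], repeat=len(VARS)):
        world = dict(zip(VARS, vals))
        if all(world[k] == v for k, v in evidence.items()):
            dist[world[var]] += joint(world)
    z = sum(dist.values())
    return {k: v / z for k, v in dist.items()}

print(query("B", {"J": True, "M": True}))   # P(B=true|j,m) ~ 0.284
```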

Review

Bayes Ball Alg:
–Shade the evidence nodes
–Put one ball each at the X nodes
–See if any of them can find their way to any of the Y nodes

10/19

4th Nov -- Happy Deepavali!

Blog Questions You have been given the topology of a Bayes network, but haven't yet gotten the conditional probability tables (to be concrete, you may think of the Pearl alarm-earthquake scenario Bayes net). Your friend shows up and says he has the joint distribution all ready for you. You don't quite trust your friend and think he is making these numbers up. Is there any way you can prove that your friend's joint distribution is not correct? Answer:
–Check to see if the joint distribution given by your friend satisfies all the conditional independence assumptions. For example, in the Pearl network, compute P(J|A,M,B) and P(J|A). These two numbers should come out the same!
–Notice that your friend could pass all the conditional indep assertions and still be cheating re: the probabilities. For example, he filled up the CPTs of the network with made-up numbers (e.g., P(B)=0.9, P(E)=0.7, etc.) and computed the joint probability by multiplying the CPTs. This will satisfy all the conditional indep assertions!
–The main point to understand here is that the network topology does put restrictions on the joint distribution.
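A sketch of that first check in code, assuming the friend hands over the joint as a dict from (B,E,A,J,M) truth-value tuples to probabilities (the representation and helper names are ours):

```python
from itertools import product

NAMES = ["B", "E", "A", "J", "M"]

def marginal(joint, fixed):
    """Sum the joint over all worlds consistent with the `fixed` assignments."""
    return sum(p for w, p in joint.items()
               if all(w[NAMES.index(k)] == v for k, v in fixed.items()))

def cond(joint, target, given):
    # assumes the conditioning event has nonzero probability
    return marginal(joint, {**given, **target}) / marginal(joint, given)

def respects_topology(joint):
    """J must be independent of {M, B} given A: P(J|A,M,B) == P(J|A) everywhere."""
    for a, m, b in product([True, False], repeat=3):
        lhs = cond(joint, {"J": True}, {"A": a, "M": m, "B": b})
        rhs = cond(joint, {"J": True}, {"A": a})
        if abs(lhs - rhs) > 1e-9:
            return False        # caught the friend: this CIA fails
    return True
```

(This tests only one of the implied assertions; a full check would run the same loop for every CIA the topology entails.)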

Blog Questions (2) Continuing bad friends: in the question above, suppose a second friend comes along and says that he can give you the conditional probabilities that you want to complete the specification of your Bayes net. You ask him a CPT entry, and pat comes a response--some number between 0 and 1. This friend is well meaning, but you are worried that the numbers he is giving may lead to some sort of inconsistent joint probability distribution. Is your worry justified (i.e., can your friend give you numbers that can lead to an inconsistency)? (To understand "inconsistency", consider someone who insists on giving you P(A), P(B), P(A&B) as well as P(AVB), and the numbers wind up not satisfying P(AVB) = P(A) + P(B) - P(A&B); or alternately, they insist on giving you P(A|B), P(B|A), P(A) and P(B), and the four numbers don't satisfy Bayes rule.) Answer: No--as long as we only ask the friend to fill up the CPTs in the Bayes network, there is no way the numbers won't make up a consistent joint probability distribution.
–This should be seen as a feature.
We also had a digression about personal probabilities:
–John may be an optimist and believe that P(burglary)=0.01, and Tom may be a pessimist and believe that P(burglary)=0.99.
–Bayesians consider both John and Tom to be fine (they don't insist on an objective frequentist interpretation for probabilities).
–However, Bayesians do think that John and Tom should act consistently with their own beliefs. For example, it makes no sense for John to go about installing tons of burglar alarms given his belief, just as it makes no sense for Tom to put all his valuables on his lawn.

Blog Questions (3) Your friend heard your claims that Bayes nets can represent any possible conditional independence assertions exactly. He comes to you and says he has four random variables, X, Y, W and Z, and only TWO conditional independence assertions: X .ind. Y | {W,Z} and W .ind. Z | {X,Y}. He dares you to give him a Bayes network topology on these four nodes that exactly represents these and only these conditional independencies. Can you? (Note that you only need to look at 4-vertex directed graphs.) Answer: No, this is not possible. Here are two “wrong” answers:
–Consider a disconnected graph where X, Y, W, Z are all unconnected. In this graph, the two CIAs hold. However, unfortunately, so do many other CIAs.
–Consider a graph where W and Z are both immediate parents of X and Y. In this case, clearly, X .ind. Y | {W,Z}. However, W and Z are definitely dependent given X and Y (explaining away).
Undirected models can capture these CIA exactly. Consider a graph where X is connected to W and Z, and Y is connected to W and Z (sort of a diamond).
–In undirected models, CIA is defined in terms of graph separability.
–Since X and Y separate W and Z (i.e., every path between W and Z must pass through X or Y), W .ind. Z | {X,Y}. Similarly for the other CIA.
Undirected graphs will be unable to model some scenarios that directed ones can, so you need both…
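The separability test for undirected models is easy to operationalize: delete the conditioning nodes and check connectivity. A minimal sketch on the diamond graph from this answer (the edge set and function names are ours):

```python
from collections import deque

EDGES = {("X", "W"), ("X", "Z"), ("Y", "W"), ("Y", "Z")}   # the diamond

def separated(x, y, given):
    """X .ind. Y | given  iff  removing `given` disconnects x from y."""
    adj = {}
    for a, b in EDGES:
        if a in given or b in given:      # conditioning deletes these nodes
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, frontier = {x}, deque([x])
    while frontier:                        # plain BFS
        n = frontier.popleft()
        if n == y:
            return False
        for nb in adj.get(n, ()):
            if nb not in seen:
                seen.add(nb)
                frontier.append(nb)
    return True

print(separated("W", "Z", {"X", "Y"}))   # True:  W .ind. Z | {X,Y}
print(separated("X", "Y", {"W", "Z"}))   # True:  X .ind. Y | {W,Z}
print(separated("X", "Y", {"W"}))        # False: the path X-Z-Y survives
```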

Blog Comments.. arcarc said... I'd just like to say that this project is awesome. And not solely because it doesn't involve LISP, either! “Can’t leave well enough alone” rejoinder

CIA implicit in Bayes Nets So, what conditional independence assumptions are implicit in Bayes nets?
–Local Markov Assumption: A node N is independent of its non-descendants (including ancestors) given its immediate parents. (So if P are the immediate parents of N, and A is the set of ancestors, then {N} || A | P.)
–(Equivalently) A node N is independent of all other nodes given its Markov blanket (parents, children, children's parents).
–Given this assumption, many other conditional independencies follow. For a full answer, we need to appeal to the D-Sep condition and/or Bayes Ball reachability.

Topological Semantics Independence from non-descendants holds given just the parents. Independence from every node holds given the Markov blanket. These two conditions are equivalent. Many other conditional independence assertions follow from these.

Bayes Ball Alg:
–Shade the evidence nodes
–Put one ball each at the X nodes
–See if any of them can find their way to any of the Y nodes

D-sep (direction-dependent separation) (for those who don't like Bayes Balls) X || Y | E if every undirected path from X to Y is blocked by E.
–A path is blocked if there is a node Z on the path s.t.
1. Z is in E and Z has one arrow coming in and another going out, or
2. Z is in E and Z has both arrows going out, or
3. Neither Z nor any of its descendants are in E, and both path arrows lead to Z.
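For a network as small as the alarm example, the three blocking conditions can be checked directly by enumerating undirected paths; real systems use Bayes Ball reachability instead, so treat this brute-force sketch (ours, not from the slides) as illustrative only:

```python
PARENTS = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}
CHILDREN = {v: [c for c, ps in PARENTS.items() if v in ps] for v in PARENTS}

def descendants(v):
    out, stack = set(), list(CHILDREN[v])
    while stack:
        c = stack.pop()
        if c not in out:
            out.add(c)
            stack.extend(CHILDREN[c])
    return out

def paths(x, y, path=None):
    """All simple undirected paths from x to y."""
    path = path or [x]
    if path[-1] == y:
        yield path
        return
    for nb in PARENTS[path[-1]] + CHILDREN[path[-1]]:
        if nb not in path:
            yield from paths(x, y, path + [nb])

def blocked(path, evid):
    for i in range(1, len(path) - 1):
        prev, z, nxt = path[i - 1], path[i], path[i + 1]
        if prev in PARENTS[z] and nxt in PARENTS[z]:   # both arrows lead to Z
            if z not in evid and not (descendants(z) & evid):
                return True                            # condition 3
        elif z in evid:
            return True                                # conditions 1 and 2
    return False

def d_separated(x, y, evid):
    return all(blocked(p, evid) for p in paths(x, y))

print(d_separated("J", "M", {"A"}))   # True:  J || M | A
print(d_separated("B", "E", set()))   # True:  B || E (collider A unobserved)
print(d_separated("B", "E", {"A"}))   # False: explaining away
```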

[Figure: the Burglary/Earthquake/Alarm network, drawn with a non-causal node ordering] Is P(A|J,M) = P(A)? How many probabilities are needed? 13 for the new ordering; 10 for the old. Is this the worst?

Making the network sparse by introducing intermediate variables Consider a network of boolean variables where n parent nodes are connected to m children nodes (with each parent influencing each child).
–You will need n + m*2^n conditional probabilities.
Suppose you realize that what is really influencing the child nodes is some single aggregate function on the parents' values (e.g., the sum of the parents).
–We can introduce a single intermediate node called “sum” which has links from all the n parent nodes, and separately influences each of the m child nodes.
–You will wind up needing only n + 2^n + 2m conditional probabilities to specify this new network!
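(For a concrete instance of the two formulas above: with n = 10 parents and m = 5 children, the direct network needs 10 + 5*2^10 = 5,130 probabilities, while the version with the intermediate node needs only 10 + 2^10 + 2*5 = 1,044.)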

Noisy-OR: we only consider the failure-to-cause probabilities of the causes that hold. If the causes U_{j+1}..U_k are the ones that hold, then P(~X | parents) = r_{j+1} * r_{j+2} * … * r_k, where r_i is the probability that X fails to hold even though the i-th parent holds. How about Noisy-AND? (hint: A&B <=> ~(~A V ~B))
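A minimal sketch of the noisy-OR computation; the causes and their failure probabilities r_i below are made up, and only the multiply-the-failures-of-the-active-causes rule comes from the slide:

```python
# r[c] = probability that the effect fails to hold even though cause c holds
r = {"Cold": 0.4, "Flu": 0.2, "Malaria": 0.1}   # made-up numbers

def p_effect(active_causes):
    """Noisy-OR: P(X=true | exactly these causes hold) = 1 - product of their r_i."""
    fail = 1.0
    for c in active_causes:
        fail *= r[c]           # each active cause independently fails to trigger X
    return 1.0 - fail

print(p_effect({"Cold", "Flu"}))   # 1 - 0.4*0.2 = 0.92
print(p_effect(set()))             # 0.0 -- no causes, no effect (no leak term)
```

Note that the full CPT over k parents is now determined by just k numbers instead of 2^k.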

Constructing Belief Networks: Summary Decide on what sorts of queries you are interested in answering.
–This in turn dictates what factors to model in the network.
Decide on a vocabulary of the variables and their domains for the problem.
–Introduce “hidden” variables into the network as needed to make the network “sparse.”
Decide on an order of introduction of variables into the network.
–Introducing variables in the causal direction leads to fewer connections (sparse structure) AND probabilities that are easier to assess.
Try to use canonical distributions to specify the CPTs.
–Noisy-OR
–Parameterized discrete/continuous distributions, such as Poisson, Normal (Gaussian), etc.

Case Study: Pathfinder System Domain: Lymph node diseases
–Deals with 60 diseases and 100 disease findings.
Versions:
–Pathfinder I: A rule-based system with logical reasoning.
–Pathfinder II: Tried a variety of approaches for uncertainty; simple Bayes reasoning outperformed the others.
–Pathfinder III: Simple Bayes reasoning, but reassessed probabilities.
–Pathfinder IV: A Bayesian network was used to handle a variety of conditional dependencies.
Effort: deciding the vocabulary took 8 hours; devising the topology of the network, 35 hours; assessing the (14,000) probabilities, 40 hours.
–Physician experts liked assessing causal probabilities.
Evaluation: 53 “referral” cases.
–Pathfinder III: 7.9/10
–Pathfinder IV: 8.9/10 [Saves one additional life in every 1000 cases!]
–A more recent comparison shows that Pathfinder now outperforms the experts who helped design it!!