Probabilistic Reasoning
Copyright, 1996 © Dale Carnegie & Associates, Inc.
Chapter 14 (14.1, 14.2, 14.3, 14.4)
Capturing uncertain knowledge; probabilistic inference

Slide 2 (CSE 471/598 by H. Liu): Knowledge representation
- The joint probability distribution can answer any question about the domain, but it becomes intractably large as the number of random variables grows, and it can be difficult to specify probabilities for atomic events.
- Conditional independence can simplify probabilistic assignment.
- A data structure, the belief network (or Bayesian network), represents the dependencies between variables and gives a concise specification of the joint distribution.

Slide 3: A Bayesian network is a graph:
- a set of random variables forms the nodes;
- a set of directed links connects pairs of nodes;
- each node has a conditional probability table (CPT) that quantifies the effects its parents have on it;
- the graph has no directed cycles (it is a DAG).
It is usually much easier for an expert to decide the conditional dependence relationships than to specify the probabilities; even so, experts can hold very different opinions.

Slide 4: Once the network topology is specified, we need only specify conditional probabilities for the nodes that participate in direct dependencies, and we can use those to compute any other probabilities.
- A simple Bayesian network (Fig 14.1).
- The burglary-alarm-call example (Fig 14.2).
- The topology of the network can be thought of as the general structure of the causal process. Many details (Mary listening to loud music, or the phone ringing and confusing John) are summarized in the uncertainty associated with the links from Alarm to JohnCalls and MaryCalls.

Slide 5:
- The probabilities actually summarize a potentially infinite set of possible circumstances.
- Specifying the CPT for each node (Fig 14.2):
  - A conditioning case is one possible combination of values for the parent nodes; a node with n Boolean parents has 2^n conditioning cases.
  - Each row in a CPT must sum to 1.
  - A node with no parents has only one row (its prior).

Slide 6: The semantics of Bayesian networks
Two equivalent views of a Bayesian network:
- as a representation of the joint probability distribution (JPD), helpful in understanding how to construct networks;
- as an encoding of conditional independence relations, helpful in designing inference procedures.

Slide 7: Representing the JPD and constructing a BN
- A Bayesian network provides a complete description of the domain: every entry in the JPD can be calculated from the information in the network.
- A generic entry in the joint is the probability of a conjunction of particular value assignments, one per variable:
  P(x_1, …, x_n) = Π_i P(x_i | Parents(x_i))   (14.1)
- What is the probability of the event j ∧ m ∧ a ∧ ¬b ∧ ¬e?
  P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = P(j|a) P(m|a) P(a|¬b,¬e) P(¬b) P(¬e)
  Look up the values in Figure 14.2 and multiply.
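The factorization in Eq. 14.1 can be checked numerically. A minimal sketch in Python, using the standard CPT values from Fig 14.2 of the textbook; the function name and argument order are illustrative, not from the slides:

```python
# CPT values from Fig 14.2 (the burglary network).
P_b = 0.001                                          # P(Burglary)
P_e = 0.002                                          # P(Earthquake)
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(Alarm=true | B, E)
P_j = {True: 0.90, False: 0.05}                      # P(JohnCalls=true | Alarm)
P_m = {True: 0.70, False: 0.01}                      # P(MaryCalls=true | Alarm)

def joint(j, m, a, b, e):
    """One entry of the joint, via Eq. 14.1: multiply one CPT entry per node."""
    p = (P_b if b else 1 - P_b) * (P_e if e else 1 - P_e)
    p *= P_a[(b, e)] if a else 1 - P_a[(b, e)]
    p *= P_j[a] if j else 1 - P_j[a]
    p *= P_m[a] if m else 1 - P_m[a]
    return p

# P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = P(j|a) P(m|a) P(a|¬b,¬e) P(¬b) P(¬e)
print(joint(True, True, True, False, False))   # ≈ 0.000628
```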

Slide 8: A method for constructing Bayesian networks
- Eq. 14.1 defines what a given BN means, and it also implies certain conditional independence relationships that can be used to guide construction.
- By the chain rule, P(x_1, …, x_n) = P(x_n | x_{n-1}, …, x_1) P(x_{n-1}, …, x_1); continuing the expansion for P(x_{n-1}, …, x_1) and comparing with Eq. 14.1 gives (14.2):
  P(X_i | X_{i-1}, …, X_1) = P(X_i | Parents(X_i))   (14.2)
- The BN is a correct representation of the domain only if each node is conditionally independent of its predecessors in the node ordering, given its parents. E.g., P(M | J, A, E, B) = P(M | A).

Slide 9: Incremental network construction
1. Choose the relevant variables describing the domain.
2. Choose an ordering for the variables.
3. While there are variables left:
   a. Pick a variable and add a node for it to the network.
   b. Set its parents to some minimal set of nodes already in the network so that Eq. 14.2 is satisfied.
   c. Define the CPT for the variable.
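The loop above can be sketched in a few lines. This is an illustrative sketch only: the causal ordering and the hand-chosen parent sets below are assumed from the Fig 14.2 burglary network, and step (c), defining each CPT, is left as a stub:

```python
# Causal ordering and parent sets assumed from the burglary network (Fig 14.2).
ordering = ["Burglary", "Earthquake", "Alarm", "JohnCalls", "MaryCalls"]
chosen_parents = {
    "Burglary": [], "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"],
}

network = {}  # node -> parents, filled in insertion order
for var in ordering:
    # Every parent must already be in the network, i.e. precede var
    # in the ordering; this is what makes Eq. 14.2 satisfiable.
    assert all(p in network for p in chosen_parents[var])
    network[var] = chosen_parents[var]
    # (defining the CPT for var would happen here)

print(network["Alarm"])   # ['Burglary', 'Earthquake']
```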

Slide 10: Compactness
- A Bayesian network can often be far more compact than the full joint distribution.
- In a locally structured system, each subcomponent interacts directly with only a bounded number of other components.
- Local structure is usually associated with linear rather than exponential growth in complexity.
- With n = 30 nodes, each directly influenced by at most k = 5 others, what is the difference between the BN and the full joint? The BN needs n * 2^k = 30 * 2^5 numbers, versus 2^30 entries for the joint.
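The n * 2^k versus 2^n comparison from the slide, worked out:

```python
n, k = 30, 5
bn_size = n * 2**k      # numbers needed by the BN: one CPT row per conditioning case
joint_size = 2**n       # entries in the full joint over n Boolean variables
print(bn_size, joint_size)   # 960 vs 1073741824
```

Fewer than a thousand numbers against roughly a billion, which is the point of exploiting local structure.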

Slide 11: Node ordering
- The correct order in which to add nodes is "root causes" first, then the variables they influence, and so on until we reach the leaves, which have no direct causal influence on the other variables. Domain knowledge helps!
- What if we happen to choose the wrong order? Fig 14.3 shows an example.
- If we stick to a true causal model, we end up having to specify fewer numbers, and the numbers will often be easier to come up with.

Slide 12: Conditional independence relations
- When designing inference algorithms, we need to know whether more general conditional independences hold: given a network, can we tell if a set of nodes X is independent of another set Y, given a set of evidence nodes E?
- One answer rests on the concept of non-descendants: a node is conditionally independent of its non-descendants, given its parents. In Fig 14.2, JohnCalls is independent of Burglary and Earthquake, given Alarm.
- More generally, a node is conditionally independent of all other nodes in the network, given its parents, children, and children's parents: its Markov blanket. Burglary is independent of JohnCalls and MaryCalls, given Alarm and Earthquake.
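The Markov blanket is easy to compute from the edge list. A minimal sketch on the Fig 14.2 burglary network; the edge representation and function name are illustrative:

```python
# Directed edges of the burglary network (Fig 14.2): parent -> child.
edges = [("Burglary", "Alarm"), ("Earthquake", "Alarm"),
         ("Alarm", "JohnCalls"), ("Alarm", "MaryCalls")]

def markov_blanket(x):
    """Parents, children, and children's other parents of node x."""
    parents   = {p for p, c in edges if c == x}
    children  = {c for p, c in edges if p == x}
    coparents = {p for p, c in edges if c in children} - {x}
    return parents | children | coparents

print(markov_blanket("Burglary"))   # {'Alarm', 'Earthquake'}
```

This reproduces the slide's claim: given Alarm (its child) and Earthquake (a co-parent), Burglary is independent of everything else.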

Slide 13: Representation of CPTs
- Given canonical distributions, a complete table can be specified just by naming the distribution and giving a few parameters.
- A deterministic node has its value specified exactly by the values of its parents.
- Uncertain relationships can often be characterized by "noisy" logical relationships, e.g. noisy-OR (page 500).
- Example (page 501): determining the conditional probabilities starting from P(¬fever), given the individual inhibition probabilities for cold, flu, and malaria:
  P(¬fever | c, ¬f, ¬m) = 0.6,  P(¬fever | ¬c, f, ¬m) = 0.2,  P(¬fever | ¬c, ¬f, m) = 0.1.
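Under noisy-OR, each cause that is present fails to produce the effect independently, so P(¬fever | parents) is the product of the inhibition probabilities of the causes that are present. A minimal sketch using the slide's three numbers; the function name is illustrative:

```python
# Inhibition probabilities from the slide's fever example.
q = {"cold": 0.6, "flu": 0.2, "malaria": 0.1}

def p_no_fever(present):
    """Noisy-OR: P(¬fever | parents) = product of q_i over causes that are present."""
    p = 1.0
    for cause, qi in q.items():
        if cause in present:
            p *= qi
    return p

print(p_no_fever({"cold", "flu", "malaria"}))   # ≈ 0.012 (= 0.6 * 0.2 * 0.1)
```

Three parameters thus determine the whole eight-row CPT, instead of specifying each row separately.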

Slide 14: Inference in Bayesian networks
- Exact inference
  - Inference by enumeration
  - The variable elimination algorithm
  - The complexity of exact inference
  - Clustering algorithms
- Approximate inference
  - Direct sampling methods: rejection sampling, likelihood weighting
  - Inference by Markov chain simulation
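The simplest of these, inference by enumeration, just sums the joint over the hidden variables and normalizes. A minimal self-contained sketch for the query P(Burglary | JohnCalls, MaryCalls) on the Fig 14.2 network; the function names are illustrative:

```python
from itertools import product

# CPT values from Fig 14.2 (the burglary network).
P_b, P_e = 0.001, 0.002
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(Alarm=true | B, E)
P_j = {True: 0.90, False: 0.05}                      # P(JohnCalls=true | Alarm)
P_m = {True: 0.70, False: 0.01}                      # P(MaryCalls=true | Alarm)

def joint_entry(b, e, a, j, m):
    """One entry of the full joint, via the factorization of Eq. 14.1."""
    p = (P_b if b else 1 - P_b) * (P_e if e else 1 - P_e)
    p *= P_a[(b, e)] if a else 1 - P_a[(b, e)]
    p *= P_j[a] if j else 1 - P_j[a]
    p *= P_m[a] if m else 1 - P_m[a]
    return p

def query_burglary(j, m):
    """P(Burglary | JohnCalls=j, MaryCalls=m): sum out E and A, then normalize."""
    unnorm = {b: sum(joint_entry(b, e, a, j, m)
                     for e, a in product((True, False), repeat=2))
              for b in (True, False)}
    total = sum(unnorm.values())
    return {b: p / total for b, p in unnorm.items()}

print(round(query_burglary(True, True)[True], 3))   # the textbook's ~0.284
```

Enumeration touches every assignment of the hidden variables, which is why the smarter methods listed above (variable elimination, sampling) matter on larger networks.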

Slide 15: Knowledge engineering for uncertain reasoning
1. Decide what to talk about.
2. Decide on a vocabulary of random variables.
3. Encode general knowledge about the dependencies.
4. Encode a description of the specific problem instance.
5. Pose queries to the inference procedure and get answers.

Slide 16: Other approaches to uncertain reasoning
Different generations of expert systems have used:
- strict logical reasoning (ignoring uncertainty);
- probabilistic techniques using the full joint distribution;
- default reasoning: a conclusion is believed until a better reason is found to believe something else;
- rules with certainty factors;
- Dempster-Shafer theory, for handling ignorance;
- fuzzy logic, for vagueness: something can be "sort of" true.
Probability makes the same ontological commitment as logic: the event is either true or false.

Slide 17: Default reasoning
- The "a car has four wheels" conclusion is reached by default. New evidence can cause the conclusion to be retracted, whereas FOL is strictly monotonic.
- Representatives: default logic, nonmonotonic logic, circumscription.
- There are problematic issues; details in Chapter 10.

Slide 18: Rule-based methods
Logical reasoning systems have properties such as:
- monotonicity
- locality
- detachment
- truth-functionality
These properties bring obvious computational advantages, but they are inappropriate for uncertain reasoning.

Slide 19: Summary
- Reasoning properly: in FOL, it means conclusions follow from premises; in probability, it means holding beliefs that allow an agent to act rationally.
- Conditional independence information is vital.
- A Bayesian network is a complete representation of the JPD, yet often exponentially smaller in size.
- Bayesian networks can reason causally, diagnostically, intercausally, or by combining two or more of these modes.
- For polytrees (singly connected networks), inference time is linear in network size.