BAYESIAN NETWORKS Ivan Bratko Faculty of Computer and Information Sc. University of Ljubljana

BAYESIAN NETWORKS
- Bayesian networks, or belief networks: an approach to handling uncertainty in knowledge-based systems
- Mathematically well-founded in probability theory, unlike many other, earlier approaches to representing uncertain knowledge
- Type of problems intended for belief nets: given that some things are known to be true, how likely are some other events?

BURGLARY EXAMPLE
- We have an alarm system to warn about burglary.
- We have received an automatic alarm phone call; how likely is it that there actually was a burglary?
- We cannot tell for sure whether there was a burglary, but we can characterize it probabilistically instead.

BURGLARY EXAMPLE
- There are a number of events involved:
  - burglary
  - sensor that may be triggered by the burglar
  - lightning that may also trigger the sensor
  - alarm that may be triggered by the sensor
  - call that may be triggered by the sensor

BAYES NET REPRESENTATION
- There are variables (e.g. burglary, alarm) that can take values (e.g. alarm = true, burglary = false).
- There are probabilistic relations among variables, e.g.: if burglary = true then it is more likely that alarm = true.

EXAMPLE BAYES NET
[Network diagram: burglary → sensor ← lightning; sensor → alarm; sensor → call]

PROBABILISTIC DEPENDENCIES AND CAUSALITY
- Belief networks define probabilistic dependencies (and independencies) among the variables.
- They may also reflect causality (the burglar triggers the sensor).

EXAMPLE OF REASONING IN A BELIEF NETWORK
- In a normal situation, burglary is not very likely.
- We receive an automatic warning call; since the sensor causes the warning call, the probability of the sensor being on increases; since burglary is a cause for triggering the sensor, the probability of burglary increases.
- Then we learn there was a storm. Lightning may also trigger the sensor. Since lightning now also explains how the call happened, the probability of burglary decreases.

TERMINOLOGY Bayes network = belief network = probabilistic network = causal network

BAYES NETWORKS, DEFINITION
- A Bayes net is a DAG (directed acyclic graph)
- Nodes ~ random variables
- A link X → Y intuitively means: "X has a direct influence on Y"
- For each node: a conditional probability table quantifying the effects of its parent nodes

MAJOR PROBLEM IN HANDLING UNCERTAINTY
- In general, with uncertainty, the problem is the handling of dependencies between events.
- In principle, this can be handled by specifying the complete probability distribution over all possible combinations of variable values.
- However, this is impractical or impossible: for n binary variables, 2^n - 1 probabilities - too many!
- Belief networks usually allow this number to be reduced substantially in practice.

BURGLARY DOMAIN
- Five events: B, L, S, A, C
- Complete probability distribution:
  p( B L S A C) = ...
  p( ~B L S A C) = ...
  p( ~B ~L S A C) = ...
  p( ~B L ~S A C) = ...
  ...
- Total: 32 probabilities
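
To make the count concrete, here is a minimal Python sketch (the variable names are only illustrative) that enumerates all value combinations of the five binary events:

from itertools import product

variables = ["B", "L", "S", "A", "C"]   # burglary, lightning, sensor, alarm, call
assignments = list(product([True, False], repeat=len(variables)))
print(len(assignments))                 # 32 value combinations, i.e. 2**5
# Since the 32 probabilities must sum to 1, only 2**5 - 1 = 31 of them are independent.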

WHY DID BELIEF NETS BECOME SO POPULAR?
- If some things are mutually independent then not all conditional probabilities are needed.
- p(X ∧ Y) = p(X) p(Y|X); p(Y|X) needed
- If X and Y are independent: p(X ∧ Y) = p(X) p(Y); p(Y|X) not needed!
- Belief networks provide an elegant way of stating independences.

EXAMPLE FROM J. PEARL
[Network diagram: Burglary → Alarm ← Earthquake; Alarm → John calls; Alarm → Mary calls]
- Burglary causes alarm
- Earthquake causes alarm
- When they hear the alarm, neighbours John and Mary phone
- Occasionally John confuses the phone ring for the alarm
- Occasionally Mary fails to hear the alarm

PROBABILITIES
P(B) = 0.001, P(E) = ...

A | P(J | A)
T | 0.90
F | 0.05

A | P(M | A)
T | 0.70
F | 0.01

B E | P(A | B E)
T T | 0.95
T F | 0.95
F T | 0.29
F F | 0.001
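
As an illustration, the tables above can be written down as plain Python dictionaries and combined by the chain rule to obtain any entry of the full joint distribution. This is only a sketch: the value of P(E) did not survive in the table above, so 0.002 (the value used in the standard textbook version of this example) is assumed here.

# CPTs of the Pearl burglary/earthquake network as plain Python dicts.
P_B = 0.001
P_E = 0.002                       # assumed; missing in the table above
P_A = {(True, True): 0.95,        # P(A=T | B, E)
       (True, False): 0.95,
       (False, True): 0.29,
       (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}   # P(J=T | A)
P_M = {True: 0.70, False: 0.01}   # P(M=T | A)

def joint(b, e, a, j, m):
    """Probability of one complete assignment, by the chain rule over the network."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= (P_J[a] if j else 1 - P_J[a]) * (P_M[a] if m else 1 - P_M[a])
    return p

# e.g. probability that the alarm sounds and both neighbours call,
# with neither burglary nor earthquake:
print(joint(False, False, True, True, True))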

HOW ARE INDEPENDENCIES STATED IN BELIEF NETS
[Diagram: chain A → B → C → D]
If C is known to be true, then the probability of D is independent of A and B:
p( D | A ∧ B ∧ C) = p( D | C)

A1, A2, ... : non-descendants of C
B1, B2, ... : parents of C
D1, D2, ... : descendants of C

C is independent of C's non-descendants given C's parents:
p( C | A1, ..., B1, ..., D1, ...) = p( C | B1, ..., D1, ...)

INDEPENDENCE ON NONDESCENDANTS REQUIRES CARE: EXAMPLE
[Diagram: a → b → c → d; e → d; e → f. Here b is a parent of c; a, e and f are nondescendants of c; d is a descendant of c]
By applying the rule about nondescendants: p(c | a ∧ b) = p(c | b)
Because: c is independent of its nondescendant a given its parents (node b).

INDEPENDENCE ON NONDESCENDANTS REQUIRES CARE
But, for this Bayesian network: p(c | b ∧ d ∧ f) ≠ p(c | b ∧ d)
Although f is c's nondescendant, it cannot be ignored: knowing f, e becomes more likely; e may also cause d, so when e becomes more likely, c becomes less likely. The problem is that the descendant d is given.

SAFER FORMULATION OF INDEPENDENCE C is independent of C's nondescendants given C's parents (only) and not C's descendants.

STATING PROBABILITIES IN BELIEF NETS
For each node X with parents Y1, Y2, ..., specify conditional probabilities of the form
p( X | Y1 ∧ Y2 ∧ ...) for all possible states of Y1, Y2, ...

[Diagram: Y1 → X ← Y2]

Specify:
p( X | Y1 ∧ Y2)
p( X | ~Y1 ∧ Y2)
p( X | Y1 ∧ ~Y2)
p( X | ~Y1 ∧ ~Y2)

BURGLARY EXAMPLE
p(burglary) = ...
p(lightning) = 0.02
p(sensor | burglary ∧ lightning) = 0.9
p(sensor | burglary ∧ ~lightning) = 0.9
p(sensor | ~burglary ∧ lightning) = 0.1
p(sensor | ~burglary ∧ ~lightning) = ...
p(alarm | sensor) = 0.95
p(alarm | ~sensor) = ...
p(call | sensor) = 0.9
p(call | ~sensor) = 0.0

BURGLARY EXAMPLE
10 numbers plus the structure of the network are equivalent to the 2^5 - 1 = 31 numbers required to specify the complete probability distribution (without structure information).
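
The count of 10 follows from the structure alone: each binary node needs one conditional probability per combination of its parents' values, i.e. 2^(number of parents) numbers. A small sketch of this bookkeeping in Python:

parents = {
    "burglary": [],
    "lightning": [],
    "sensor": ["burglary", "lightning"],
    "alarm": ["sensor"],
    "call": ["sensor"],
}
needed = sum(2 ** len(ps) for ps in parents.values())
print(needed)                      # 1 + 1 + 4 + 2 + 2 = 10
print(2 ** len(parents) - 1)       # 31 for the unstructured joint distribution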

EXAMPLE QUERIES FOR BELIEF NETWORKS
- p( burglary | alarm) = ?
- p( burglary ∧ lightning) = ?
- p( burglary | alarm ∧ ~lightning) = ?
- p( alarm ∧ ~call | burglary) = ?

Probabilistic reasoning in belief nets
Easy in the forward direction, from ancestors to descendants, e.g.:
p( alarm | burglary ∧ lightning) = ?
In the backward direction, from descendants to ancestors, apply Bayes' formula:
p( B | A) = p(B) * p(A | B) / p(A)

BAYES' FORMULA
A variant of Bayes' formula to reason about the probability of hypothesis H given evidence E in the presence of background knowledge B:
p( H | E ∧ B) = p( H | B) * p( E | H ∧ B) / p( E | B)

REASONING RULES
1. Probability of conjunction:
   p( X1 ∧ X2 | Cond) = p( X1 | Cond) * p( X2 | X1 ∧ Cond)
2. Probability of a certain event:
   p( X | Y1 ∧ ... ∧ X ∧ ...) = 1
3. Probability of an impossible event:
   p( X | Y1 ∧ ... ∧ ~X ∧ ...) = 0
4. Probability of negation:
   p( ~X | Cond) = 1 - p( X | Cond)

5. If the condition involves a descendant of X then use Bayes' theorem:
   If Cond0 = Y ∧ Cond, where Y is a descendant of X in the belief net, then
   p(X | Cond0) = p(X | Cond) * p(Y | X ∧ Cond) / p(Y | Cond)
6. Cases where the condition Cond does not involve a descendant of X:
   (a) If X has no parents then p(X | Cond) = p(X), where p(X) is given.
   (b) If X has parents Parents then sum over the possible states S of the parents:
   p(X | Cond) = sum over S of p(X | S) * p(S | Cond)

A SIMPLE IMPLEMENTATION IN PROLOG
In: I. Bratko, Prolog Programming for Artificial Intelligence, Third edition, Pearson Education 2001 (Chapter 15).
An interaction with this program:
?- prob( burglary, [call], P).
P = ...
Now we learn there was a heavy storm, so:
?- prob( burglary, [call, lightning], P).
P = ...

Lightning explains the call, so burglary seems less likely. However, if the weather was fine then burglary becomes more likely:
?- prob( burglary, [call, not lightning], P).
P = ...
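
For readers without the Prolog program at hand, the same queries can be answered by brute-force enumeration of the full joint distribution. The sketch below is not the book's implementation; it is a rough Python equivalent, and the three numbers that did not survive in the slides above (p(burglary), p(sensor | ~burglary ∧ ~lightning), p(alarm | ~sensor)) are filled with assumed placeholder values, so the printed results will not exactly match the book's output.

from itertools import product

# Burglary network from the slides, as parent lists and CPTs.
parents = {"burglary": [], "lightning": [],
           "sensor": ["burglary", "lightning"],
           "alarm": ["sensor"], "call": ["sensor"]}

cpt = {
    "burglary":  {(): 0.001},                        # assumed placeholder
    "lightning": {(): 0.02},
    "sensor":    {(True, True): 0.9, (True, False): 0.9,
                  (False, True): 0.1, (False, False): 0.001},  # last entry assumed
    "alarm":     {(True,): 0.95, (False,): 0.001},   # second entry assumed
    "call":      {(True,): 0.9,  (False,): 0.0},
}

order = list(parents)   # already a topological order

def joint(assign):
    """Full-joint probability of a complete assignment (dict var -> bool)."""
    p = 1.0
    for v in order:
        pt = cpt[v][tuple(assign[par] for par in parents[v])]
        p *= pt if assign[v] else 1.0 - pt
    return p

def prob(query, evidence):
    """p(query = true | evidence) by summing the full joint over the unknowns."""
    num = den = 0.0
    for values in product([True, False], repeat=len(order)):
        assign = dict(zip(order, values))
        if all(assign[v] == val for v, val in evidence.items()):
            p = joint(assign)
            den += p
            if assign[query]:
                num += p
    return num / den

print(prob("burglary", {"call": True}))
print(prob("burglary", {"call": True, "lightning": True}))
print(prob("burglary", {"call": True, "lightning": False}))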

COMMENTS
- The complexity of reasoning in belief networks grows exponentially with the number of nodes.
- Substantial algorithmic improvements are required for efficient reasoning in large networks.

d-SEPARATION
- Follows from the basic independence assumption of Bayes networks
- d-separation = direction-dependent separation
- Let E = a set of "evidence nodes" (a subset of the variables in the Bayes network)
- Let Vi, Vj be two variables in the network

d-SEPARATION
- Nodes Vi and Vj are conditionally independent given the set E if E d-separates Vi and Vj
- E d-separates Vi, Vj if all (undirected) paths between Vi and Vj are "blocked" by E
- If E d-separates Vi, Vj, then Vi and Vj are conditionally independent given E
- We write I(Vi, Vj | E)
- This means: p(Vi, Vj | E) = p(Vi | E) * p(Vj | E)

BLOCKING A PATH
A path between Vi and Vj is blocked by the nodes E if there is a "blocking node" Vb on the path. Vb blocks the path if one of the following holds:
- Vb is in E and both arcs on the path lead out of Vb, or
- Vb is in E and one arc on the path leads into Vb and one out, or
- neither Vb nor any descendant of Vb is in E, and both arcs on the path lead into Vb
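
A minimal Python sketch of this blocking test for a single given undirected path; the representation (a node sequence for the path, a set of directed edges, a map from each node to its descendants) is just one possible choice, not the standard formulation:

def blocks(path, edges, E, descendants):
    """True if some inner node Vb blocks the given undirected path.

    path: node sequence (Vi, ..., Vj); edges: set of directed (parent, child)
    pairs; E: set of evidence nodes; descendants: dict node -> set of descendants.
    """
    for i in range(1, len(path) - 1):
        prev, vb, nxt = path[i - 1], path[i], path[i + 1]
        into = ((prev, vb) in edges, (nxt, vb) in edges)   # does each arc point into Vb?
        if vb in E and into == (False, False):
            return True                                    # condition 1: common cause in E
        if vb in E and into in {(True, False), (False, True)}:
            return True                                    # condition 2: chain node in E
        if into == (True, True) and vb not in E and not (descendants.get(vb, set()) & E):
            return True                                    # condition 3: collider outside E
    return False

# Burglary network: burglary and lightning are d-separated by the empty set,
# because sensor is a collider (condition 3) on the only path between them;
# observing alarm (a descendant of sensor) unblocks that path.
edges = {("burglary", "sensor"), ("lightning", "sensor"),
         ("sensor", "alarm"), ("sensor", "call")}
desc = {"sensor": {"alarm", "call"}}
print(blocks(["burglary", "sensor", "lightning"], edges, set(), desc))      # True
print(blocks(["burglary", "sensor", "lightning"], edges, {"alarm"}, desc))  # False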

CONDITION 1
Vb is a common cause:
[Diagram: Vi ← Vb → Vj]

CONDITION 2
Vb is a "closer, more direct cause" of Vj than Vi is:
[Diagram: Vi → Vb → Vj]

CONDITION 3
Vb is a common consequence of Vi and Vj, but neither Vb nor any of its descendants is in E:
[Diagram: Vi → Vb ← Vj; Vb not in E; Vb has a descendant Vd, also not in E]