
Belief Networks CS121 – Winter 2003

Other Names Bayesian networks, probabilistic networks, causal networks

Probabilistic Belief There are several possible worlds that are indistinguishable to an agent given some prior evidence. The agent believes that a logic sentence B is True with probability p and False with probability 1−p; B is called a belief. In the frequency interpretation of probabilities, this means that the agent believes that the fraction of possible worlds that satisfy B is p. The distribution (p, 1−p) is the strength of B.

Problem At a certain time t, the KB of an agent is some collection of beliefs. At time t the agent's sensors make an observation that changes the strength of one of its beliefs. How should the agent update the strength of its other beliefs?

Toothache Example A certain dentist is only interested in two things about any patient: whether he has a toothache and whether he has a cavity. Over years of practice, she has constructed the following joint distribution:

          Toothache   ¬Toothache
Cavity       0.04        0.06
¬Cavity      0.01        0.89

Toothache Example

          Toothache   ¬Toothache
Cavity       0.04        0.06
¬Cavity      0.01        0.89

Using the joint distribution, the dentist can compute the strength of any logic sentence built with the propositions Toothache and Cavity. In particular, this distribution implies that the prior probability of Toothache is 0.05: P(T) = P((T∧C) ∨ (T∧¬C)) = P(T∧C) + P(T∧¬C) = 0.04 + 0.01 = 0.05
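Such marginalizations are easy to check mechanically: sum the joint entries in which the query proposition holds. A minimal Python sketch of the table above (the dictionary encoding is my own, not from the lecture):

```python
# Joint distribution over (Cavity, Toothache) from the table above.
joint = {
    (True,  True): 0.04, (True,  False): 0.06,   # Cavity row
    (False, True): 0.01, (False, False): 0.89,   # ¬Cavity row
}

# P(Toothache): sum out Cavity.
p_toothache = sum(p for (c, t), p in joint.items() if t)
print(p_toothache)  # 0.05
```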

New Evidence

          Toothache   ¬Toothache
Cavity       0.04        0.06
¬Cavity      0.01        0.89

She now makes an observation E that indicates that a specific patient x has a high probability (0.8) of having a toothache, but that is not directly related to whether he has a cavity.

Adjusting Joint Distribution

           Toothache|E      ¬Toothache|E
Cavity|E     0.04 → 0.64      0.06 → 0.0126
¬Cavity|E    0.01 → 0.16      0.89 → 0.1874

She can use this additional information to create a joint distribution (specific to x) conditional on E, by keeping the same probability ratios between Cavity and ¬Cavity within each column. The probability of Cavity, which was 0.1, is now (knowing E) 0.6526.
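The rescaling is mechanical: multiply each Toothache column by (new column total) / (old column total), which leaves the Cavity/¬Cavity ratios unchanged. A self-contained sketch of that computation (again, the encoding is illustrative, not from the lecture):

```python
joint = {
    (True,  True): 0.04, (True,  False): 0.06,   # (Cavity, Toothache)
    (False, True): 0.01, (False, False): 0.89,
}
p_t = sum(p for (c, t), p in joint.items() if t)          # old P(T) = 0.05
p_t_given_e = 0.8                                         # new P(T|E)

# Scale each Toothache column, keeping Cavity/¬Cavity ratios.
adjusted = {
    (c, t): p * (p_t_given_e / p_t if t else (1 - p_t_given_e) / (1 - p_t))
    for (c, t), p in joint.items()
}
print(adjusted[(True, True)])                             # 0.64
print(sum(p for (c, t), p in adjusted.items() if c))      # P(Cavity|E) ≈ 0.6526
```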

Corresponding Calculus

          Toothache   ¬Toothache
Cavity       0.04        0.06
¬Cavity      0.01        0.89

P(C|T) = P(C∧T)/P(T) = 0.04/0.05

Corresponding Calculus

P(C|T) = P(C∧T)/P(T) = 0.04/0.05
P(C∧T|E) = P(C|T,E) P(T|E) = P(C|T) P(T|E), since C and E are independent given T

Corresponding Calculus

           Toothache|E   ¬Toothache|E
Cavity|E      0.64          0.0126
¬Cavity|E     0.16          0.1874

P(C|T) = P(C∧T)/P(T) = 0.04/0.05
P(C∧T|E) = P(C|T,E) P(T|E) = P(C|T) P(T|E) = (0.04/0.05) × 0.8 = 0.64

Generalization n beliefs X1, …, Xn. The joint distribution can be used to update probabilities when new evidence arrives. But: the joint distribution contains 2^n probabilities, and useful independence is not made explicit.

Purpose of Belief Networks Facilitate the description of a collection of beliefs by making explicit causality relations and conditional independence among beliefs. Provide a more efficient way (than by using joint distribution tables) to update belief strengths when new evidence is observed.

Alarm Example Five beliefs: A: Alarm, B: Burglary, E: Earthquake, J: JohnCalls, M: MaryCalls

A Simple Belief Network [Diagram: Burglary → Alarm ← Earthquake; Alarm → JohnCalls, Alarm → MaryCalls; causes at the top, effects at the bottom] Intuitive meaning of an arrow from x to y: "x has direct influence on y". The network is a directed acyclic graph (DAG) whose nodes are beliefs.

Assigning Probabilities to Roots [Same network diagram] The root nodes Burglary and Earthquake get prior probabilities: P(B) = 0.001, P(E) = 0.002.

Conditional Probability Tables P(B) = 0.001, P(E) = 0.002

B  E  P(A|B,E)
T  T  0.95
T  F  0.94
F  T  0.29
F  F  0.001

Size of the CPT for a node with k parents: 2^k

Conditional Probability Tables P(B) = 0.001, P(E) = 0.002

B  E  P(A|B,E)
T  T  0.95
T  F  0.94
F  T  0.29
F  F  0.001

A  P(J|A)      A  P(M|A)
T  0.90        T  0.70
F  0.05        F  0.01

What the BN Means [Network diagram with all CPTs attached]

P(x1, x2, …, xn) = Πi=1,…,n P(xi | Parents(Xi))

Calculation of Joint Probability

P(J∧M∧A∧¬B∧¬E) = P(J|A) P(M|A) P(A|¬B,¬E) P(¬B) P(¬E)
              = 0.9 × 0.7 × 0.001 × 0.999 × 0.998 ≈ 0.00062
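This product rule is a one-liner once the CPTs are encoded. A minimal Python sketch of the alarm network (the dictionary encoding and function names are mine):

```python
# CPTs for the alarm network from the slides above.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A=true | B, E)
P_J = {True: 0.90, False: 0.05}                      # P(J=true | A)
P_M = {True: 0.70, False: 0.01}                      # P(M=true | A)

def bern(p, v):
    """P(X=v) for a boolean X with P(X=true) = p."""
    return p if v else 1 - p

def joint(b, e, a, j, m):
    """P(b,e,a,j,m) = P(b) P(e) P(a|b,e) P(j|a) P(m|a)."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a) *
            bern(P_J[a], j) * bern(P_M[a], m))

print(joint(False, False, True, True, True))  # ≈ 0.00062
```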

What The BN Encodes [Network diagram] Each of the beliefs JohnCalls and MaryCalls is independent of Burglary and Earthquake given Alarm or ¬Alarm; for example, John does not observe any burglaries directly. The beliefs JohnCalls and MaryCalls are independent given Alarm or ¬Alarm.

What The BN Encodes [Network diagram] The beliefs JohnCalls and MaryCalls are independent given Alarm or ¬Alarm: for instance, the reasons why John and Mary may not call if there is an alarm are unrelated. Note that these reasons could be other beliefs in the network; the probabilities summarize these non-explicit beliefs.

Inference In BN Set E of evidence variables that are observed with a new probability distribution, e.g., {JohnCalls, MaryCalls}. Query variable X, e.g., Burglary, for which we would like to know the posterior probability distribution P(X|E), i.e., the distribution conditional on the observations made:

J  M  P(B|…)
T  T  ?
T  F  ?
F  T  ?
F  F  ?
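One direct (if exponential) way to fill in this table is inference by enumeration: sum the joint over the unobserved variables E and A, then normalize. A self-contained sketch under the CPT encoding used earlier; enumeration is just one inference method, and the lecture does not prescribe this particular code:

```python
from itertools import product

P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def bern(p, v):
    return p if v else 1 - p

def joint(b, e, a, j, m):
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a) *
            bern(P_J[a], j) * bern(P_M[a], m))

def posterior_burglary(j, m):
    """P(Burglary | JohnCalls=j, MaryCalls=m), summing out E and A."""
    score = {b: sum(joint(b, e, a, j, m)
                    for e, a in product((True, False), repeat=2))
             for b in (True, False)}
    return score[True] / (score[True] + score[False])

print(posterior_burglary(True, True))  # ≈ 0.284
```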

Inference Patterns Basic use of a BN: given new observations, compute the new strengths of some (or all) beliefs. Other use: given the strength of a belief, which observation should we gather to make the greatest change in this belief's strength? [Four copies of the network diagram illustrating the inference patterns: diagnostic, causal, intercausal, and mixed]

Applications http://excalibur.brc.uconn.edu/~baynet/researchApps.html Medical diagnosis, e.g., lymph-node diseases. Fraud/uncollectible debt detection. Troubleshooting of hardware/software systems.

Neural Networks CS121 – Winter 2003

Function-Learning Formulation Goal function f. Training set: (xi, f(xi)), i = 1, …, n. Inductive inference: find a function h that fits the points well. Issues: representation, incremental learning. Neural nets.

Unit (Neuron) [Diagram: inputs x0, …, xn with weights wi feeding a summation Σ and an activation function g that produces output y]

y = g(Σi=1,…,n wi xi), where g(u) = 1/[1 + exp(−a·u)]
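As a concrete illustration, here is a minimal sketch of such a unit in Python (the function names and sample weights are mine, for demonstration only):

```python
import math

def g(u, a=1.0):
    """Sigmoid activation: g(u) = 1 / (1 + exp(-a*u))."""
    return 1.0 / (1.0 + math.exp(-a * u))

def unit(weights, inputs, a=1.0):
    """Output of one neuron: y = g(sum_i w_i * x_i)."""
    return g(sum(w * x for w, x in zip(weights, inputs)), a)

# Example with a bias input x0 = 1 and two real-valued inputs.
print(unit([0.5, -1.0, 2.0], [1.0, 0.3, 0.8]))  # ~0.86
```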

Particular Case: Perceptron [Diagram: the same unit, y = g(Σi=1,…,n wi xi), separating + examples from − examples with a linear boundary]

Particular Case: Perceptron [Diagram: + and − examples that no linear boundary can separate, marked "?"] y = g(Σi=1,…,n wi xi)

Neural Network Network of interconnected neurons [Diagram: the output of one unit feeding the inputs of others] Acyclic (feed-forward) vs. recurrent networks.

Two-Layer Feed-Forward Neural Network [Diagram: inputs, one hidden layer, output]

Backpropagation (Principle) New example: Yk = f(xk). Error function: E(w) = ||yk − Yk||². Weight update: wij(k) = wij(k−1) − ε ∂E/∂wij. Backpropagation: update the weights of the inputs to the last layer first, then the weights of the inputs to the previous layer, etc.
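Below is a self-contained sketch of this procedure for a small two-layer sigmoid network, trained here on XOR. The network size, learning rate, random initialization, and the E = ½(y − t)² convention are illustrative choices, not specifics from the lecture:

```python
import math, random

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def forward(w_h, w_o, x):
    """Forward pass: inputs -> hidden layer -> single output."""
    xb = x + [1.0]                                   # append bias input
    h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
    y = sigmoid(sum(w * v for w, v in zip(w_o, h + [1.0])))
    return h, y

def train_step(w_h, w_o, x, t, eps=0.5):
    """One backpropagation step on E = 1/2 (y - t)^2:
    update the output weights first, then the hidden weights."""
    xb = x + [1.0]
    h, y = forward(w_h, w_o, x)
    d_o = (y - t) * y * (1 - y)                      # output-layer delta
    d_h = [d_o * w_o[j] * h[j] * (1 - h[j]) for j in range(len(h))]
    for j, v in enumerate(h + [1.0]):                # w <- w - eps * dE/dw
        w_o[j] -= eps * d_o * v
    for j in range(len(w_h)):
        for i, v in enumerate(xb):
            w_h[j][i] -= eps * d_h[j] * v

random.seed(1)
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]  # 3 hidden units
w_o = [random.uniform(-1, 1) for _ in range(4)]                      # 3 hidden + bias
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]          # XOR
for _ in range(10000):
    for x, t in data:
        train_step(w_h, w_o, x, t)
for x, t in data:
    print(x, t, round(forward(w_h, w_o, x)[1], 2))   # outputs should approach targets
```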

Issues How to choose the size and structure of the network? If the network is too large, there is a risk of over-fitting (data caching); if it is too small, the representation may not be rich enough. Role of representation: e.g., learning the concept of an odd number.

What is AI? The discipline that systematizes and automates intellectual tasks to create machines that: act like humans, act rationally, think like humans, think rationally.

What Have We Learned? A collection of useful methods. Connections between fields. The relation between high-level (e.g., logic) and low-level (e.g., neural networks) representations. The impact of hardware. What is intelligence? Our techniques are better than our understanding.