Belief Networks CS121 – Winter 2003
Other Names
Belief networks are also called Bayesian networks, probabilistic networks, or causal networks.
Probabilistic Belief
There are several possible worlds that are indistinguishable to an agent given some prior evidence. The agent believes that a logic sentence B is True with probability p and False with probability 1-p; B is called a belief. In the frequency interpretation of probabilities, this means that the agent believes the fraction of possible worlds that satisfy B is p. The distribution (p, 1-p) is the strength of B.
Problem
At a certain time t, the KB of an agent is some collection of beliefs. At time t the agent's sensors make an observation that changes the strength of one of its beliefs. How should the agent update the strengths of its other beliefs?
Toothache Example
A certain dentist is only interested in two things about any patient: whether he has a toothache and whether he has a cavity. Over years of practice, she has constructed the following joint distribution:

             Toothache   ¬Toothache
  Cavity       0.04        0.06
  ¬Cavity      0.01        0.89
Toothache Example

             Toothache   ¬Toothache
  Cavity       0.04        0.06
  ¬Cavity      0.01        0.89

Using the joint distribution, the dentist can compute the strength of any logic sentence built with the propositions Toothache and Cavity. In particular, this distribution implies that the prior probability of Toothache is 0.05:
  P(T) = P((T∧C) ∨ (T∧¬C)) = P(T∧C) + P(T∧¬C) = 0.04 + 0.01 = 0.05
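The marginalization above can be sketched in a few lines of Python; the dictionary layout and function name are illustrative, not from the slides:

```python
# Joint distribution over (Cavity, Toothache), with the values from the
# dentist's table above.
joint = {
    (True,  True):  0.04,  # Cavity and Toothache
    (True,  False): 0.06,  # Cavity, no Toothache
    (False, True):  0.01,  # no Cavity, Toothache
    (False, False): 0.89,  # neither
}

def marginal_toothache(joint):
    """P(T) = P(T,C) + P(T,~C): sum out Cavity from the joint."""
    return sum(p for (cavity, toothache), p in joint.items() if toothache)

print(marginal_toothache(joint))  # 0.05
```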
New Evidence

             Toothache   ¬Toothache
  Cavity       0.04        0.06
  ¬Cavity      0.01        0.89

She now makes an observation E indicating that a specific patient x has high probability (0.8) of having a toothache, but E is not directly related to whether he has a cavity.
Adjusting Joint Distribution
She can use this additional information to create a joint distribution (specific to x) conditional on E, by keeping the same probability ratios between Cavity and ¬Cavity within each column:

             Toothache|E   ¬Toothache|E
  Cavity|E      0.64          0.0126
  ¬Cavity|E     0.16          0.1874

The probability of Cavity, which was 0.1, is now (knowing E) 0.64 + 0.0126 = 0.6526.
Corresponding Calculus

             Toothache   ¬Toothache
  Cavity       0.04        0.06
  ¬Cavity      0.01        0.89

  P(C|T) = P(C∧T)/P(T) = 0.04/0.05 = 0.8
Corresponding Calculus

             Toothache|E   ¬Toothache|E
  Cavity|E      0.64          0.0126
  ¬Cavity|E     0.16          0.1874

  P(C|T) = P(C∧T)/P(T) = 0.04/0.05 = 0.8
  P(C∧T|E) = P(C|T,E) P(T|E) = P(C|T) P(T|E)
(C and E are independent given T)
Corresponding Calculus

             Toothache|E   ¬Toothache|E
  Cavity|E      0.64          0.0126
  ¬Cavity|E     0.16          0.1874

  P(C|T) = P(C∧T)/P(T) = 0.04/0.05 = 0.8
  P(C∧T|E) = P(C|T,E) P(T|E) = P(C|T) P(T|E) = (0.04/0.05) × 0.8 = 0.64
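The rescaling described above (keeping the Cavity ratios within each column while forcing P(Toothache|E) = 0.8, a form of Jeffrey's rule) can be sketched as follows; names are illustrative:

```python
# Rescale each column of the joint so that P(Toothache|E) matches the new
# evidence while the Cavity/¬Cavity ratios within each column are preserved.
joint = {
    (True,  True):  0.04, (True,  False): 0.06,   # (Cavity, Toothache): p
    (False, True):  0.01, (False, False): 0.89,
}

def condition_on_soft_evidence(joint, p_toothache_e):
    p_t = sum(p for (c, t), p in joint.items() if t)  # prior P(T) = 0.05
    new = {}
    for (c, t), p in joint.items():
        scale = p_toothache_e / p_t if t else (1 - p_toothache_e) / (1 - p_t)
        new[(c, t)] = p * scale
    return new

new_joint = condition_on_soft_evidence(joint, 0.8)
print(round(new_joint[(True, True)], 4))   # 0.64, matching the slide
print(round(new_joint[(True, True)] + new_joint[(True, False)], 4))  # 0.6526
```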
Generalization
With n beliefs X1,…,Xn, the joint distribution can be used to update probabilities when new evidence arrives. But:
- The joint distribution contains 2^n probabilities
- Useful independence is not made explicit
Purpose of Belief Networks
- Facilitate the description of a collection of beliefs by making explicit causality relations and conditional independence among beliefs
- Provide a more efficient way (than joint distribution tables) to update belief strengths when new evidence is observed
Alarm Example
Five beliefs:
  A: Alarm
  B: Burglary
  E: Earthquake
  J: JohnCalls
  M: MaryCalls
A Simple Belief Network
Burglary and Earthquake are causes of Alarm; JohnCalls and MaryCalls are its effects:

  Burglary     Earthquake
        \       /
         v     v
          Alarm
         /     \
        v       v
  JohnCalls   MaryCalls

Intuitive meaning of an arrow from x to y: “x has direct influence on y”. The network is a directed acyclic graph (DAG) whose nodes are beliefs.
Assigning Probabilities to Roots
The root nodes Burglary and Earthquake receive prior probabilities:
  P(B) = 0.001
  P(E) = 0.002
Conditional Probability Tables
  P(B) = 0.001    P(E) = 0.002

  B  E  P(A|B,E)
  T  T   0.95
  T  F   0.94
  F  T   0.29
  F  F   0.001

Size of the CPT for a node with k parents: 2^k
Conditional Probability Tables
  P(B) = 0.001    P(E) = 0.002

  B  E  P(A|B,E)      A  P(J|A)      A  P(M|A)
  T  T   0.95         T   0.90       T   0.70
  T  F   0.94         F   0.05       F   0.01
  F  T   0.29
  F  F   0.001
What the BN Means
  P(x1, x2, …, xn) = ∏i=1,…,n P(xi | Parents(Xi))
The full joint distribution is the product, over all nodes, of each node's CPT entry given its parents' values.
Calculation of Joint Probability
  P(J∧M∧A∧¬B∧¬E) = P(J|A) P(M|A) P(A|¬B,¬E) P(¬B) P(¬E)
                 = 0.9 × 0.7 × 0.001 × 0.999 × 0.998
                 ≈ 0.00063
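This product can be checked in code. The sketch below assumes the standard textbook CPT values for this alarm network (0.95, 0.94, 0.29, 0.001 for the alarm; 0.90/0.05 and 0.70/0.01 for the calls), which are consistent with the 0.9 and 0.7 used on the slide:

```python
# Alarm-network parameters (assumed standard textbook values).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A=T | B, E)
P_J = {True: 0.90, False: 0.05}                       # P(J=T | A)
P_M = {True: 0.70, False: 0.01}                       # P(M=T | A)

def pv(p, val):
    """Probability that a True/False variable with P(True)=p takes value val."""
    return p if val else 1 - p

def joint(j, m, a, b, e):
    """P(j,m,a,b,e) = P(j|a) P(m|a) P(a|b,e) P(b) P(e) — the BN factorization."""
    return (pv(P_J[a], j) * pv(P_M[a], m) * pv(P_A[(b, e)], a)
            * pv(P_B, b) * pv(P_E, e))

# P(John and Mary call, alarm rings, no burglary, no earthquake)
print(joint(True, True, True, False, False))  # ≈ 0.000628
```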
What the BN Encodes
Each of the beliefs JohnCalls and MaryCalls is independent of Burglary and Earthquake given Alarm or ¬Alarm. The beliefs JohnCalls and MaryCalls are independent given Alarm or ¬Alarm. For example, John does not observe any burglaries directly.
What the BN Encodes
For instance, the reasons why John and Mary may not call when there is an alarm are unrelated. Note that these reasons could be other beliefs in the network; the probabilities summarize these non-explicit beliefs.
Inference in BN
Set E of evidence variables that are observed with a new probability distribution, e.g., {JohnCalls, MaryCalls}. Query variable X, e.g., Burglary, for which we would like to know the posterior distribution P(X|E) — the distribution conditional on the observations made:

  J  M  P(B|J,M)
  T  T     ?
  T  F     ?
  F  T     ?
  F  F     ?
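A brute-force way to answer such a query is inference by enumeration: sum the joint over the unobserved variables and normalize. The sketch below again assumes the standard textbook CPT values for this network:

```python
from itertools import product

# Alarm-network parameters (assumed standard textbook values).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def pv(p, val):
    return p if val else 1 - p

def joint(j, m, a, b, e):
    """BN factorization: P(j|a) P(m|a) P(a|b,e) P(b) P(e)."""
    return (pv(P_J[a], j) * pv(P_M[a], m) * pv(P_A[(b, e)], a)
            * pv(P_B, b) * pv(P_E, e))

def posterior_burglary(j, m):
    """P(Burglary | JohnCalls=j, MaryCalls=m), summing out Alarm and Earthquake."""
    scores = {}
    for b in (True, False):
        scores[b] = sum(joint(j, m, a, b, e)
                        for a, e in product((True, False), repeat=2))
    return scores[True] / (scores[True] + scores[False])

print(round(posterior_burglary(True, True), 3))  # ≈ 0.284
```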
Inference Patterns
Basic use of a BN: given new observations, compute the new strengths of some (or all) beliefs. Other use: given the strength of a belief, which observation should we gather to make the greatest change in this belief's strength?
Four patterns of reasoning over the Burglary–Earthquake–Alarm network: diagnostic (from effects to causes), causal (from causes to effects), intercausal (between causes of a common effect), and mixed.
Applications
http://excalibur.brc.uconn.edu/~baynet/researchApps.html
- Medical diagnosis, e.g., lymph-node diseases
- Fraud/uncollectible debt detection
- Troubleshooting of hardware/software systems
Neural Networks CS121 – Winter 2003
Function-Learning Formulation
- Goal function f
- Training set: (xi, f(xi)), i = 1,…,n
- Inductive inference: find a function h that fits the points well
- Issues: representation, incremental learning
Neural nets are one such representation.
Unit (Neuron)
A unit computes y = g(Σi=1,…,n wi xi) over its inputs x0, …, xn with weights wi, where g is an activation function, e.g. the sigmoid
  g(u) = 1/[1 + exp(-a u)]
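A single sigmoid unit can be sketched directly from the formula above; the particular weights and inputs below are made up for illustration:

```python
import math

def g(u, a=1.0):
    """Sigmoid activation g(u) = 1/[1 + exp(-a u)], as on the slide."""
    return 1.0 / (1.0 + math.exp(-a * u))

def unit(weights, inputs):
    """y = g(sum_i w_i x_i); inputs may include a fixed bias input x0 = 1."""
    return g(sum(w * x for w, x in zip(weights, inputs)))

# Illustrative weights/inputs: weighted sum is 0.5 - 0.3 + 0.8 = 1.0
print(round(unit([0.5, -1.0, 2.0], [1.0, 0.3, 0.4]), 3))  # sigmoid(1.0) ≈ 0.731
```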
Particular Case: Perceptron
A perceptron is a unit y = g(Σi wi xi) whose activation function g is a threshold (step) function, so the unit separates its inputs into two classes (+ and −) by a linear boundary.
Particular Case: Perceptron
Given a new input point (the “?” on the slide), the perceptron assigns it to the + or − class according to which side of the linear boundary it falls on.
Neural Network
A neural network is a network of interconnected neurons, each computing y = g(Σi wi xi) over its own inputs. Networks may be acyclic (feed-forward) or recurrent.
Two-Layer Feed-Forward Neural Network
Inputs feed into a hidden layer of units, whose outputs in turn feed into the output unit(s).
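A minimal forward pass for such a two-layer network might look like this; the layer sizes and weight values are made up for illustration:

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def forward(x, W_hidden, W_out):
    """Two-layer feed-forward pass: inputs -> hidden layer -> single output.
    W_hidden holds one weight vector per hidden unit; W_out weights the
    hidden activations into the output unit."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in W_hidden]
    return sigmoid(sum(w * h for w, h in zip(W_out, hidden)))

# Illustrative 2-input, 2-hidden-unit, 1-output network
y = forward([1.0, 0.5], W_hidden=[[0.1, -0.2], [0.4, 0.3]], W_out=[0.7, -0.5])
print(0.0 < y < 1.0)  # sigmoid output always lies in (0, 1)
```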
Backpropagation (Principle)
- New example: Yk = f(xk)
- Error function: E(w) = ||yk – Yk||²
- Weight update: wij(k) = wij(k-1) – ε ∂E/∂wij
- Backpropagation: update the weights of the inputs to the last layer, then the weights of the inputs to the previous layer, etc.
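A toy version of this procedure for a 2-2-1 sigmoid network, assuming the squared-error convention E = ||y – Y||²/2 and an illustrative learning rate; the network sizes, weights, and training example are made up:

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def backprop_step(x, target, W_h, W_o, eps=0.5):
    """One gradient step on one example for a 2-2-1 sigmoid network."""
    # Forward pass
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in W_h]
    y = sigmoid(sum(w * hi for w, hi in zip(W_o, h)))
    # Output-layer delta: dE/du_out for E = (y - target)^2 / 2
    d_out = (y - target) * y * (1 - y)
    # Hidden deltas via the chain rule, using the pre-update output weights
    d_hid = [d_out * W_o[i] * h[i] * (1 - h[i]) for i in range(len(h))]
    # Gradient-descent updates, last layer first (as the slide describes)
    W_o = [W_o[i] - eps * d_out * h[i] for i in range(len(W_o))]
    W_h = [[W_h[i][j] - eps * d_hid[i] * x[j] for j in range(len(x))]
           for i in range(len(W_h))]
    return W_h, W_o, (y - target) ** 2 / 2

W_h, W_o = [[0.1, -0.2], [0.4, 0.3]], [0.7, -0.5]
err = None
for _ in range(1000):
    W_h, W_o, err = backprop_step([1.0, 0.5], 1.0, W_h, W_o)
print(round(err, 4))  # error shrinks toward 0 as the steps repeat
```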
Issues
How to choose the size and structure of a network?
- If the network is too large, there is a risk of over-fitting (data caching)
- If the network is too small, the representation may not be rich enough
Role of representation: e.g., learning the concept of an odd number
What is AI?
A discipline that systematizes and automates intellectual tasks to create machines that: act like humans, act rationally, think like humans, think rationally.
What Have We Learned?
- A collection of useful methods
- Connections between fields
- The relation between high-level (e.g., logic) and low-level (e.g., neural networks) representations
- The impact of hardware
- What is intelligence? Our techniques are better than our understanding