1
CHAPTER 5 Probability Theory (continued) Introduction to Bayesian Networks
2
Joint Probability
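For reference, the standard definition: a joint distribution over variables X and Y assigns a probability to every combination of values, and those probabilities sum to one.

    P(x, y) \equiv P(X = x, Y = y), \qquad \sum_{x, y} P(x, y) = 1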
3
Marginal Probability
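In standard form, a marginal is obtained by summing the joint over the other variables:

    P(x) = \sum_{y} P(x, y)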
4
Conditional Probability
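The standard definition, which the later slides build on:

    P(x \mid y) = \frac{P(x, y)}{P(y)}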
5
The Chain Rule I
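In its two-variable form, the chain rule just rearranges the definition of conditional probability:

    P(x, y) = P(x)\, P(y \mid x)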
6
Bayes’ Rule
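The standard statement, obtained by applying the chain rule in both orders and dividing:

    P(x \mid y) = \frac{P(y \mid x)\, P(x)}{P(y)}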
7
More Bayes’ Rule
8
The Chain Rule II
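The general form, valid for any ordering of the variables:

    P(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} P(x_i \mid x_1, \dots, x_{i-1})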
9
Independence
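In standard form: X and Y are independent when the joint factors into the marginals for every pair of values (equivalently, conditioning on y does not change the distribution of x):

    X \perp Y \iff \forall x, y:\; P(x, y) = P(x)\, P(y)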
10
Example: Independence
11
Example: Independence?
12
Conditional Independence
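The standard definition, and the key ingredient for the Bayes' nets later in the deck:

    X \perp Y \mid Z \iff \forall x, y, z:\; P(x, y \mid z) = P(x \mid z)\, P(y \mid z)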
14
The Chain Rule III
15
Expectations
16
Expectations
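For reference, the expectation of a function of a discrete random variable, in standard form:

    E[f(X)] = \sum_{x} P(x)\, f(x)

For example, the expected value of a fair six-sided die is (1 + 2 + ... + 6) / 6 = 3.5.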
17
Estimation
18
Estimation
Problems with maximum likelihood estimates:
- If I flip a coin once, and it's heads, what's the estimate for P(heads)?
- What if I flip it 50 times with 27 heads?
- What if I flip it 10M times with 8M heads?
Basic idea:
- We have some prior expectation about parameters (here, the probability of heads)
- Given little evidence, we should skew toward the prior
- Given lots of evidence, we should listen to the data
How can we accomplish this? Stay tuned!
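A minimal sketch of the contrast in Python (the smoothing pseudo-counts below are an illustrative choice, not from the slides): the maximum likelihood estimate is the raw empirical fraction, while an estimate with pseudo-counts interpolates between a 0.5 prior and the data.

    def mle(heads, flips):
        # Maximum likelihood: the raw empirical fraction of heads.
        return heads / flips

    def smoothed(heads, flips, prior_heads=50, prior_flips=100):
        # Add pseudo-counts encoding a prior belief that P(heads) is near 0.5.
        return (heads + prior_heads) / (flips + prior_flips)

    for heads, flips in [(1, 1), (27, 50), (8_000_000, 10_000_000)]:
        print(flips, mle(heads, flips), round(smoothed(heads, flips), 4))

With one flip, the smoothed estimate stays near the prior (51/101 ≈ 0.505); with 10M flips it is pulled almost entirely to the data (≈ 0.8).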
19
Lewis Carroll's Pillow Problem
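The pillow problem usually quoted in probability courses, and presumably the one intended here, is Carroll's Problem 5: a bag contains one counter, equally likely white or black; a white counter is added, the bag is shaken, and a counter drawn out proves to be white. What is the chance the counter left in the bag is white? A quick Bayes' rule check in Python:

    # Hypotheses about the original counter, with equal priors.
    prior = {"white": 0.5, "black": 0.5}
    # Likelihood of drawing white: certain if the bag held two whites,
    # a coin flip if it held one white and one black.
    likelihood = {"white": 1.0, "black": 0.5}

    evidence = sum(prior[h] * likelihood[h] for h in prior)   # P(drew white) = 0.75
    posterior = prior["white"] * likelihood["white"] / evidence
    print(posterior)   # 2/3: the remaining counter is white with probability 2/3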
21
Bayesian Networks: Big Picture
Big problems with joint probability distributions:
- Unless there are only a few variables, the distribution is too big to represent explicitly (Why?)
- Hard to estimate anything empirically about more than a few variables at a time (Why?)
- Hard to compute answers to queries of the form P(y | a) (Why?)
Bayesian networks are a technique for describing complex joint distributions (models) using a bunch of simple, local distributions:
- A Bayes net describes how variables interact locally
- Local interactions chain together to give global, indirect interactions
- For about 10 min, we'll be very vague about how these interactions are specified
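A toy sketch of the "simple, local distributions" idea, assuming a hypothetical two-variable net Rain -> Traffic with made-up numbers: each variable stores only a small local table, and the joint is recovered by chaining the tables together, never stored explicitly.

    # P(R): marginal distribution of rain.
    p_rain = {True: 0.1, False: 0.9}
    # P(T | R): one small conditional table per value of the parent.
    p_traffic = {True: {True: 0.8, False: 0.2},
                 False: {True: 0.1, False: 0.9}}

    def joint(rain, traffic):
        # Local tables chain together: P(r, t) = P(r) * P(t | r).
        return p_rain[rain] * p_traffic[rain][traffic]

    total = sum(joint(r, t) for r in (True, False) for t in (True, False))
    assert abs(total - 1.0) < 1e-9   # the local tables define a valid joint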
22
Graphical Model Notation
23
Example: Coin Flips
24
Example: Traffic
25
Example: Traffic II
26
Example: Alarm Network
27
Bayesian Network Semantics
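The semantics in standard form: a Bayes' net over X_1, ..., X_n encodes the joint distribution as a product of local conditional distributions, one per node given its parents.

    P(x_1, \dots, x_n) = \prod_{i=1}^{n} P(x_i \mid \mathrm{parents}(X_i))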
28
Example: Alarm Network
29
Size of a Bayes' Net
How big is a joint distribution over N Boolean variables? 2^N
How big is a Bayes net if each node has k parents? N · 2^k
Both give you the power to calculate P(X1, X2, ..., Xn)
Bayesian networks = huge space savings!
Also easier to elicit local CPTs
Also turns out to be faster to answer queries (future class)
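The arithmetic on this slide, worked for one illustrative size (the values of N and k are my choices):

    N, k = 30, 3
    full_joint = 2 ** N        # one entry per assignment of 30 Booleans: ~10^9
    bayes_net = N * 2 ** k     # one 2^k-row CPT per node: 240
    print(full_joint, bayes_net)   # 1073741824 vs 240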
30
Building the (Entire) Joint
31
Example: Traffic
32
Example: Reverse Traffic
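A sketch of the point of these two traffic slides, with hypothetical numbers: the nets Rain -> Traffic and Traffic -> Rain can encode exactly the same joint distribution, just factored in opposite orders.

    # A made-up joint over (rain, traffic).
    joint = {(True, True): 0.08, (True, False): 0.02,
             (False, True): 0.09, (False, False): 0.81}

    def p_rain(r):
        return sum(v for (ri, _), v in joint.items() if ri == r)

    def p_traffic(t):
        return sum(v for (_, ti), v in joint.items() if ti == t)

    for (r, t), p in joint.items():
        forward = p_rain(r) * (p / p_rain(r))        # P(r) * P(t | r)
        reverse = p_traffic(t) * (p / p_traffic(t))  # P(t) * P(r | t)
        assert abs(forward - p) < 1e-12 and abs(reverse - p) < 1e-12

Here the CPT entries are read off the joint directly, which is why both factorizations recover it exactly; fill the two nets with unrelated numbers and they would encode different joints.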
33
Causality?
When Bayes' nets reflect the true causal patterns:
- Often simpler (nodes have fewer parents)
- Often easier to think about
- Often easier to elicit from experts
BNs need not actually be causal:
- Sometimes no causal net exists over the domain
- E.g. consider the variables Traffic and RoofDrips
- End up with arrows that reflect correlation, not causation
What do the arrows really mean?
- Topology may happen to encode causal structure
- Topology really encodes conditional independencies
34
Creating Bayes' Nets
So far, we've talked about how any fixed Bayes' net encodes a joint distribution
Next: how to represent a fixed distribution as a Bayes' net
- Key ingredient: conditional independence
- The exercise we did in "causal" assembly of BNs was a kind of intuitive use of conditional independence
- Now we have to formalize the process
After that: how to answer queries (inference)
35
Conditional Independence