1
CHAPTER 3: BAYESIAN DECISION THEORY
2
Making Decisions Under Uncertainty
Based on E. Alpaydın (2004), Introduction to Machine Learning, © The MIT Press (V1.1)
Use Bayes' rule to calculate the probabilities of the classes
Make a rational decision among multiple actions so as to minimize expected risk
Learn association rules from data
3
Unobservable Variables
Tossing a coin is a completely random process: we cannot predict the outcome
We can only talk about the probability that the outcome of the next toss will be heads or tails
If we had access to extra knowledge (the exact composition of the coin, its initial position, the force applied, etc.), the exact outcome of the toss could be predicted
4
Unobservable Variables
The unobservable variables are the extra knowledge that we do not have access to
Coin toss: the only observable variable is the outcome of the toss
x = f(z), where z denotes the unobservables, x the observables, and f is the deterministic function mapping one to the other
5
Bernoulli Random Variable
The result of tossing a coin is in {Heads, Tails}
Define a random variable X ∈ {1, 0}, with p_o the probability of heads
P(X = 1) = p_o and P(X = 0) = 1 − P(X = 1) = 1 − p_o
Suppose we are asked to predict the next toss
If we know p_o, we predict heads when p_o > 1/2 and tails otherwise
Choosing the more probable case minimizes the probability of error, which is 1 − p_o when we predict heads
6
Estimation
What if we do not know P(X)?
We want to estimate it from given data (a sample) X — this is the realm of statistics
The sample X is generated from the probability distribution of the observables, x^t
We want to build an approximator p̂(x) to P(x) using the sample X
In the coin toss example, the sample is the set of outcomes of the past N tosses, and the distribution is characterized by the single parameter p_o
7
Parameter Estimation
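The formula on this slide did not survive extraction. As a sketch of what the estimate looks like for the coin example — the usual maximum likelihood estimate p̂_o = (number of heads) / N — assuming a sample stored as a list of 0/1 outcomes:

```python
# Estimate p_o (probability of heads) from a sample of past tosses.
# Hypothetical sample: 1 = heads, 0 = tails.
sample = [1, 0, 1, 1, 0, 1, 0, 1]

# Maximum likelihood estimate: fraction of heads in the sample.
p_hat = sum(sample) / len(sample)
print(f"p_hat = {p_hat:.2f}")        # 0.62 here

# Prediction rule from the previous slide: predict heads if p_hat > 1/2.
prediction = "heads" if p_hat > 0.5 else "tails"
print("predict:", prediction)
```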
8
Classification
Credit scoring: two classes, high-risk and low-risk customers
We decide using observable information: income and savings
We have reason to believe that these two variables give us an idea of the credibility of a customer
Represent them by two random variables X_1 and X_2
We cannot observe a customer's intentions and moral codes
We can observe the credibility (risk class) of past customers
Credibility is a Bernoulli random variable C conditioned on the observables X = [X_1, X_2]^T
Assume we know P(C | X_1, X_2)
9
Classification
Assume we know P(C | X_1, X_2)
For a new application with X_1 = x_1 and X_2 = x_2, we classify by comparing P(C = 1 | x_1, x_2) with P(C = 0 | x_1, x_2) and choosing the larger
10
Classification
Similar to the coin toss, but C is conditioned on two observable variables, x = [x_1, x_2]^T
The problem: calculate P(C | x)
Use Bayes' rule
11
Conditional Probability
The probability of A (the point falls inside A) given that B happens (the point is inside B):
P(A | B) = P(A ∩ B) / P(B)
12
Bayes' Rule
P(A | B) = P(A ∩ B) / P(B)
P(B | A) = P(A ∩ B) / P(A)  ⇒  P(A ∩ B) = P(B | A) P(A)
Therefore P(A | B) = P(B | A) P(A) / P(B)
13
Bayes' Rule
posterior = likelihood × prior / evidence, i.e. P(C | x) = p(x | C) P(C) / p(x)
Prior P(C = 1): the probability that a customer is high-risk, regardless of x
It is the knowledge we have about the value of C before looking at the observables x
14
Bayes' Rule
posterior = likelihood × prior / evidence
Likelihood p(x | C): the probability that an event belonging to class C has the observable value x
p(x_1, x_2 | C = 1) is the probability that a high-risk customer has X_1 = x_1 and X_2 = x_2
15
Bayes' Rule
posterior = likelihood × prior / evidence
Evidence p(x): the probability that the observation x is seen at all, regardless of whether the example is positive or negative
p(x) = p(x | C = 1) P(C = 1) + p(x | C = 0) P(C = 0)
16
Bayes' Rule
P(C | x) = p(x | C) P(C) / p(x)   (posterior = likelihood × prior / evidence)
17
Bayes' Rule for Classification
Assume we know the prior, the evidence, and the likelihood
(We will learn how to estimate them from data later)
Plug them into Bayes' formula to obtain P(C | x)
Choose C = 1 if P(C = 1 | x) > P(C = 0 | x), and C = 0 otherwise
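A minimal numerical sketch of this two-class rule; the prior and likelihood values below are hypothetical, not taken from the slides:

```python
# Two-class Bayes classification for one observation x (credit-scoring style).
# All numbers are hypothetical.
prior = {1: 0.3, 0: 0.7}            # P(C=1), P(C=0)
likelihood = {1: 0.08, 0: 0.02}     # p(x | C=1), p(x | C=0) at the observed x

# Evidence: p(x) = sum_k p(x | C=k) P(C=k)
evidence = sum(likelihood[c] * prior[c] for c in prior)

# Posterior via Bayes' rule.
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}
print(posterior)                    # {1: ~0.632, 0: ~0.368}

# Decision rule: choose the class with the larger posterior.
decision = max(posterior, key=posterior.get)
print("choose C =", decision)       # C = 1
```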
18
Bayes' Rule: K > 2 Classes
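The formulas on this slide were lost in extraction; as a reconstruction, the standard K-class form of Bayes' rule is:
P(C_i | x) = p(x | C_i) P(C_i) / p(x) = p(x | C_i) P(C_i) / Σ_{k=1..K} p(x | C_k) P(C_k)
with P(C_i) ≥ 0 and Σ_{i=1..K} P(C_i) = 1
Choose C_i if P(C_i | x) = max_k P(C_k | x)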
19
Bayes' Rule
When deciding for a specific input x, the evidence p(x) is the same for all classes
So we do not need it when comparing posteriors: it is enough to compare p(x | C_i) P(C_i)
20
Losses and Risks
Decisions and errors are not equally good or costly
Actions: α_i is the action of assigning the input to class C_i
λ_ik is the loss incurred by taking action α_i when the true state is C_k
Expected risk (Duda and Hart, 1973): R(α_i | x) = Σ_k λ_ik P(C_k | x); choose the action with minimum risk
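A small sketch of the expected-risk computation, with a hypothetical two-class loss matrix and posteriors (the λ values are made up for illustration):

```python
# Expected risk R(a_i | x) = sum_k lambda_ik * P(C_k | x) for each action a_i.
# Hypothetical loss matrix: rows = actions (assign to C_1, assign to C_2),
# columns = true classes. Misclassifying a truly high-risk customer costs more.
loss = [[0.0, 10.0],   # action a_1: correct if C_1, loss 10 if truly C_2
        [1.0,  0.0]]   # action a_2: loss 1 if truly C_1, correct if C_2

posterior = [0.8, 0.2]  # hypothetical P(C_1 | x), P(C_2 | x)

risk = [sum(l_ik * p_k for l_ik, p_k in zip(row, posterior)) for row in loss]
print(risk)                                      # [2.0, 0.8]

best_action = min(range(len(risk)), key=risk.__getitem__)
print("take action a_%d" % (best_action + 1))    # a_2, despite P(C_1|x) being larger
```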
21
Losses and Risks: 0/1 Loss
With 0/1 loss, λ_ik = 0 if i = k and 1 otherwise, so R(α_i | x) = Σ_{k≠i} P(C_k | x) = 1 − P(C_i | x)
For minimum risk, choose the most probable class
22
Losses and Risks: Reject
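The formulas on this slide did not survive extraction; as a reconstruction, the standard reject-option rule (a reject action with fixed loss 0 < λ < 1, otherwise 0/1 loss) is:
λ_ik = 0 if i = k, λ for the reject action, and 1 otherwise
R(α_i | x) = 1 − P(C_i | x) and R(reject | x) = λ
Choose C_i if P(C_i | x) > P(C_k | x) for all k ≠ i and P(C_i | x) > 1 − λ; otherwise reject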
23
Discriminant Functions
Define a function g_i(x) for each class: the "goodness" of selecting class C_i given the observables x
Choose C_i if g_i(x) = max_k g_k(x)
The maximum discriminant corresponds to the minimum conditional risk
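The slide's formulas were lost in extraction; as a reconstruction, typical choices for the discriminant are:
g_i(x) = −R(α_i | x)
With 0/1 loss: g_i(x) = P(C_i | x), or equivalently g_i(x) = p(x | C_i) P(C_i), since the evidence p(x) is common to all classes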
24
Decision Regions
The discriminants divide the feature space into K decision regions R_1, ..., R_K, where R_i = {x | g_i(x) = max_k g_k(x)}
The regions are separated by decision boundaries, where the largest discriminants tie
25
K = 2 Classes
Define a single discriminant g(x) = g_1(x) − g_2(x): choose C_1 if g(x) > 0 and C_2 otherwise
Log odds: log [ P(C_1 | x) / P(C_2 | x) ], which is positive exactly when C_1 is the more probable class
26
Correlation and Causality
27
Interaction Between Variables
We have many variables that we want to represent and analyze
We care about the structure of the dependencies among them, expressed through conditional and marginal probabilities
Use a graphical representation: nodes are events (random variables) and arcs are dependencies
The numbers attached to nodes and arcs are the conditional probabilities that parameterize the model
28
Causes and Bayes' Rule
A two-node network: rain (cause) → wet grass (effect); the causal direction runs from rain to wet grass, the diagnostic direction runs the other way
Diagnostic inference: knowing that the grass is wet, what is the probability that rain is the cause?
P(R | W) = P(W | R) P(R) / P(W)
29
Causal vs Diagnostic Inference
Causal inference: if the sprinkler is on, what is the probability that the grass is wet?
P(W | S) = P(W | R, S) P(R | S) + P(W | ~R, S) P(~R | S)
         = P(W | R, S) P(R) + P(W | ~R, S) P(~R)     (R and S are independent)
         = 0.95 × 0.4 + 0.9 × 0.6 = 0.92
30
Causal vs Diagnostic Inference
Diagnostic inference: if the grass is wet, what is the probability that the sprinkler is on?
P(S | W) = 0.35 > 0.2 = P(S), so observing wet grass raises the probability that the sprinkler is on
P(S | R, W) = 0.21
Explaining away: knowing that it has rained decreases the probability that the sprinkler is on
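A sketch that reproduces the 0.35 and 0.21 figures. The slides only give P(W|R,S) = 0.95, P(W|~R,S) = 0.9, P(R) = 0.4 and P(S) = 0.2; the remaining two table entries below (P(W|R,~S) = 0.90 and P(W|~R,~S) = 0.10) are assumed values consistent with those results:

```python
# Rain (R) and Sprinkler (S) are parents of Wet grass (W).
P_R, P_S = 0.4, 0.2
# P(W | R, S); the (R, ~S) and (~R, ~S) entries are assumed (see lead-in).
P_W = {(True, True): 0.95, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.10}

def p_joint(r, s, w):
    """Joint probability P(R=r, S=s, W=w) via the factorization P(R) P(S) P(W|R,S)."""
    pr = P_R if r else 1 - P_R
    ps = P_S if s else 1 - P_S
    pw = P_W[(r, s)] if w else 1 - P_W[(r, s)]
    return pr * ps * pw

# Diagnostic inference P(S | W): sum the joint over R, normalize by P(W).
num = sum(p_joint(r, True, True) for r in (True, False))
den = sum(p_joint(r, s, True) for r in (True, False) for s in (True, False))
print(round(num / den, 2))           # 0.35

# Explaining away: P(S | R, W) is lower than P(S | W).
num_rw = p_joint(True, True, True)
den_rw = sum(p_joint(True, s, True) for s in (True, False))
print(round(num_rw / den_rw, 2))     # 0.21
```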
31
Bayesian Networks: Causes
32
Bayesian Nets: Local Structure
We only need to specify the interactions between neighboring events, i.e. each node's conditional probability given its parents
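The figure and formula on this slide were lost; as a reconstruction, the factorization that "local structure" refers to is:
P(X_1, ..., X_d) = Π_{i=1..d} P(X_i | parents(X_i))
so the joint is specified by small per-node conditional probability tables rather than by one table over all variables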
33
Bayesian Networks: Classification
The generative model is the arc C → x, parameterized by the prior P(C) and the likelihood p(x | C)
Bayes' rule inverts the arc to give the diagnostic direction: P(C | x) = p(x | C) P(C) / p(x)
34
Naive Bayes' Classifier
Given C, the inputs x_j are assumed independent:
p(x | C) = p(x_1 | C) p(x_2 | C) ... p(x_d | C)
35
Naïve Bayes' Classifier
Why "naïve"? Because it ignores dependencies among the inputs
Example: a dependency between "baby food" and "diapers" purchases
Such dependencies can be explained by hidden variables (e.g. "has children")
To model them, insert nodes and arcs for the hidden variables and estimate their values from data
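A minimal sketch of a naive Bayes' decision with binary features; all probability values below are hypothetical:

```python
# Naive Bayes: p(x | C) = product over features of p(x_j | C).
# Hypothetical per-feature Bernoulli likelihoods p(x_j = 1 | C) and priors.
prior = {1: 0.3, 0: 0.7}
p_feature_given_class = {1: [0.8, 0.6, 0.1],   # P(x_j = 1 | C = 1) for j = 1..3
                         0: [0.2, 0.3, 0.5]}   # P(x_j = 1 | C = 0)

def posterior_scores(x):
    """Return unnormalized posteriors P(C) * prod_j p(x_j | C) for each class."""
    scores = {}
    for c, probs in p_feature_given_class.items():
        lik = 1.0
        for xj, pj in zip(x, probs):
            lik *= pj if xj == 1 else 1 - pj
        scores[c] = prior[c] * lik
    return scores

x = [1, 1, 0]                        # a hypothetical input
scores = posterior_scores(x)
print(max(scores, key=scores.get))   # class with the larger score; prints 1 here
```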
36
Association Rules
An association rule is an implication of the form X → Y
Support(X → Y): P(X, Y), the fraction of transactions containing both X and Y
Confidence(X → Y): P(Y | X) = P(X, Y) / P(X)
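A small sketch computing support and confidence from a hypothetical transaction list:

```python
# Hypothetical market-basket transactions.
transactions = [
    {"beer", "chips"},
    {"milk", "bread"},
    {"milk", "chips", "bread"},
    {"beer", "milk"},
]

def support(x, y):
    """Support(X -> Y): fraction of transactions containing both X and Y."""
    return sum(1 for t in transactions if x in t and y in t) / len(transactions)

def confidence(x, y):
    """Confidence(X -> Y) = P(Y | X) = support(X, Y) / P(X)."""
    p_x = sum(1 for t in transactions if x in t) / len(transactions)
    return support(x, y) / p_x

print(support("chips", "beer"))      # 0.25: one of four transactions has both
print(confidence("chips", "beer"))   # 0.5: half of the chips buyers also bought beer
```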
37
Association Rules
Example: suppose only one customer bought chips, and that same customer was the only one to buy beer
Then the confidence P(chips | beer) = 1, but the support of the rule is tiny
Support indicates statistical significance: a high-confidence rule with tiny support may not be worth acting on