Presentation on theme: "Special Topics In Scientific Computing"— Presentation transcript:

1 Special Topics In Scientific Computing
Pattern Recognition & Data Mining, Lecture 2: Bayesian Decision Theory

2 References: Bishop, Section 1.5; Duda:

3 Decision Theory
Consider, for example, a medical diagnosis problem in which we have taken an X-ray image of a patient and wish to determine whether the patient has cancer or not. The input vector x is the set of pixel intensities in the image, and the output variable t represents the presence of cancer, denoted by class C1, or its absence, denoted by class C2 (class C1: t = 0, class C2: t = 1). The joint distribution p(x, t) gives us the most complete probabilistic description of the situation.

4 Minimizing the misclassification rate
Example: consider two classes C1 and C2, and let R1 and R2 be the decision regions assigned to C1 and C2 respectively. The probability of misclassification is
p(mistake) = p(x ∈ R1, C2) + p(x ∈ R2, C1) = ∫_{R1} p(x, C2) dx + ∫_{R2} p(x, C1) dx
A good decision rule should minimize p(mistake): we should assign x to C1 whenever p(x, C1) > p(x, C2).
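A minimal numerical sketch (my own illustration, not from the slides), assuming two hypothetical Gaussian class-conditional densities and equal priors; it approximates p(mistake) on a grid for the rule "assign x to C1 where p(x, C1) > p(x, C2)":

    import numpy as np
    from scipy.stats import norm

    # Hypothetical two-class problem: joint densities p(x, Ck) = p(x | Ck) P(Ck)
    prior = {"C1": 0.5, "C2": 0.5}                      # assumed priors
    lik = {"C1": norm(loc=0.0, scale=1.0),              # assumed class-conditional densities
           "C2": norm(loc=2.0, scale=1.0)}

    x = np.linspace(-6.0, 8.0, 2001)                    # grid over the feature space
    joint1 = lik["C1"].pdf(x) * prior["C1"]             # p(x, C1)
    joint2 = lik["C2"].pdf(x) * prior["C2"]             # p(x, C2)

    # Decision rule from the slide: assign x to C1 where p(x, C1) > p(x, C2)
    assign_c1 = joint1 > joint2

    # p(mistake) = integral over R1 of p(x, C2) + integral over R2 of p(x, C1)
    dx = x[1] - x[0]
    p_mistake = np.sum(joint2[assign_c1]) * dx + np.sum(joint1[~assign_c1]) * dx
    print(f"approximate misclassification probability: {p_mistake:.4f}")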

5
Since p(x, C1) = P(C1 | x) p(x), the optimal decision is equivalently: assign x to C1 if P(C1 | x) > P(C2 | x).

6

7 General Form: For the more general case of K classes, it is slightly easier to maximize the probability of being correct, which is given by p(correct) = Σ_k ∫_{Rk} p(x, Ck) dx. Optimal rule: assign x to class Ci with i = argmax_k p(x, Ck), or equivalently i = argmax_k P(Ck | x), for k = 1, …, K (a small sketch of this rule follows below).
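As an illustrative sketch (not from the slides), assuming the K posteriors P(Ck | x) at a point x are already available as an array, the optimal assignment is a simple argmax:

    import numpy as np

    def decide(posteriors):
        """Return the index of the class Ck with the largest posterior P(Ck | x)."""
        posteriors = np.asarray(posteriors)
        return int(np.argmax(posteriors))

    # Hypothetical posteriors P(Ck | x) for K = 3 classes at a single point x
    example_posteriors = [0.2, 0.5, 0.3]
    print("assign x to class C%d" % (decide(example_posteriors) + 1))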

8 Minimizing the expected loss
For many applications, our objective will be more complex than simply minimizing the number of misclassifications. Consider the medical diagnosis problem again: if a patient who does not have cancer is incorrectly diagnosed as having cancer, the consequences may be some patient distress plus the need for further investigations. Conversely, if a patient with cancer is diagnosed as healthy, the result may be premature death due to lack of treatment. The consequences of these two types of mistake can thus be dramatically different, and it would clearly be better to make fewer mistakes of the second kind, even at the expense of making more mistakes of the first kind.

9 Minimizing the expected loss: Loss Function
Introduce a loss matrix whose element L_kj is the loss incurred when a point with true class Ck is assigned to class Cj. The expected loss is
E[L] = Σ_k Σ_j ∫_{Rj} L_kj p(x, Ck) dx
Optimal decision: choose the regions Rj so as to minimize E[L], i.e. assign each x to the class Cj that minimizes Σ_k L_kj P(Ck | x).

10 Minimization of E[L], in the format of the Duda book:
Minimizing E[L] is achieved by minimizing the conditional risk R(αi | x) = Σ_j λ(αi | ωj) P(ωj | x) for each x, i = 1, …, k.
Optimal decision: assign x to Ck with k = argmin_i R(Ci | x), i = 1, …, K (a small numerical sketch follows below).
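A minimal sketch (my own numbers, not from the slides) of the conditional-risk rule, assuming a hypothetical 2x2 loss matrix lam[i, j] = λ(αi | ωj) and a vector of posteriors P(ωj | x):

    import numpy as np

    # Hypothetical loss matrix: lam[i, j] = loss for taking action a_i when the true class is w_j
    lam = np.array([[0.0, 10.0],    # action a_1 (decide w_1): no loss if w_1, large loss if w_2
                    [1.0,  0.0]])   # action a_2 (decide w_2): small loss if w_1, none if w_2

    posterior = np.array([0.3, 0.7])        # assumed P(w_1 | x), P(w_2 | x)

    # Conditional risk R(a_i | x) = sum_j lam[i, j] * P(w_j | x)
    risk = lam @ posterior
    best_action = int(np.argmin(risk))      # Bayes decision: take the action with minimum risk
    print("risks:", risk, "-> take action a_%d" % (best_action + 1))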

11 Two-category classification
α1: deciding ω1; α2: deciding ω2
λik = λ(αi | ωk): loss incurred for deciding αi when the true state of nature is ωk
Conditional risk:
R(α1 | x) = λ11 P(ω1 | x) + λ12 P(ω2 | x)
R(α2 | x) = λ21 P(ω1 | x) + λ22 P(ω2 | x)

12 Example: the Bayes decision rule is stated as: take action α1 ("decide ω1") if R(α1 | x) < R(α2 | x).
Substituting the conditional risks from the previous slide, this reads λ11 P(ω1 | x) + λ12 P(ω2 | x) < λ21 P(ω1 | x) + λ22 P(ω2 | x), i.e. (λ21 - λ11) P(ω1 | x) > (λ12 - λ22) P(ω2 | x); multiplying both sides by p(x) gives the equivalent rule: decide ω1 if
(λ21 - λ11) p(x | ω1) P(ω1) > (λ12 - λ22) p(x | ω2) P(ω2)
and decide ω2 otherwise (a numerical check follows below).
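A small numerical check (illustrative only, with made-up values for the losses, priors and likelihoods) that the direct risk comparison and the equivalent threshold rule give the same decision:

    import numpy as np

    # Hypothetical quantities at a single point x (my own numbers, not from the slides)
    lam = np.array([[0.0, 10.0],
                    [1.0,  0.0]])             # lam[i, j] = loss for action a_i when the class is w_j
    prior = np.array([0.6, 0.4])              # P(w_1), P(w_2)
    lik = np.array([0.5, 0.2])                # p(x | w_1), p(x | w_2)

    post = lik * prior / np.sum(lik * prior)  # P(w_j | x) by Bayes' rule

    risk = lam @ post                         # R(a_1 | x), R(a_2 | x)
    decide_w1_by_risk = risk[0] < risk[1]

    # Equivalent rule from the slide: (l21 - l11) p(x|w1) P(w1) > (l12 - l22) p(x|w2) P(w2)
    lhs = (lam[1, 0] - lam[0, 0]) * lik[0] * prior[0]
    rhs = (lam[0, 1] - lam[1, 1]) * lik[1] * prior[1]
    decide_w1_by_threshold = lhs > rhs

    print(decide_w1_by_risk, decide_w1_by_threshold)   # both rules agree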

13 Example

14 Reject option: reject x (make no class assignment) if max_k P(Ck | x) < t, where t is the rejection threshold.
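A small sketch (illustrative only) extending the argmax rule of slide 7 with the reject option; the threshold value is an arbitrary choice for the example:

    import numpy as np

    def decide_with_reject(posteriors, t=0.8):
        """Return the winning class index, or None if the largest posterior is below t."""
        posteriors = np.asarray(posteriors)
        k = int(np.argmax(posteriors))
        return k if posteriors[k] >= t else None     # None means "reject / defer the decision"

    print(decide_with_reject([0.55, 0.30, 0.15]))    # rejected -> None
    print(decide_with_reject([0.92, 0.05, 0.03]))    # confident -> 0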

15 Decision Approaches
Generative models: model the class-conditional densities p(x | Ck) and the priors P(Ck), then use Bayes' theorem to obtain the posteriors P(Ck | x).
Discriminative models: model the posterior P(Ck | x) directly; example: logistic regression.

16 Decision Approaches
Discriminant functions: find a function f(x) that maps each input x directly onto a class label, without computing posterior probabilities.

17 Optimal decision: assign x to C1 if P(C1 | x) > P(C2 | x), where
P(Ck): prior probability
p(x | Ck): class-conditional likelihood
P(Ck | x): posterior probability

18 Example: from the sea bass vs. salmon example to an "abstract" decision-making problem.
State of nature; a priori (prior) probability: the state of nature (which type of fish will be observed next) is unpredictable, so it is a random variable.
If the catch of salmon and sea bass is equiprobable, then P(ω1) = P(ω2) (uniform priors), with P(ω1) + P(ω2) = 1 (exclusivity and exhaustivity).
The prior probability reflects our prior knowledge of how likely we are to observe a sea bass or a salmon; these probabilities may depend on the time of year or the fishing area!

19 Example: Bayes decision rule with only the prior information
Decide ω1 if P(ω1) > P(ω2), otherwise decide ω2. Error rate = min{P(ω1), P(ω2)}.
Suppose now we have a measurement, or feature, of the state of nature - say the fish's lightness value x. The class-conditional probability densities p(x | ω1) and p(x | ω2) describe the difference in lightness between the populations of sea bass and salmon.

20 Maximum likelihood decision rule
Assign input pattern x to class ω1 if p(x | ω1) > p(x | ω2), otherwise to ω2.
How does the feature x influence our attitude (the prior) concerning the true state of nature? This leads to the Bayes decision rule.

21 Posterior probability
Posterior probability, likelihood, evidence:
p(ωj, x) = P(ωj | x) p(x) = p(x | ωj) P(ωj)
Bayes formula: P(ωj | x) = p(x | ωj) P(ωj) / p(x), where p(x) = Σ_j p(x | ωj) P(ωj)
Posterior = (Likelihood × Prior) / Evidence
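A minimal numerical sketch (my own example values) of the Bayes formula for two states of nature:

    import numpy as np

    prior = np.array([0.7, 0.3])      # assumed priors P(w_1), P(w_2)
    lik = np.array([0.1, 0.4])        # assumed likelihoods p(x | w_1), p(x | w_2) at some x

    evidence = np.sum(lik * prior)            # p(x) = sum_j p(x | w_j) P(w_j)
    posterior = lik * prior / evidence        # P(w_j | x) = p(x | w_j) P(w_j) / p(x)

    print("evidence p(x) =", evidence)        # 0.19
    print("posteriors    =", posterior)       # approx [0.368, 0.632]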

22 Optimal Bayes decision rule
Decide ω1 if P(ω1 | x) > P(ω2 | x); otherwise decide ω2.
Special cases:
(i) P(ω1) = P(ω2): decide ω1 if p(x | ω1) > p(x | ω2), otherwise ω2
(ii) p(x | ω1) = p(x | ω2): decide ω1 if P(ω1) > P(ω2), otherwise ω2

