1 Bayesian Decision Theory Shyh-Kang Jeng Department of Electrical Engineering/ Graduate Institute of Communication/ Graduate Institute of Networking and Multimedia, National Taiwan University
2 Basic Assumptions
– The decision problem is posed in probabilistic terms
– All of the relevant probability values are known
3 State of Nature
– State of nature ω: the category to which a pattern belongs
– A priori probability (prior): P(ω1), P(ω2), with P(ω1) + P(ω2) = 1
– Decision rule to judge just one fish: decide ω1 if P(ω1) > P(ω2); otherwise decide ω2
4 Class-Conditional Probability Density
5 Bayes Formula
P(ωj|x) = p(x|ωj) P(ωj) / p(x), where p(x) = Σj p(x|ωj) P(ωj)
– Informally: posterior = likelihood × prior / evidence
6 Posterior Probabilities
7 Bayes Decision Rule
– Probability of error: P(error|x) = P(ω1|x) if we decide ω2, and P(ω2|x) if we decide ω1
– Bayes decision rule: decide ω1 if P(ω1|x) > P(ω2|x), otherwise decide ω2; then P(error|x) = min [P(ω1|x), P(ω2|x)]
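The rule is easy to check numerically. Below is a minimal Python sketch (not from the slides) that computes the two posteriors for an observation and decides the class with the larger one; the Gaussian class-conditional densities, priors, and test point are illustrative assumptions.

    # Two-category Bayes decision for one observation x.
    # All parameters below are illustrative, not from the slides.
    from scipy.stats import norm

    priors = [0.6, 0.4]                            # P(w1), P(w2)
    densities = [norm(0.0, 1.0), norm(2.0, 1.0)]   # p(x|w1), p(x|w2)

    def bayes_decide(x):
        joint = [d.pdf(x) * p for d, p in zip(densities, priors)]  # p(x|wj) P(wj)
        evidence = sum(joint)                                      # p(x)
        posteriors = [j / evidence for j in joint]                 # P(wj|x)
        decision = max(range(2), key=lambda j: posteriors[j])
        p_error = min(posteriors)     # P(error|x) for two categories
        return decision, posteriors, p_error

    print(bayes_decide(1.2))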
8 Bayes Decision Theory (1/3)
– Categories: ω1, …, ωc
– Actions: α1, …, αa
– Loss function: λ(αi|ωj), the loss incurred for taking action αi when the state of nature is ωj
– Feature vector: x, a d-component random vector
9 Bayes Decision Theory (2/3)
– Bayes formula: P(ωj|x) = p(x|ωj) P(ωj) / p(x), with p(x) = Σj p(x|ωj) P(ωj)
– Conditional risk: R(αi|x) = Σj λ(αi|ωj) P(ωj|x)
10 Bayes Decision Theory (3/3)
– Decision function α(x) assumes one of the values α1, …, αa
– Overall risk: R = ∫ R(α(x)|x) p(x) dx
– Bayes decision rule: compute the conditional risk R(αi|x) for every action, then select the action for which R(αi|x) is minimum
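As a quick illustration of the general rule, the toy sketch below (my own example, not slide material) evaluates R(αi|x) for an explicit loss matrix and picks the minimizing action; the loss values and posteriors are made up.

    import numpy as np

    # Conditional risk R(a_i|x) = sum_j lambda(a_i|w_j) P(w_j|x); pick argmin.
    loss = np.array([[0.0, 2.0],    # lambda(a1|w1), lambda(a1|w2)  (illustrative)
                     [1.0, 0.0]])   # lambda(a2|w1), lambda(a2|w2)
    posteriors = np.array([0.3, 0.7])   # P(w1|x), P(w2|x) for some observed x

    risks = loss @ posteriors            # R(a_i|x) for each action
    best_action = int(np.argmin(risks))  # Bayes action for this x
    print(risks, best_action)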
11 Two-Category Classification
– Conditional risks (writing λij = λ(αi|ωj)): R(α1|x) = λ11 P(ω1|x) + λ12 P(ω2|x) and R(α2|x) = λ21 P(ω1|x) + λ22 P(ω2|x)
– Decision rule: decide ω1 if (λ21 − λ11) P(ω1|x) > (λ12 − λ22) P(ω2|x)
– Likelihood ratio form: decide ω1 if p(x|ω1) / p(x|ω2) > [(λ12 − λ22) / (λ21 − λ11)] · [P(ω2) / P(ω1)]
12 Minimum-Error-Rate Classification
– If action αi is taken and the true state of nature is ωj, the decision is correct if i = j and in error if i ≠ j
– Error rate (the probability of error) is to be minimized
– Symmetrical or zero-one loss function: λ(αi|ωj) = 0 if i = j, and 1 if i ≠ j
– Conditional risk: R(αi|x) = Σj≠i P(ωj|x) = 1 − P(ωi|x)
13 Minimum-Error-Rate Classification
14 Minimax Criterion
– To perform well over a range of prior probabilities
– Minimize the maximum possible overall risk, so that the worst risk for any value of the priors is as small as possible
15 Minimax Risk
16 Searching for the Minimax Boundary
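One way to make the search concrete: under zero-one loss the minimax boundary equalizes the two conditional error probabilities, so the risk no longer depends on the prior. The sketch below finds that boundary numerically for two illustrative one-dimensional Gaussians (my own example, not the slides' figure).

    from scipy.stats import norm
    from scipy.optimize import brentq

    p1 = norm(0.0, 1.0)   # p(x|w1): decide w1 for x < x*   (illustrative)
    p2 = norm(2.0, 1.0)   # p(x|w2): decide w2 for x > x*

    # Error on w1 is P(x > x*|w1); error on w2 is P(x < x*|w2).
    g = lambda x: p1.sf(x) - p2.cdf(x)   # zero where the two errors match
    x_star = brentq(g, -10.0, 10.0)
    print(x_star, p1.sf(x_star))         # minimax boundary and common error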
17 Neyman-Pearson Criterion
– Minimize the overall risk subject to a constraint
– Example: minimize the total risk subject to a fixed constraint on one class of error, e.g., ∫ R(α1|x) dx = constant
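For intuition, here is a small sketch of a Neyman-Pearson-style test: fix the false-alarm rate at a chosen level and take the corresponding likelihood-ratio threshold, which for two equal-variance Gaussians reduces to a simple cutoff on x. The densities and level are assumptions of mine.

    from scipy.stats import norm

    alpha = 0.05                              # allowed error rate on w1 (assumed)
    p1, p2 = norm(0.0, 1.0), norm(2.0, 1.0)   # p(x|w1), p(x|w2)

    x_star = p1.isf(alpha)           # cutoff with P(x > x* | w1) = alpha
    power = p2.sf(x_star)            # resulting hit rate P(x > x* | w2)
    print(x_star, power)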
18 Discriminant Functions
– A classifier assigns x to class ωi if gi(x) > gj(x) for all j ≠ i, where the gi(x), i = 1, …, c, are called discriminant functions
– A discriminant function for a Bayes classifier: gi(x) = −R(αi|x)
– Two discriminant functions for minimum-error-rate classification: gi(x) = P(ωi|x), and gi(x) = ln p(x|ωi) + ln P(ωi)
19 Discriminant Functions
20 Two-Dimensional Two-Category Classifier
21 Dichotomizers
– Place a pattern in one of only two categories (cf. polychotomizers)
– More common to define a single discriminant function g(x) = g1(x) − g2(x), deciding ω1 if g(x) > 0
– Some particular forms: g(x) = P(ω1|x) − P(ω2|x), and g(x) = ln [p(x|ω1) / p(x|ω2)] + ln [P(ω1) / P(ω2)]
22 Univariate Normal PDF
p(x) = (1 / (√(2π) σ)) exp(−(x − μ)² / (2σ²)), with mean μ and variance σ²
23 Distribution with Maximum Entropy and Central Limit Theorem
– Entropy for a discrete distribution: H = −Σi P(xi) log P(xi)
– Entropy for a continuous distribution: H = −∫ p(x) ln p(x) dx
– The Gaussian has the maximum entropy of all distributions with a given mean and variance
– Central limit theorem: the aggregate effect of the sum of a large number of small, independent random disturbances leads to a Gaussian distribution
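A quick numerical check of the discrete entropy formula, with an illustrative distribution of my choosing; the uniform distribution attains the maximum ln m for m outcomes.

    import numpy as np

    p = np.array([0.5, 0.25, 0.25])      # illustrative distribution
    H = -np.sum(p * np.log(p))           # entropy in nats
    print(H, np.log(3))                  # compare with the uniform maximum ln 3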
24 Multivariate Normal PDF
p(x) = (1 / ((2π)^(d/2) |Σ|^(1/2))) exp(−(1/2) (x − μ)ᵀ Σ⁻¹ (x − μ))
– μ: d-component mean vector
– Σ: d-by-d covariance matrix
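The density is straightforward to evaluate directly; a small sketch with illustrative parameters:

    import numpy as np

    # p(x) = (2 pi)^(-d/2) |Sigma|^(-1/2) exp(-1/2 (x-mu)^T Sigma^{-1} (x-mu))
    def mvn_pdf(x, mu, sigma):
        d = len(mu)
        diff = x - mu
        quad = diff @ np.linalg.solve(sigma, diff)     # Mahalanobis quadratic
        norm_const = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(sigma))
        return np.exp(-0.5 * quad) / norm_const

    mu = np.array([0.0, 0.0])                          # illustrative mean
    sigma = np.array([[2.0, 0.5], [0.5, 1.0]])         # illustrative covariance
    print(mvn_pdf(np.array([1.0, -0.5]), mu, sigma))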
25 Linear Combination of Gaussian Random Variables
If x ~ N(μ, Σ) and y = Aᵀx, then y ~ N(Aᵀμ, AᵀΣA)
26 Whitening Transform
– Φ: matrix whose columns are the orthonormal eigenvectors of Σ
– Λ: diagonal matrix of the corresponding eigenvalues
– Whitening transform: Aw = Φ Λ^(−1/2), which gives Awᵀ Σ Aw = I
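A small numerical sketch of the transform (Σ is an illustrative covariance): after y = Awᵀx the covariance of y is the identity, which the last line verifies.

    import numpy as np

    sigma = np.array([[2.0, 0.5], [0.5, 1.0]])   # illustrative covariance
    eigvals, phi = np.linalg.eigh(sigma)         # Lambda (as a vector) and Phi
    A_w = phi @ np.diag(eigvals ** -0.5)         # whitening matrix Phi Lambda^(-1/2)

    print(np.round(A_w.T @ sigma @ A_w, 10))     # should be the identity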
27 Bivariate Gaussian PDF
28 Mahalanobis Distance
– Squared Mahalanobis distance: r² = (x − μ)ᵀ Σ⁻¹ (x − μ)
– Volume of the hyperellipsoids of constant Mahalanobis distance r: V = Vd |Σ|^(1/2) r^d, where Vd is the volume of a d-dimensional unit hypersphere
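Computed directly (same illustrative μ and Σ as the whitening sketch), using a linear solve rather than an explicit matrix inverse:

    import numpy as np

    def mahalanobis_sq(x, mu, sigma):
        diff = x - mu
        return diff @ np.linalg.solve(sigma, diff)   # (x-mu)^T Sigma^{-1} (x-mu)

    mu = np.array([0.0, 0.0])
    sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
    print(mahalanobis_sq(np.array([1.0, -0.5]), mu, sigma))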
29 Discriminant Functions for Normal Density
gi(x) = ln p(x|ωi) + ln P(ωi) = −(1/2) (x − μi)ᵀ Σi⁻¹ (x − μi) − (d/2) ln 2π − (1/2) ln |Σi| + ln P(ωi)
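Putting the discriminant to work, a small sketch of a two-class Gaussian classifier with made-up per-class parameters; the class with the largest gi(x) wins.

    import numpy as np

    def gaussian_discriminant(x, mu, sigma, prior):
        d = len(mu)
        diff = x - mu
        return (-0.5 * diff @ np.linalg.solve(sigma, diff)
                - 0.5 * d * np.log(2 * np.pi)
                - 0.5 * np.log(np.linalg.det(sigma))
                + np.log(prior))

    classes = [
        (np.array([0.0, 0.0]), np.eye(2), 0.5),        # mu_1, Sigma_1, P(w1)
        (np.array([2.0, 2.0]), 2.0 * np.eye(2), 0.5),  # mu_2, Sigma_2, P(w2)
    ]
    x = np.array([1.2, 0.8])
    scores = [gaussian_discriminant(x, m, s, p) for m, s, p in classes]
    print(int(np.argmax(scores)))    # index of the chosen class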
30 Case 1: Σi = σ²I
31 Decision Boundaries
32 Decision Boundaries when P(ωi) = P(ωj)
33 Decision Boundaries when P(ωi) and P(ωj) are unequal
34 Case 2: Σi = Σ
35 Decision Boundaries
36 Decision Boundaries
37 Case 3: Σi = arbitrary
38 Decision Boundaries for One-Dimensional Case
39 Decision Boundaries for Two-Dimensional Case
40 Decision Boundaries for Three-Dimensional Case (1/2)
41 Decision Boundaries for Three-Dimensional Case (2/2)
42 Decision Boundaries for Four Normal Distributions
43 Example: Decision Regions for Two-Dimensional Gaussian Data
44 Example: Decision Regions for Two-Dimensional Gaussian Data
45 Bayes Decision Compared with Other Decision Strategies
46 Multicategory Case
– Probability of being correct: P(correct) = Σi ∫ p(x|ωi) P(ωi) dx, the i-th integral taken over the region Ri where we decide ωi
– The Bayes classifier maximizes this probability by choosing the regions so that the integrand is maximal for all x
– No other partitioning can yield a smaller probability of error
47 Error Bounds for Normal Densities
– Full calculation of the error probability is difficult for the Gaussian case, especially in high dimensions, because of the discontinuous nature of the decision regions
– An upper bound on the error can be obtained for the two-category case by approximating the error integral analytically
48 Chernoff Bound
P(error) ≤ P(ω1)^β P(ω2)^(1−β) ∫ p(x|ω1)^β p(x|ω2)^(1−β) dx, for 0 ≤ β ≤ 1
– For Gaussian densities the integral equals e^(−k(β)), with k(β) = (β(1−β)/2) (μ2 − μ1)ᵀ [βΣ1 + (1−β)Σ2]⁻¹ (μ2 − μ1) + (1/2) ln (|βΣ1 + (1−β)Σ2| / (|Σ1|^β |Σ2|^(1−β)))
– Minimize k(β) over β (a one-dimensional search) to get the tightest bound
49 Bhattacharyya Bound
– Set β = 1/2 in the Chernoff bound: P(error) ≤ √(P(ω1) P(ω2)) ∫ √(p(x|ω1) p(x|ω2)) dx = √(P(ω1) P(ω2)) e^(−k(1/2))
– For Gaussian densities: k(1/2) = (1/8) (μ2 − μ1)ᵀ [(Σ1 + Σ2)/2]⁻¹ (μ2 − μ1) + (1/2) ln (|(Σ1 + Σ2)/2| / √(|Σ1| |Σ2|))
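The bound is cheap to compute; a sketch with illustrative Gaussian parameters:

    import numpy as np

    def bhattacharyya_bound(mu1, s1, mu2, s2, p1, p2):
        s_avg = 0.5 * (s1 + s2)
        diff = mu2 - mu1
        k = (0.125 * diff @ np.linalg.solve(s_avg, diff)
             + 0.5 * np.log(np.linalg.det(s_avg)
                            / np.sqrt(np.linalg.det(s1) * np.linalg.det(s2))))
        return np.sqrt(p1 * p2) * np.exp(-k)     # upper bound on P(error)

    mu1, s1 = np.array([0.0, 0.0]), np.eye(2)          # illustrative class 1
    mu2, s2 = np.array([3.0, 3.0]), 2.0 * np.eye(2)    # illustrative class 2
    print(bhattacharyya_bound(mu1, s1, mu2, s2, 0.5, 0.5))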
50 Chernoff Bound and Bhattacharyya Bound
51 Example: Error Bounds for Gaussian Distribution
52 Example: Error Bounds for Gaussian Distribution
– Bhattacharyya bound: k(1/2) = 4.11157, so P(error) < 0.0087
– Chernoff bound: 0.008190, found by numerical search over β
– Error rate by numerical integration: 0.0021 (such integration is impractical in higher dimensions)
53 Signal Detection Theory
– Internal signal in the detector: x
– x has mean μ2 when the external signal (pulse) is present and mean μ1 when it is not
– p(x|ωi) ~ N(μi, σ²)
– Discriminability: d′ = |μ2 − μ1| / σ
54 Signal Detection Theory
55 Four Probabilities
– Hit: P(x > x* | x ∈ ω2)
– False alarm: P(x > x* | x ∈ ω1)
– Miss: P(x < x* | x ∈ ω2)
– Correct reject: P(x < x* | x ∈ ω1)
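With the Gaussian model from the signal-detection slide, the four probabilities and the discriminability follow directly; the means, variance, and threshold below are illustrative.

    from scipy.stats import norm

    mu1, mu2, sigma = 0.0, 2.0, 1.0     # illustrative model parameters
    x_star = 1.0                        # illustrative threshold

    hit = norm(mu2, sigma).sf(x_star)           # P(x > x* | w2)
    false_alarm = norm(mu1, sigma).sf(x_star)   # P(x > x* | w1)
    miss = 1.0 - hit                            # P(x < x* | w2)
    correct_reject = 1.0 - false_alarm          # P(x < x* | w1)
    d_prime = abs(mu2 - mu1) / sigma
    print(hit, false_alarm, miss, correct_reject, d_prime)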
56 Receiver Operating Characteristic (ROC)
57 Bayes Decision Theory: Discrete Features
– When x takes only discrete values, integrals over p(x|ωj) are replaced by sums over P(x|ωj)
– Bayes formula: P(ωj|x) = P(x|ωj) P(ωj) / Σk P(x|ωk) P(ωk)
58 Independent Binary Features
– x = (x1, …, xd)ᵀ with binary xi ∈ {0, 1}, conditionally independent given the class
– pi = P(xi = 1 | ω1), qi = P(xi = 1 | ω2)
59 Discriminant Function
g(x) = Σi wi xi + w0; decide ω1 if g(x) > 0, where
– wi = ln [pi (1 − qi) / (qi (1 − pi))]
– w0 = Σi ln [(1 − pi) / (1 − qi)] + ln [P(ω1) / P(ω2)]
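A sketch of the weights in action, with illustrative pi and qi values; decide ω1 when g(x) > 0.

    import numpy as np

    p = np.array([0.8, 0.7, 0.6])   # p_i = P(x_i = 1 | w1)  (illustrative)
    q = np.array([0.3, 0.4, 0.5])   # q_i = P(x_i = 1 | w2)  (illustrative)
    prior1, prior2 = 0.5, 0.5

    w = np.log(p * (1 - q) / (q * (1 - p)))
    w0 = np.sum(np.log((1 - p) / (1 - q))) + np.log(prior1 / prior2)

    x = np.array([1, 0, 1])
    g = w @ x + w0
    print(g, "decide w1" if g > 0 else "decide w2")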
60 Example: Three-Dimensional Binary Data
61 Example: Three-Dimensional Binary Data
62 Illustration of Missing Features
63 Decision with Missing Features
With good (observed) features xg and bad (missing) features xb, marginalize out the bad ones: P(ωi|xg) = ∫ p(ωi, xg, xb) dxb / p(xg) = ∫ P(ωi|x) p(x) dxb / ∫ p(x) dxb
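In the discrete case the integral becomes a sum; the sketch below marginalizes a missing binary feature out of a small, made-up joint table P(ω, xg, xb).

    import numpy as np

    # joint[i, g, b] = P(w_i, x_g = g, x_b = b); all values illustrative.
    joint = np.array([[[0.18, 0.07],
                       [0.15, 0.10]],
                      [[0.06, 0.14],
                       [0.09, 0.21]]])

    x_g = 1                                   # observed good feature
    marginal = joint[:, x_g, :].sum(axis=1)   # sum over the missing x_b
    posterior = marginal / marginal.sum()     # P(w_i | x_g)
    print(posterior, int(np.argmax(posterior)))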
64 Noisy Features
65 Example of Statistical Dependence and Independence
66 Example of Causal Dependence
State of an automobile:
– Temperature of the engine
– Pressure of the brake fluid
– Pressure of the air in the tires
– Voltages in the wires
– Oil temperature
– Coolant temperature
– Speed of the radiator fan
67 Bayesian Belief Nets (Causal Networks)
68 Example: Belief Network for Fish
69 Simple Belief Network 1
70 Simple Belief Network 2
71 Use of Bayesian Belief Nets
– Seek to determine some particular configuration of other variables, given the values of some of the variables (the evidence)
– Determine the values of several query variables x given the evidence of all other variables e: P(x|e) = P(x, e) / P(e) ∝ P(x, e)
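For a net small enough to enumerate, this query can be answered by summing the joint over the unobserved variables. The sketch below does so for a toy chain A → B → C with tables I made up for illustration.

    import numpy as np
    from itertools import product

    # P(a, b, c) = P(a) P(b|a) P(c|b); query P(A | C = 1) by enumeration.
    P_a = np.array([0.7, 0.3])            # P(A)            (illustrative)
    P_b_a = np.array([[0.9, 0.1],         # P(B|A=0)
                      [0.4, 0.6]])        # P(B|A=1)
    P_c_b = np.array([[0.8, 0.2],         # P(C|B=0)
                      [0.3, 0.7]])        # P(C|B=1)

    unnorm = np.zeros(2)
    for a, b in product(range(2), range(2)):
        unnorm[a] += P_a[a] * P_b_a[a, b] * P_c_b[b, 1]   # evidence C = 1
    print(unnorm / unnorm.sum())          # P(A | C = 1)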
72 Example
73 Example
74 Naïve Bayes' Rule (Idiot Bayes' Rule)
– When the dependency relationships among the features are unknown, we generally make the simplest assumption: the features are conditionally independent given the category, so that P(ωk|x) ∝ P(ωk) Πi p(xi|ωk)
– Often works quite well
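A minimal naive Bayes sketch for binary features; the conditional tables and priors are illustrative assumptions, not data from the slides.

    import numpy as np

    # cond[k, i] = P(x_i = 1 | w_k) under the independence assumption.
    cond = np.array([[0.9, 0.2, 0.7],     # class w1   (illustrative)
                     [0.3, 0.8, 0.4]])    # class w2
    priors = np.array([0.6, 0.4])

    def naive_bayes_posterior(x):
        likelihood = np.prod(np.where(x == 1, cond, 1 - cond), axis=1)
        unnorm = priors * likelihood      # P(w_k) * prod_i P(x_i|w_k)
        return unnorm / unnorm.sum()

    x = np.array([1, 0, 1])
    print(naive_bayes_posterior(x))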
75 Applications in Medical Diagnosis
– Uppermost nodes represent fundamental biological agents, such as the presence of a virus or bacteria
– Intermediate nodes describe diseases, such as flu or emphysema
– Lowermost nodes describe symptoms, such as high temperature or coughing
– A physician enters measured values into the net and finds the most likely disease or cause
76 Compound Bayesian Decision