Bayesian Classification
A reference
Example 1 [figure: the beach image to be analyzed]
Example 1 (continued)
Overall objective: count the number of people on the beach.
Intermediate objectives:
- Reduce the search space
- Segment the image into three zones (classes): Surf, Beach, and Building
Example 1 (continued)
Consider a randomly selected pixel x from the image. Suppose the a priori probabilities with respect to the three classes are:
- P(x is in the building area) = 0.17
- P(x is in the beach area) = 0.58
- P(x is in the surf area) = 0.25
What decision rule minimizes error? With no further information, the minimum-error rule always chooses the most probable class, beach, and errs with probability 1 - 0.58 = 0.42.
Example 1: Suppose additional information regarding a property of the pixel (or its neighborhood), such as color, brightness, or variability, is available. Can such knowledge aid classification? For instance, what is the probability that pixel x came from the beach area given that the pixel is red, i.e., P(beach | red)?
Example 1: Consider the hypothetical regional color distributions over hue h. [figure: class-conditional hue distributions for the building, beach, and surf areas]
Example 1: The joint probability that a randomly selected pixel is from the beach area and has a hue h is
p(beach, h) = p(h | beach) P(beach) = P(beach | h) p(h).
Solving for P(beach | h) we get
P(beach | h) = p(h | beach) P(beach) / p(h), where
p(h) = p(h | building) P(building) + p(h | beach) P(beach) + p(h | surf) P(surf).
Notes:
1. To check this with familiar a priori probabilities, consider p(card is red and a king): p(red and king) = 2/52 = p(red | king) p(king) = (2/4)(4/52) = p(king | red) p(red) = (2/26)(26/52) = 1/26.
2. p(h) merely accumulates a weighted average of the occurrences of hue h, since the areas are assumed to be mutually exclusive, i.e., non-overlapping in this example.
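A minimal numeric sketch of this computation; the priors come from the earlier slide, while the hue likelihoods p(h | class) are hypothetical values standing in for the lost distribution chart:

```python
# Bayes' rule for the beach example. Priors are from the slides; the hue
# likelihoods p(h | class) are hypothetical illustrative values.
priors = {"building": 0.17, "beach": 0.58, "surf": 0.25}
p_h_given = {"building": 0.20, "beach": 0.60, "surf": 0.10}  # assumed values

# Evidence: total probability of observing hue h across the three areas.
p_h = sum(p_h_given[c] * priors[c] for c in priors)

# Posterior P(class | h) for each class; the largest wins.
posterior = {c: p_h_given[c] * priors[c] / p_h for c in priors}
print(posterior)  # beach dominates for these illustrative numbers
```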
A General Formulation
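The slide's equation was an image that did not survive extraction; the standard statement of the general formulation, consistent with the terms named on the next slide, is

\[
P(\omega_i \mid x) \;=\; \frac{p(x \mid \omega_i)\,P(\omega_i)}{p(x)},
\qquad
p(x) \;=\; \sum_{j=1}^{c} p(x \mid \omega_j)\,P(\omega_j),
\]

i.e., posterior = likelihood × prior / evidence.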
A Casual Formulation
- The prior probability reflects knowledge of the relative frequency of instances of a class.
- The likelihood is a measure of the probability that a measurement value occurs in a class.
- The evidence is a scaling term.
Forming a Classifier
- Create discriminant functions gi(x) for each class i = 1,…,c (these are not unique).
- They partition the measurement space with crisp boundaries.
- Assign x to class k if gk(x) > gj(x) for all j ≠ k.
- For a minimum-error classifier, gi(x) = P(ωi | x). A minimal sketch of the decision rule follows below.
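The sketch below implements the rule just stated; the function name and interface are mine, not the slides':

```python
# Assign x to the class whose discriminant value is largest.
def classify(x, discriminants):
    """discriminants: one callable g_i per class; returns the winning index."""
    scores = [g(x) for g in discriminants]
    return max(range(len(scores)), key=lambda i: scores[i])
```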
Equivalent Discriminants
If f is monotone increasing, the collection hi(x) = f(gi(x)), i = 1,…,c forms an equivalent family of discriminant functions, e.g., the logarithmic family shown below.
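The example the slide most plausibly showed (it is the family used on the following slides) takes f = ln and drops the class-independent evidence p(x):

\[
g_i(x) = \frac{p(x \mid \omega_i)\,P(\omega_i)}{p(x)}
\quad\longrightarrow\quad
h_i(x) = \ln p(x \mid \omega_i) + \ln P(\omega_i).
\]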
Gaussian Distributions
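The slide's formula was an image; the univariate Gaussian density it presumably showed is

\[
p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,
\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).
\]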
Gaussian Distributions Details
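Reconstructed in standard notation, the multivariate case is

\[
p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,\lvert\Sigma\rvert^{1/2}}
\exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\mathsf T}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right),
\]

where \(\boldsymbol{\mu} = E[\mathbf{x}]\) is the mean vector and \(\Sigma = E[(\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})^{\mathsf T}]\) is the covariance matrix.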
Discriminants for Normal Density
Recall the classifier functions. Assuming the measurements are normally distributed, we have the form shown below.
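Reconstructed from the preceding slides (using gi proportional to likelihood times prior):

\[
g_i(\mathbf{x}) = p(\mathbf{x} \mid \omega_i)\,P(\omega_i)
= \frac{P(\omega_i)}{(2\pi)^{d/2}\,\lvert\Sigma_i\rvert^{1/2}}
\exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^{\mathsf T}\Sigma_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i)\right).
\]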
Some Algebra to Simplify the Discriminants
Since the natural logarithm is monotone increasing, we take the natural logarithm to re-write the first term.
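In standard notation, this yields

\[
\ln g_i(\mathbf{x}) = -\frac{d}{2}\ln 2\pi - \frac{1}{2}\ln\lvert\Sigma_i\rvert
- \frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^{\mathsf T}\Sigma_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i)
+ \ln P(\omega_i).
\]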
Some Algebra to Simplify the Discriminants (continued)
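The term \(-\tfrac{d}{2}\ln 2\pi\) is the same for every class, so it cannot affect which discriminant is largest and may be dropped.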
The Discriminants (Finally!!)
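The resulting family of discriminants, in the standard form these simplifications produce:

\[
g_i(\mathbf{x}) = -\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^{\mathsf T}\Sigma_i^{-1}(\mathbf{x}-\boldsymbol{\mu}_i)
- \frac{1}{2}\ln\lvert\Sigma_i\rvert + \ln P(\omega_i).
\]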
Special Case 1: Σi = σ²I
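Here \(\Sigma_i^{-1} = I/\sigma^2\) and \(\lvert\Sigma_i\rvert = \sigma^{2d}\) is the same for every class, so, dropping class-independent terms,

\[
g_i(\mathbf{x}) = -\frac{\lVert\mathbf{x}-\boldsymbol{\mu}_i\rVert^2}{2\sigma^2} + \ln P(\omega_i).
\]

Expanding the squared norm and dropping the common \(\mathbf{x}^{\mathsf T}\mathbf{x}\) term leaves a discriminant linear in \(\mathbf{x}\).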
Special Case 1: Σi = σ²I (continued)
- If the classes are equally likely, the discriminants depend only upon the distances to the means.
- A diagonal covariance matrix implies the parameters are statistically independent.
- A constant diagonal implies the class measurements have identical variability in each dimension; hence the distributions are spherical in d-dimensional space.
- The discriminant functions define hyperplanes orthogonal to the line segments joining the distribution means.
Special Case 1: Σi = σ²I [figure: spherical class distributions and the hyperplane boundaries between them]
Special Case 2: Σi = Σ
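With a covariance matrix shared by all classes, the \(-\tfrac{1}{2}\ln\lvert\Sigma\rvert\) term is common and drops out, leaving the squared Mahalanobis distance to each mean:

\[
g_i(\mathbf{x}) = -\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^{\mathsf T}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu}_i) + \ln P(\omega_i).
\]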
Special Case 2: Σi = Σ (continued)
- Since Σ may possess nonzero off-diagonal elements and varying diagonal elements, the measurement distributions lie in hyperellipsoids.
- The discriminant hyperplanes are often not orthogonal to the segments joining the class means.
Special Case 2: Σi = Σ (continued)
The quadratic term xᵀΣ⁻¹x is independent of i and may be eliminated.
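Dropping that term gives a discriminant linear in x, in the standard form

\[
g_i(\mathbf{x}) = \mathbf{w}_i^{\mathsf T}\mathbf{x} + w_{i0},
\qquad
\mathbf{w}_i = \Sigma^{-1}\boldsymbol{\mu}_i,
\qquad
w_{i0} = -\tfrac{1}{2}\boldsymbol{\mu}_i^{\mathsf T}\Sigma^{-1}\boldsymbol{\mu}_i + \ln P(\omega_i).
\]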
Case 3: Σi arbitrary
- The discriminant is quadratic in x.
- The decision surfaces can arise from hyperplanes, hyperparaboloids, hyperellipsoids, hyperspheres, or combinations of these!
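In standard form:

\[
g_i(\mathbf{x}) = \mathbf{x}^{\mathsf T}W_i\,\mathbf{x} + \mathbf{w}_i^{\mathsf T}\mathbf{x} + w_{i0},
\]

with \(W_i = -\tfrac{1}{2}\Sigma_i^{-1}\), \(\mathbf{w}_i = \Sigma_i^{-1}\boldsymbol{\mu}_i\), and \(w_{i0} = -\tfrac{1}{2}\boldsymbol{\mu}_i^{\mathsf T}\Sigma_i^{-1}\boldsymbol{\mu}_i - \tfrac{1}{2}\ln\lvert\Sigma_i\rvert + \ln P(\omega_i)\).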
Example 2: A Problem
Exemplars (transposed):
- For w1 = {(2, 6), (3, 4), (3, 8), (4, 6)}
- For w2 = {(1, -2), (3, 0), (3, -4), (5, -2)}
Calculated means (transposed):
- m1 = (3, 6)
- m2 = (3, -2)
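These values can be checked with a few lines of NumPy; bias=True selects the 1/n maximum-likelihood covariance estimate, an assumption carried through the rest of this example:

```python
import numpy as np

# Exemplars for the two classes, one row per measurement vector.
w1 = np.array([[2, 6], [3, 4], [3, 8], [4, 6]], dtype=float)
w2 = np.array([[1, -2], [3, 0], [3, -4], [5, -2]], dtype=float)

m1, m2 = w1.mean(axis=0), w2.mean(axis=0)   # [3. 6.] and [3. -2.]
S1 = np.cov(w1, rowvar=False, bias=True)     # [[0.5 0. ], [0.  2. ]]
S2 = np.cov(w2, rowvar=False, bias=True)     # [[2. 0.], [0. 2.]]
```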
Example 2: Covariance Matrices
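Computed from the exemplars above, assuming the 1/n maximum-likelihood estimate (the unbiased 1/(n-1) convention would scale these by 4/3):

\[
\Sigma_1 = \begin{pmatrix} 1/2 & 0 \\ 0 & 2 \end{pmatrix},
\qquad
\Sigma_2 = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}.
\]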
Example 2: Inverse and Determinant for Each of the Covariance Matrices
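Since both matrices are diagonal, these follow immediately:

\[
\Sigma_1^{-1} = \begin{pmatrix} 2 & 0 \\ 0 & 1/2 \end{pmatrix},\quad \lvert\Sigma_1\rvert = 1;
\qquad
\Sigma_2^{-1} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix},\quad \lvert\Sigma_2\rvert = 4.
\]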
Example 2: A Discriminant Function for Class 1
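Assuming equal priors (four exemplars per class), the \(\ln P(\omega_i)\) term is common and can be dropped; with \(\lvert\Sigma_1\rvert = 1\),

\[
g_1(\mathbf{x}) = -\frac{1}{2}(\mathbf{x}-\mathbf{m}_1)^{\mathsf T}\Sigma_1^{-1}(\mathbf{x}-\mathbf{m}_1) - \frac{1}{2}\ln\lvert\Sigma_1\rvert
= -(x_1-3)^2 - \frac{(x_2-6)^2}{4}.
\]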
Example 2: A Discriminant Function for Class 2
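Similarly, with \(\lvert\Sigma_2\rvert = 4\) (so \(-\tfrac{1}{2}\ln\lvert\Sigma_2\rvert = -\ln 2\)),

\[
g_2(\mathbf{x}) = -\frac{(x_1-3)^2}{4} - \frac{(x_2+2)^2}{4} - \ln 2.
\]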
Example 2: The Class Boundary
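Setting \(g_1(\mathbf{x}) = g_2(\mathbf{x})\) and simplifying (a reconstruction from the discriminants above):

\[
-(x_1-3)^2 - \frac{(x_2-6)^2}{4} = -\frac{(x_1-3)^2}{4} - \frac{(x_2+2)^2}{4} - \ln 2
\;\Longrightarrow\;
3(x_1-3)^2 - 16x_2 + 32 = 4\ln 2.
\]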
Example 2: A Quadratic Separator
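Solving the boundary equation for \(x_2\) gives the explicit separator

\[
x_2 = \frac{3}{16}(x_1-3)^2 + 2 - \frac{\ln 2}{4} \approx 0.188\,(x_1-3)^2 + 1.83,
\]

an upward-opening parabola with vertex near (3, 1.83): points above it are assigned to class 1, points below to class 2.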
Example 2: Plot of the Discriminant [figure: the parabolic boundary separating the two classes]
Summary Steps for Building a Bayesian Classifier
1. Collect class exemplars.
2. Estimate class a priori probabilities.
3. Estimate class means.
4. Form covariance matrices; find the inverse and determinant for each.
5. Form the discriminant function for each class.
A compact sketch of these steps follows below.
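A minimal sketch of the summary steps, assuming the Gaussian discriminant derived earlier and the 1/n covariance estimate; the function names are mine, not the slides':

```python
import numpy as np

def fit_class(exemplars, prior):
    """Steps 1-4: mean, covariance (1/n estimate), its inverse and determinant."""
    X = np.asarray(exemplars, dtype=float)
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False, bias=True)
    return mu, np.linalg.inv(S), np.linalg.det(S), prior

def make_discriminant(mu, S_inv, S_det, prior):
    """Step 5: g(x) = -1/2 (x-mu)^T S^-1 (x-mu) - 1/2 ln|S| + ln P(w_i)."""
    def g(x):
        d = np.asarray(x, dtype=float) - mu
        return -0.5 * d @ S_inv @ d - 0.5 * np.log(S_det) + np.log(prior)
    return g
```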
Using the Classifier
1. Obtain a measurement vector x.
2. Evaluate the discriminant function gi(x) for each class i = 1,…,c.
3. Decide x is in class j if gj(x) > gi(x) for all i ≠ j.
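For instance, continuing the sketch above with the Example 2 data and equal priors:

```python
g1 = make_discriminant(*fit_class([(2, 6), (3, 4), (3, 8), (4, 6)], prior=0.5))
g2 = make_discriminant(*fit_class([(1, -2), (3, 0), (3, -4), (5, -2)], prior=0.5))

x = (3, 1)                          # a measurement vector to classify
print(1 if g1(x) > g2(x) else 2)    # -> 2: x lies below the parabolic boundary
```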