Near Neighbor Classifiers and FAUST


1 Near Neighbor Classifiers and FAUST

FAUST is really a Near Neighbor Classifier (NNC) in which, for each class, we construct a big box neighborhood (bbn) which we think, based on the training points, is most likely to contain that class and least likely to contain the other classes.

In the current FAUST, each bbn is a coordinate box: for coordinate (band) R, the coordinate_box cb(R, class, a_R, b_R) is the set of all points x such that a_R < x_R < b_R (either of a_R or b_R can be infinite, and either or both of the <'s can be ≤). The values a_R and b_R are what we have called the cut_points for that class.

We could instead take the bbn's to be "multi-coordinate_band" boxes, or mcb's: the INTERSECTION of the "best" k cb's for a given class (k ≤ n−1, assuming n classes), where "best" can be with respect to any of the maximizations below. And instead of using a fixed number of coordinates, k, we could use only those coordinates in which the "quality" of the cb is higher than a threshold, where "quality" might be measured in many ways involving the dimensions of the gaps (or in other ways?). Many pixels may then not get classified (this hypothesis needs testing!), but the classifications made should be accurate.

bbn's are constructed using the training set and applied to the full set of unclassified pixels. The bbn's are always applied sequentially, but can be constructed either sequentially or divisively. When construction is sequential, the application sequence is the same as the construction sequence (and the application for each class follows the construction for that class immediately, i.e., before the next bbn construction): all pixels in the first bbn are classified into that first class (the class of that bbn); all remaining pixels which are in the second bbn are classified into the second class, and so on. Thus, iteratively, all remaining unclassified pixels which are in the next bbn are classified into its class. The reason bbn's are applied sequentially is that they intersect; thus the first bbn should be the strongest in some sense, then the next strongest, and so on.

In each round, from the remaining classes, we construct FAUST bbn's by choosing the attribute-class with the maximum gap between consecutive mean values, or the maximum number of stds between consecutive means, or the gap between consecutive means allowing the minimum rank K (i.e., the "best remaining gap"). Note that mean can be replaced by median or any other representer.

[Diagram: cut_points a_R, b_R shown on band R, and a_G, b_G on band G.]
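To make the sequential application concrete, here is a minimal sketch, assuming each bbn has already been reduced to per-band (low, high) cut_point intervals; the names (in_bbn, classify_sequential) and data layout are illustrative, not the FAUST implementation itself:

```python
import numpy as np

def in_bbn(X, intervals):
    """True for the rows of X that fall inside every (low, high) interval.
    intervals maps band index -> (low, high); either bound may be
    -np.inf / +np.inf, mirroring 'either of a_R or b_R can be infinite'."""
    mask = np.ones(len(X), dtype=bool)
    for band, (lo, hi) in intervals.items():
        mask &= (X[:, band] > lo) & (X[:, band] < hi)
    return mask

def classify_sequential(X, bbns):
    """bbns: list of (class_label, intervals) pairs, strongest bbn first.
    Pixels claimed by an earlier bbn are removed from consideration,
    which is why application order matters when bbn's intersect."""
    labels = np.full(len(X), None, dtype=object)   # None = unclassified
    unclassified = np.ones(len(X), dtype=bool)
    for cls, intervals in bbns:
        hits = unclassified & in_bbn(X, intervals)
        labels[hits] = cls
        unclassified &= ~hits
    return labels
```

Pixels left unclassified at the end (labels still None) are exactly the "many pixels may not get classified" case noted above.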

2 Near Neighbor Classifiers and FAUST-2

We note that mcb's are used for vegetation indexing: high green (a_G high and b_G = ∞, i.e., all x such that x_G > a_G) and low red (a_R = −∞ and b_R low, i.e., all x such that x_R < b_R) is the standard "vegetation index" and measures crop health well. So, if instead of predicting grass we were predicting lush grass, we could use vi, which involves mcb bbn's. Similarly, mcb bbn's would be used for any color object which is not pure (in the bands provided). Therefore a "blue-red" car would ideally involve a bbn that is the intersection of a red cb and a blue cb. Most paint colors are not pure. Worse yet, what does pure even mean? Pure only makes sense in the context of the camera taking the image in the first place. The definition of a pure color in a given image is a color entirely within one band (column) of that image dataset (with all other bands showing zero values only). So almost all actual objects would be multi-color objects and would require, or at least benefit from, a multi-cb bbn approach.
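As a concrete illustration, the vegetation index above is just a two-condition mcb, the intersection of cb(G, a_G, ∞) and cb(R, −∞, b_R). A minimal sketch, where the band indices and thresholds in the usage line are hypothetical values, not from the source:

```python
import numpy as np

def vegetation_index_mcb(X, g_band, r_band, a_G, b_R):
    """mcb = intersection of two cb's: high green (x_G > a_G, with b_G
    = +inf) AND low red (x_R < b_R, with a_R = -inf)."""
    return (X[:, g_band] > a_G) & (X[:, r_band] < b_R)

# e.g. (hypothetical bands/thresholds):
# lush_grass = X[vegetation_index_mcb(X, g_band=1, r_band=0, a_G=120, b_R=80)]
```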

3 Iris training data

Ten training tuples from each class: setosa (se), versicolor (ve), virginica (vi), with attributes sepal length (sLN), sepal width (sWD), petal length (pLN) and petal width (pWD):

cl  sLN sWD pLN pWD
se   49  30  14   2
se   47  32  13   2
se   46  31  15   2
se   54  36  14   2
se   54  39  17   4
se   46  34  14   3
se   50  34  15   2
se   44  29  14   2
se   49  31  15   1
se   54  37  15   2
ve   64  32  45  15
ve   69  31  49  15
ve   55  23  40  13
ve   65  28  46  15
ve   57  28  45  13
ve   63  33  47   1
ve   49  24  33  10
ve   66  29  46  13
ve   52  27  39  14
ve   50  20  35  10
vi   58  27  51  19
vi   71  30  59  21
vi   63  29  56  18
vi   65  30  58  22
vi   76  30  66  21
vi   49  25  45  17
vi   73  29  63  18
vi   67  25  58  18
vi   72  36  61  25
vi   65  32  51  20

[The slide also shows each value's binary (bit-column) representation; that bit-slice rendering is omitted here.]

Note on problems. Difficult separations problem: e.g., white cars from white roofs.
One remedy is to include, as feature attributes, the pixel coordinate value columns as well as the bands. If the color is not sufficiently different to make the distinction (and no other non-visible band makes the distinction either) and if the classes are contiguous objects (as they are in Aurora), then, because the white car training points are [likely to be] far from the white roof training points, FAUST may still work well using x and y pixel coordinates as additional feature attributes (and attributes such as shape, edge_sharpness, etc., if available). CkNN applied to neighbors taken from the training set should work also.

Using FAUST{seq}, where we maximize 1. the size of the gap between consecutive means, or 2. the number of stds in the gap between consecutive means, or 3. minimize the K which produces no overlap (between the rankK set and the rank(n−K+1) set of the next class) in the gap between consecutive classes: instead of taking as cut_point the point produced by that maximization, we should back off from it and narrow the interval around that class mean by going only a fraction either way (some parameterized fraction), which would remove many of the NC points from that class prediction (see the sketch below).

Noise Class Problem: In pixel classification there may be a Default_Class, or Noise class, NO. (The Aurora classes are Red_Cars, White_Cars, Black_Cars, ASphalt, White_Roof, GRass and SHadow, and in the "Parking Lot Scene" case at least, there does not appear to be a Noise class; i.e., every pixel is in one of the 7 classes above.) So, in some cases, we may have 8 classes: {RC, WC, BC, AS, WR, GR, SH, NO}. Picking out NO may be a challenge for any algorithm if it contains pixels that match training pixels from several of the legitimate classes, i.e., if NO is composed of tuples with values similar to other classes. Dr. Wettstein calls this the "red shirt" problem: if a person wearing a red shirt is in the field of view, those pixels may be electromagnetically indistinguishable from Red_Car pixels, and in that case no correct algorithm will distinguish them electromagnetically (using only reflectance bands). Other attributes such as x and y position, size and shape (if available), etc. may provide a distinction. Inconsistent ordering of the classes over the various attributes (columns) may be an indicator of something?
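A minimal sketch of those two ideas, appending pixel coordinates as extra feature columns and backing off the cut_points by a parameterized fraction around the class mean; the function names and the fraction parameter f are mine, for illustration only:

```python
import numpy as np

def add_pixel_coords(bands, width, height):
    """Append x and y pixel coordinates as two extra feature columns,
    assuming 'bands' holds one row per pixel in row-major order."""
    ys, xs = np.divmod(np.arange(width * height), width)
    return np.column_stack([bands, xs, ys])

def narrowed_interval(class_mean, lo_cut, hi_cut, f=0.8):
    """Back off from the maximization's cut_points: keep only the
    fraction f of the interval on each side of the class mean.
    f=1.0 reproduces the original cut_points; smaller f clips more
    Noise-Class (NC) points out of this class's prediction."""
    return (class_mean - f * (class_mean - lo_cut),
            class_mean + f * (hi_cut - class_mean))
```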

Appendix

4 FAUST{seq,mrk} VPHD

The set of training values in one column and one class is called an Attribute-Class-Set, ACS. K(ACS) = |ACS| (all |ACS| = n = 10 here). In the algorithm below, c = root_count and ps = position (there is a separate root_count and position for each ACS and for each of K and n−K+1 for that ACS, so c = c(attr, class, K|(n−K+1))). S is a gap enlargement parameter (it can be adjusted to try to clip out the Noise Class, NC).

1. Sort ACS's ascending by median; gap = rankK(this class) − rank(n−K+1)(next class).
2. Do Until ( rankK(ACS) ≤ rank(n−K+1)(next ACS) | K = 0 )
3.   Find the rankK and rank(n−K+1) values of each ACS (except the 1st and the Kth).
4.   K = K−1; END DO; return K for each Attribute, Class pair.
5. Place cut_pts above/below that class (using values in attr): hi cut_pt = rankK + S*(higher_gap); low cut_pt = rank(n−K+1) − S*(lower_gap).

An old version of the basic algorithm: build ACS tables (gap > 0); cut_pt = rankK + S*(gap), S = 1; minimize K.

1. Sort ACS's ascending by median; gap = rankK(this class) − rank(n−K+1)(next class).
2. Do Until ( rankK(ACS) ≤ rank(n−K+1)(next higher ACS in the same attribute) | K = n/2 )
3.   Find the gap, except the Kth.
4.   K = K−1; END DO; return K for each Att, Class pair.

I took the first 40 of setosa, versicolor and virginica and put the other 30 tuples in a class called "noise". The 1st pass produces a tie for min K, in (pLN, vi) and (pWD, vi) (note: in both, vi does not have a higher gap since it is highest). Thus we can take both and either AND or OR the conditions. If we OR the conditions, (P_pLN,vi ≥ 48) | (P_pWD,vi ≥ 16), we get perfect classification [and if we AND them, we get 5 mistakes]:

T_sLN  cl  md   K  rnK  gap      T_sWD  cl  md   K  rnK  gap
       se  50  12   52    1             ve  28  20   29    2
       no  57  12   51    1             vi  30  10   28    1
       ve  60  15   62    1             no  30  12   31    1
       vi  64                           se

T_pLN  cl  md   K  rnK  gap      T_pWD  cl  md   K  rnK  gap
       se  15  10   19    3             se   2   7    4    2
       no  42  17   41    2             no  12  16   12    1
       ve  44   5   48    1             ve  14   5   16    1
       vi  56                           vi  20

Recompute min K in (pWD, vi): P_pWD,vi ≥ 5 gets 9 mistakes. With vi removed, the tables become:

T_sLN  cl  md   K  rnK  gap      T_sWD  cl  md   K  rnK  gap
       se  50  12   52    1             ve  28  15   29    1
       no  57  12   51    1             no  30  12   31    1
       ve  60                           se

T_pLN  cl  md   K  rnK  gap      T_pWD  cl  md   K  rnK  gap
       se  15  10   19    3             se   2   7    4    2
       no  42  17   41    2             no  12  16   12    1
       ve  44                           ve  14

With se also removed, min K is in (sLN, no): P_sLN,no ≥ 51 gets 12 mistakes.

T_sLN  cl  md   K  rnK  gap      T_sWD  cl  md   K  rnK  gap
       no  57  12   51    1             ve  28  15   29    1
       ve  60                           no  30

T_pLN  cl  md   K  rnK  gap      T_pWD  cl  md   K  rnK  gap
       no  42  17   41    2             no  12  16   12    1
       ve  44                           ve  14

[The slide repeats the raw training values laid out by class (se, ve, vi) and attribute (sLN = 1, sWD = 2, pLN = 3, pWD = 4), highlighting ACS(sWD, vi) as an example Attribute-Class-Set.]
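A minimal sketch of the K-decrementing loop and step 5's cut_point placement, assuming rankK(ACS) means the K-th smallest training value of that ACS and that both ACS's have the same size n; the minus sign in the low cut and all helper names are my reading of the slide, not a definitive implementation:

```python
import numpy as np

def rank_k(values, k):
    """K-th smallest value of an ACS (1-indexed rank)."""
    return np.sort(values)[k - 1]

def find_k(this_acs, next_acs):
    """Decrement K from n until the rankK value of this (lower-median)
    class falls below the rank(n-K+1) value of the next class, i.e.
    until the trimmed ranges no longer overlap (gap > 0), or K hits 0.
    Assumes both ACS's have the same size n (n = 10 on this slide)."""
    n = len(this_acs)
    for k in range(n, 0, -1):
        gap = rank_k(next_acs, n - k + 1) - rank_k(this_acs, k)
        if gap > 0:
            return k, gap
    return 0, 0

def cut_points(acs, k, higher_gap, lower_gap, S=1.0):
    """Step 5: hi cut_pt = rankK + S*higher_gap, and
    low cut_pt = rank(n-K+1) - S*lower_gap (sign assumed)."""
    n = len(acs)
    return (rank_k(acs, n - k + 1) - S * lower_gap,
            rank_k(acs, k) + S * higher_gap)
```

For example, feeding the ten pLN values of se and of no into find_k should reproduce a (K, gap) row of the T_pLN table above; S > 1 enlarges the clipped-out region around the gap, which is the knob suggested for clipping out the Noise Class.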

