se se se se se se se se se se ve ve ve ve ve ve ve ve ve ve vi vi vi


1 [Figure: class-label column for the se, ve, and vi training samples, with pTree bit columns.]
A note on difficult separations, e.g., white cars from white roofs: it would make sense to include the pixel coordinate columns as feature attributes, as well as the bands. That way, if the color is not sufficiently different to make the distinction (and no non-visible band makes the distinction either), then, because the white-car training points are likely to be far from the white-roof training points, CkNN applied to neighbors taken from the training set should differentiate the two classes. Of course, if the white car is on top of a white roof (a rooftop parking lot) or is parked right next to the building, then this method may not work either. What follows are the best mrk algorithms:
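The coordinate idea in the note can be sketched in a few lines. This is only an illustration, not the presentation's CkNN: plain majority-vote kNN stands in for it, and the function name `knn_classify`, the (x, y, band) feature layout, and all sample values are made up for the example.

```python
import math
from collections import Counter

def knn_classify(training, query, k):
    """Majority vote among the k training points closest to the query.

    Plain kNN is used here as a stand-in for CkNN (an assumption).
    Each training item is ((x, y, band), label).
    """
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training pixels: both classes are "white" (band near 250),
# so only the pixel coordinates tell them apart.
training = [
    ((1, 1, 250), "car"), ((2, 1, 251), "car"), ((1, 2, 249), "car"),
    ((10, 10, 250), "roof"), ((11, 10, 251), "roof"), ((10, 11, 249), "roof"),
]

print(knn_classify(training, (2, 2, 250), k=3))
print(knn_classify(training, (9, 10, 250), k=3))
```

If the coordinate columns were dropped, every point would sit at almost the same band value and the vote would be close to arbitrary, which is the failure the note is trying to avoid.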

2 K=10: For i=4..0 { c=rc(Pc&Patt,i); if (c≥ps) { rankK+=2^i; Pc=Pc&Patt,i } [rank(n-K+1)+=2^i;] else { ps=ps-c; Pc=Pc&P'att,i } }
[Figure: bit-slice pTrees P_att,i and their complements P'_att,i for the four attributes sLN=1, sWD=2, pLN=3, pWD=4, with LO and HI columns and the per-class counters serc, seRK, seps, verc, veRK, veps, virc, viRK, vips.]
After the i=4 step on attribute 4 (pWD): pWD_vi_LO=16 > pWD_se_HI=0, pWD_ve_HI=0. The remaining bits 3..0 can add at most 15, so the highest pWD_se_HI and pWD_ve_HI can get is 15, and the lowest pWD_vi_LO will ever be is 16. So cutting at 16 will separate all vi from {se,ve}. This is, of course, with reference to the training set only, and it may not carry over to the test set (a much bigger set?), especially since the gap may be small (=1). Here we will use pWDcutpt16 to peel off vi! We need a theorem and proof here!!!
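The bound used in that argument (a partial rank can still grow by at most the sum of the undecided lower bits) can be written down directly. This is a small sketch of the arithmetic only; the helper name `rank_bounds` is invented for illustration.

```python
def rank_bounds(partial_rank, last_bit_processed):
    """Range a rank value can still take after bits down to
    `last_bit_processed` have been decided.

    The undecided bits (last_bit_processed-1 .. 0) add between 0 and
    2**last_bit_processed - 1 to the partial value.
    """
    return partial_rank, partial_rank + (1 << last_bit_processed) - 1

# After the i=4 step: HI for se and ve is still 0, LO for vi is 16.
print(rank_bounds(0, 4))    # range still open to the se/ve HI values
print(rank_bounds(16, 4))   # range still open to the vi LO value
```

Because the first range ends at 15 and the second starts at 16, the two can never overlap, whatever the lower bits turn out to be.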

3 For i=4..0 { c=rc(Pc&Patt,i); if (c≥ps) { rankK+=2^i; Pc=Pc&Patt,i } [rank(n-K+1)+=2^i;] else { ps=ps-c; Pc=Pc&P'att,i } }
[Figure: the same bit-slice pTrees and per-class counters (serc, seRK, seps, verc, veRK, veps, virc, viRK, vips), with LO and HI columns, now traced on attribute 3 (pLN); partial rank values shown include 2^5 and 2^5+2^4.]
pLN_ve_LO=32 > pLN_se_HI=0. So the highest pLN_se_HI can get is 31 and the lowest pLN_ve_LO will ever be is 32. So cutting at 32 will separate all ve from se! Greater accuracy can be gained by continuing the process for all i and for all K, then looking for the best gaps! (All gaps? All gaps weighted?)

4 APPENDIX
FAUST{pdq,mrk} (FAUST{pdq} with max rank_k). rank_k(S) is the smallest kth largest value in S.
0. For each attr, A, TA(class, md, k, cp) is its attribute table ordered on md asc, where k is the max k value s.t. set_rank_k of the class and set_rank_(1-k)' of the next class do not intersect. (Note: rank_k for k=1/2 is the median, k=1 is the maximum, and k=0 is the minimum.) The same algorithm can clearly be used as pms: FAUST{pms,mrk}.

FAUST{pdq,gap} (divisive, quiet (no noise), with gaps)
0. For each attr, A, TA(class, rv, gap) is its attribute table ordered on rv asc (rv = class representative, gap = distance to next rv).
WHILE RC not empty, DO
1. Find the TA record with maximum gap.
2. Use PA>c (c=rv+gap/2) to divide RC at c into LT and GT (pTrees PLT and PGT).
3. If LT or GT is a singleton {remove that class}.
END_DO

FAUST{pdq,std} (FAUST{pdq} using # of gap standard deviations)
0. For each attribute, A, TA(class, mn, std, n, cp) is its attribute table ordered on n asc, where cp = the value in the gap allowing the max # of stds, n. n satisfies mn+n*std = mnG-n*stdG, so n=(mnG-mn)/(std+stdG).
WHILE RC not empty, DO
1. Find the TA record with maximum n.
2. Use PA>cp to divide RC at cp=cutpoint into LT and GT (pTree masks PLT and PGT).
3. If LT or GT is a singleton {remove that class from RC and from all TAs}.
END_DO

FAUST{pms,gap} (FAUST{p}, m attr cut_pts, sequential class separation (1 class at a time, m=1))
0. For each A, TA(class, rv, gap, avgap), where avgap is the avg of gap and previous_gap (if 1st, avgap = gap).
If there are x classes, DO x-1 times
1. Find the TA record with maximum avgap.
2. cL=rv-prev_gap/2, cG=rv+gap/2, masks Pclass=PA>cL & PA≤cG & PRC, PRC=P'class&PRC. (If 1st in TA (no prev_gap), Pclass=PA≤cG&PRC. If last, Pclass=PA>cL&PRC.)
3. Remove that class from RC and from all TAs.
END_DO

FAUST{pms,std} (FAUST{pms} using # of gap stds)
0. For each attr, A, TA(class, mn, std, n, avgn, cp) is ordered on avgn asc; cp = cut_point (the value in the gap which allows the max # of stds, n; n satisfies mn+n*std = mn_next-n*std_next, so n=(mn_next-mn)/(std+std_next)).
DO x-1 times
1. Find the TA record with maximum avgn.
2. cL=rv-prev_gap/2, cG=rv+gap/2, and pTree masks Pclass=PA>cL & PA≤cG & PRC, PRC=P'class&PRC. (If the class is 1st in TA (has no prev_gap), then Pclass=PA≤cG&PRC. If last, Pclass=PA>cL&PRC.)
3. Remove that class from RC and from all TAs.
END_DO
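As a worked instance of the std cut point, the relation mn+n*std = mn_next-n*std_next can be solved for n and the cut point read off from either side. The function name `std_cut_point` and the class statistics below are invented for illustration; they are not values from the presentation.

```python
def std_cut_point(mn, std, mn_next, std_next):
    """Solve mn + n*std = mn_next - n*std_next for n and return (n, cp).

    cp is the value in the gap that lies the same number of standard
    deviations (n) from both class means.
    """
    n = (mn_next - mn) / (std + std_next)
    cp = mn + n * std
    return n, cp

# Hypothetical class statistics on one attribute.
n, cp = std_cut_point(mn=10, std=2, mn_next=20, std_next=3)
print(n, cp)
# Sanity check: approaching from the upper class gives the same point.
print(20 - n * 3)
```

A class with a wider spread pushes the cut point away from itself, which a plain midpoint between the two means would not do.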

5 FAUST{pdq,mrk} algorithm, demonstrated with VPHD (Vertical Processing, Horizontal Data) first:
1. For every attr and every class, sort the values asc.
2. Find and order the medians asc in TA tables.
3. Find the max k s.t. rank_k_set ∩ rank_(1-k)_set = ∅.
4. Proceed as in all FAUST algorithms: cut accordingly (pdq or pms or ???).
With VPHD, sort each class in each attr, find medians (needed?), find rank_k_sets (combine this with sorting?) ... so O(n). With HPVD, we can avoid the sorting, find rank_k_sets (median is rank_.5), and fill TAs entirely with a pTree program, O(0).
[Animation: per-class sorted value lists for each attribute, with rank_k markers stepping from rank_0 through rank_1.]
HPVD_mrk could be made optimal since we could record exactly which k and cp give the min error (as we work toward an empty rank_k_set intersection) and we could know the error set. We could use CkNN or ? on each errant sample. To see this, go through the first k/cp animation. In that looping procedure it's clear we could determine se<55, with 3 errors, to be the best cp (se<54, 6 errors; se<52, 5; se<50, 5; se<49, 6). Note: mrk above is lazy. It takes cp to be the average of the rank values, in this case cp=53, which has 6 errors.
TA tables (columns cl, md, k, cp):
TsLN (cl order se, ve, vi; md shown: 66): k/cp = .7/53, .7/64
TsWD (cl order ve, vi, se; md shown: 33): k/cp = .7/29, .8/30
TpLN (cl order se, ve, vi; md shown: 58): k/cp = 1.0/25, .9/49
TpWD (cl order se, ve, vi; md shown: 20): k/cp = 1.0/7, .9/16
One can see from this animation that MaxGap is probably a pretty good method most of the time (provided there is at least one good gap at each step) and that MaxGapStd is even better (same proviso). This method is intended to be optimal and to deal with, e.g., non-normal distributions.
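The point about recording which cp gives the minimum error can be illustrated with a brute-force count over candidate cut points. This is a horizontal-data sketch under stated assumptions, not the pTree program the slide has in mind; the function names and the two value lists are invented, and the rule "value < cp means lower class" is assumed.

```python
def count_errors(lower_class, upper_class, cp):
    """Errors made by the rule 'value < cp belongs to the lower class'."""
    missed_lower = sum(1 for v in lower_class if v >= cp)
    missed_upper = sum(1 for v in upper_class if v < cp)
    return missed_lower + missed_upper

def best_cut_point(lower_class, upper_class, candidates):
    """Return (cp, errors) for the candidate with the fewest errors.

    Ties go to the first candidate listed.
    """
    scored = ((cp, count_errors(lower_class, upper_class, cp)) for cp in candidates)
    return min(scored, key=lambda pair: pair[1])

# Hypothetical overlapping classes on one attribute.
lower = [1, 2, 3, 4, 6]
upper = [5, 7, 8, 9]
for cp in (5, 6, 7):
    print(cp, count_errors(lower, upper, cp))
print(best_cut_point(lower, upper, [5, 6, 7]))
```

Keeping the per-candidate error counts, rather than only the winner, is what would let the errant samples be handed to a second classifier afterwards.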

6 maximum
c=0; max=0; Pc=pure1;
For i=4..0 { c=rc(Pc&Patt,i); if (c>0) { Pc=Pc&Patt,i; max=max+2^i } }
return max;
[Animation: Pc is ANDed in turn with the bit slices Ppw4, Ppw3, Ppw2, Ppw1, Ppw0; rc values shown: 10, 1, 0, 1.] In the traced example, max = 2^4 + 2^3 + 2^0 = 25.
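The maximum loop can be tried out with each bit slice held as a Python integer used as a row mask. This is a minimal sketch under that assumption (real pTrees are compressed trees, not flat masks); `bit_slices` and `ptree_max` are names made up for the example.

```python
def bit_slices(values, nbits):
    """slices[i] has row bit r set when values[r] has bit i set."""
    return [
        sum(((v >> i) & 1) << row for row, v in enumerate(values))
        for i in range(nbits)
    ]

def root_count(mask):
    """rc: number of rows selected by the mask."""
    return bin(mask).count("1")

def ptree_max(values, nbits=5):
    slices = bit_slices(values, nbits)
    pc = (1 << len(values)) - 1          # pure1: every row is a candidate
    result = 0
    for i in range(nbits - 1, -1, -1):   # i = nbits-1 .. 0
        c = root_count(pc & slices[i])
        if c > 0:                        # some candidate has bit i set
            pc &= slices[i]
            result += 1 << i
    return result

print(ptree_max([17, 25, 3, 9]))
```

No row is ever sorted or compared with another; the answer is built bit by bit from root counts alone.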

7 minimum
c=0; min=0; Pc=pure1;
For i=4..0 { c=rc(Pc&P'att,i); if (c>0) Pc=Pc&P'att,i; else min=min+2^i }
return min;
[Animation: Pc is ANDed in turn with the complement slices P'pw4, P'pw3, P'pw2, P'pw1, P'pw0; rc>0 at every step except one, where rc=0.] In the traced example, min = 2^0 = 1.
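The minimum loop mirrors the maximum loop with complement slices. The sketch below again assumes flat integer masks in place of pTrees; `complement_slices` and `ptree_min` are illustrative names.

```python
def complement_slices(values, nbits):
    """comp[i] has row bit r set when values[r] has bit i CLEAR (P'att,i)."""
    return [
        sum((1 - ((v >> i) & 1)) << row for row, v in enumerate(values))
        for i in range(nbits)
    ]

def ptree_min(values, nbits=5):
    comp = complement_slices(values, nbits)
    pc = (1 << len(values)) - 1          # pure1: every row is a candidate
    result = 0
    for i in range(nbits - 1, -1, -1):
        c = bin(pc & comp[i]).count("1") # rc(Pc & P'att,i)
        if c > 0:                        # some candidate has bit i clear
            pc &= comp[i]
        else:                            # every candidate has bit i set
            result += 1 << i
    return result

print(ptree_min([17, 25, 3, 9]))
```

A bit is added to the result only when no remaining candidate can avoid it, which is why the update sits in the else branch.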

8 rank5 (5th largest)
c=0; rankK=0; pos=5; Pc=pure1;
For i=4..0 { c=rc(Pc&Patt,i); if (c≥pos) { rankK=rankK+2^i; Pc=Pc&Patt,i } else { pos=pos-c; Pc=Pc&P'att,i } }
return rankK;
[Animation: Pc is ANDed with Ppw_i or P'pw_i at each step, starting at current_i=4; rc values shown: 10, 1, 1, 3, 2, 4.] In the traced example, rankK = 0 + 2^4 + 2^2, so rank5 = 20.
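The rank-K loop generalizes both previous loops: K=1 gives the maximum and K=n the minimum. The sketch below assumes flat integer masks in place of pTrees and checks the result against a sorted list; `ptree_rank_k` and the sample values are made up for the example.

```python
def ptree_rank_k(values, k, nbits=5):
    """Kth largest value (k=1 is the maximum), found from bit slices only."""
    n = len(values)
    pure1 = (1 << n) - 1
    # slices[i] has row bit r set when values[r] has bit i set
    slices = [
        sum(((v >> i) & 1) << row for row, v in enumerate(values))
        for i in range(nbits)
    ]
    pc, pos, rank = pure1, k, 0
    for i in range(nbits - 1, -1, -1):
        c = bin(pc & slices[i]).count("1")   # candidates with bit i set
        if c >= pos:                         # the answer is among them
            rank += 1 << i
            pc &= slices[i]
        else:                                # skip past those c rows
            pos -= c
            pc &= pure1 & ~slices[i]
    return rank

values = [17, 25, 3, 9, 20, 20]
print([ptree_rank_k(values, k) for k in range(1, len(values) + 1)])
print(sorted(values, reverse=True))
```

Duplicates are handled without special casing, since pos counts rows rather than distinct values.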

9 rank25 (25th largest)
c=0; rankK=0; pos=25; Pc=pure1;
For i=4..0 { c=rc(Pc&Patt,i); if (c≥pos) { rankK=rankK+2^i; Pc=Pc&Patt,i } else { pos=pos-c; Pc=Pc&P'att,i } }
return rankK;
[Animation: Pc is ANDed with Ppw_i or P'pw_i at each step, starting at current_i=4; rc values shown: 10, 1, 1, 8, 9.] In the traced example, rankK = 0 + 2^1, so rank25 = 2.

10 Check HI and LO values in each class (over each attr., in general) for a LO > all other HIs or a HI < all other LOs:
[Animation: per-class bit slices P4,i and P'4,i (i=4..0) for se, ve, and vi.]
LOvi=17 > HIse=4 and LOvi=17 > HIve=15. So the attr4=petal_Width cutpoint at 16 separates vi and {se,ve}. Note: this cutpt appears early in the loop (i=4). Can a gap be concluded at i=4?
Do this concurrently over all attributes for each K until the 1st gap is found. This finds the 1st hi or low gap, but there may be none. It could find any gap pair separating 1 class from the rest (change the "or" to "and"), but there may be none either. Then take the best negative gap. Can be divisive.
n=10, K=1: compute rankK and rank(n-K+1) for each att/cl; exit when a class in an att has the same gap (hi/lo) with all other classes in that att. Peel off that class. Repeat.
For i=4..0 { c=rc(Pc&Patt,i); if (c≥pos) { rankK+=2^i; Pc=Pc&Patt,i } [rank(n-K+1)+=2^i;] else { pos=pos-c; Pc=Pc&P'att,i } } return
[Animation: counters se1rc, ve1rc, vi1rc with se1pos=ve1pos=vi1pos=1 for HI (se1rnk, ve1rnk, vi1rnk, starting at 0), and se10rc, ve10rc, vi10rc with se10pos=ve10pos=vi10pos=10 for LO (se10rnk, ve10rnk, vi10rnk, starting at 0); the rank values accumulate as sums of powers of 2.]
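The HI/LO check itself is easy to state once the per-class LO and HI values are known. The sketch below works on plain (LO, HI) pairs rather than pTrees; the function name, the midpoint cut rule, and the LO values for se and ve and the HI value for vi are assumptions for illustration (only LOvi=17, HIse=4, and HIve=15 come from the slide).

```python
def separable_classes(lo_hi):
    """Find classes whose LO exceeds every other HI, or whose HI is
    below every other LO, on one attribute.

    lo_hi maps class -> (LO, HI). Returns (class, side, cut) tuples,
    with the cut placed at the midpoint of the gap (an assumption).
    """
    found = []
    for cls, (lo, hi) in lo_hi.items():
        other_his = [h for c, (_, h) in lo_hi.items() if c != cls]
        other_los = [l for c, (l, _) in lo_hi.items() if c != cls]
        if lo > max(other_his):
            found.append((cls, "LO", (lo + max(other_his)) / 2))
        elif hi < min(other_los):
            found.append((cls, "HI", (hi + min(other_los)) / 2))
    return found

# LO for se/ve and HI for vi are hypothetical fill-ins.
petal_width = {"se": (1, 4), "ve": (10, 15), "vi": (17, 25)}
print(separable_classes(petal_width))
```

With these numbers the vi entry reports a cut at 16, in line with the slide; the se entry shows that more than one class can qualify on the same attribute, so a peeling order still has to be chosen.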

