FAUST{pms,std} (FAUST{pms} using # gap std


1 FAUST{pms,std} (FAUST{pms} using # gap std
Definitions
  RC = Remaining_Classes (initially all classes), with pTree P_RC (initially pure1).
  rank_k(S), 0 ≤ k ≤ 1, is the smallest v ∈ S s.t. a fraction k of S is ≤ v (except for k=0, where we use < v).
  Note: rank_k is the median for k=1/2, the maximum for k=1 and the minimum for k=0.

FAUST{pdq,mrk} (FAUST{pdq} with max rank_k)
  0. ∀ attr A, TA(class, md, k, cp) is its attribute table ordered on md asc, where k is the max value s.t. set_rank_k of the class and set_rank_(1-k)' of the next class do not intersect.
  The same algorithm can clearly be used as a pms, that is, FAUST{pms,mrk}.

FAUST{pdq,gap} (FAUST{p} divisive, quiet (no noise), using gaps)
  0. ∀ attr A, TA(class, rv, gap) ordered on rv asc (rv is the class representative value, gap = distance to the next rv).
  WHILE RC not empty, DO
    1. Find the TA record with maximum gap.
    2. Use P_{A>c} (c = rv + gap/2) to divide RC at c into LT, GT (pTrees P_LT and P_GT).
    3. If LT or GT is a singleton {remove that class from RC and from all TAs}
  END_DO

FAUST{pdq,std} (FAUST{pdq} using # of gap standard deviations)
  0. For each attribute A, TA(class, mn, std, n, cp) is its attribute table ordered on n asc, where cp = the value in the gap allowing the max # of stds, n.
     n satisfies mn + n*std = mnG - n*stdG, so n = (mnG - mn)/(std + stdG).
  WHILE RC not empty, DO
    1. Find the TA record with maximum n.
    2. Use P_{A>cp} to divide RC at cp = cut point into LT and GT (pTree masks P_LT and P_GT).
    3. If LT or GT is a singleton {remove that class from RC and from all TAs}
  END_DO

FAUST{pms,gap} (FAUST{p}, m attr cut_pts, sequential class separation (1 class at a time), m=1)
  0. For each A, TA(class, rv, gap, avgap), where avgap is the avg of gap and previous_gap (if 1st, avgap = gap).
  If there are x classes, DO x-1 times
    1. Find the TA record with maximum avgap.
    2. cL = rv - prev_gap/2, cG = rv + gap/2; masks Pclass = P_{A>cL} & P_{A≤cG} & P_RC, then P_RC = P'class & P_RC.
       (If 1st in TA (no prev_gap), Pclass = P_{A≤cG} & P_RC. If last, Pclass = P_{A>cL} & P_RC.)
    3. Remove that class from RC and from all TAs.
  END_DO

FAUST{pms,std} (FAUST{pms} using # of gap stds)
  0. ∀ attr A, TA(class, mn, std, n, avgn, cp) ordered on avgn asc, where cp = cut_point (the value in the gap which allows the max # of stds, n).
     n satisfies mn + n*std = mn_next - n*std_next, so n = (mn_next - mn)/(std + std_next).
  DO x-1 times
    1. Find the TA record with maximum avgn.
    2. cL = rv - prev_gap/2, cG = rv + gap/2; pTree masks Pclass = P_{A>cL} & P_{A≤cG} & P_RC, then P_RC = P'class & P_RC.
       (If the class is 1st in TA (has no prev_gap), Pclass = P_{A≤cG} & P_RC. If last, Pclass = P_{A>cL} & P_RC.)
    3. Remove that class from RC and from all TAs.
  END_DO
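The std-based cut point in step 0 can be illustrated on ordinary (horizontal) data. The following is a minimal sketch, not the pTree implementation: the function name `std_cut_points`, the dictionary layout and the sample values are assumptions made for illustration. It orders classes by mean, solves mn + n*std = mn_next - n*std_next for n, and places cp at mn + n*std.

```python
from statistics import mean, pstdev


def std_cut_points(samples_by_class):
    """Sketch of one attribute table TA(class, mn, std, n, cp).

    Rows are produced in ascending order of the class mean; the last
    class has no 'next' class and therefore gets no row.
    """
    stats = sorted(
        (mean(vals), pstdev(vals), cls) for cls, vals in samples_by_class.items()
    )
    table = []
    for (mn, sd, cls), (mn_next, sd_next, _) in zip(stats, stats[1:]):
        if sd + sd_next == 0:
            # Both classes are constant: any n fits, cut at the midpoint.
            n, cp = float("inf"), (mn + mn_next) / 2
        else:
            n = (mn_next - mn) / (sd + sd_next)  # stds that fit on each side
            cp = mn + n * sd                     # equals mn_next - n * sd_next
        table.append({"class": cls, "mn": mn, "std": sd, "n": n, "cp": cp})
    return table


# Hypothetical values for three classes of one attribute.
demo = {"a": [1, 2, 3], "b": [7, 8, 9], "c": [20, 22, 24]}
ta = std_cut_points(demo)
best = max(ta, key=lambda row: row["n"])  # step 1: record with maximum n
```

Here `best` is the row whose gap admits the most standard deviations, which is the record the WHILE loop would cut on first.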

2 FAUST{pdq,mrk} algorithm, demonstrated with VPHD (Vertical Processing, Horizontal Data) first

1. For every attr and every class, sort the values asc.
2. Find and order the medians asc in TA tables.
3. Find max k s.t. rank_k_set ∩ rank_(1-k)_set = ∅.
4. Proceed as in all FAUST algorithms: cut accordingly (pdq or pms or ???).

With VPHD: sort each class in each attr, find medians (needed?), find rank_k_sets (combine this with sorting?) ... so O(n).
With HPVD: we can avoid the sorting, find rank_k_sets (the median is rank_.5), and fill the TAs entirely with a pTree program, O(0).

[Animation: sorted values of each class (se, ve, vi) in each attribute, with rank_k markers moving inward from rank_0 / rank_1 through rank_.1 ... rank_.9.]

HPVD_mrk could be made optimal, since we could record exactly which k and cp give the min error (as we work toward an empty rank_k_set intersection) and we could know the error set. We could use CkNN or ? on each errant sample. To see this, go through the first k/cp animation. In that looping procedure it is clear we could determine se<55, with 3 errors, to be the best cp (se<54, 6 errors; se<52, 5; se<50, 5; se<49, 6).

Note: mrk above is lazy. It takes cp to be the average of the rank values, in this case cp=53, which has 6 errors.

TA tables TsLN, TsWD, TpLN, TpWD (columns: cl, md, k, cp; classes se, ve, vi). One value follows each table header in the transcript: 66, 33, 58, 20. The remaining (k, cp) pairs, in transcript order: (.7, 53), (.7, 29), (1.0, 25), (1.0, 7), (.7, 64), (.8, 30), (.9, 49), (.9, 16).

One can see from this animation that MaxGap is probably a pretty good method most of the time (provided there is at least one good gap at each step) and that MaxGapStd is even better (same proviso). This method is intended to be optimal and to deal with, e.g., non-normal distributions.
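Step 3 (find the max k for which the two rank sets stop overlapping) can be sketched on horizontal data as follows. This is one reading of the slide, under stated assumptions: k is stepped in tenths, the upper class is trimmed from below by the same fraction that the lower class is trimmed from above, and cp is the "lazy" average of the two rank values. The function names and the sample values are hypothetical.

```python
def rank_k(values, tenths):
    """rank_k for k = tenths/10: smallest v with a fraction k of S <= v."""
    s = sorted(values)
    if tenths == 0:
        return s[0]                          # k = 0 is the minimum
    idx = (tenths * len(s) + 9) // 10 - 1    # ceil(k * n) - 1, integer math
    return s[idx]


def mirrored_rank(values, tenths):
    """Lowest value left after trimming the bottom (1 - k) fraction."""
    s = sorted(values)
    drop = len(s) - (tenths * len(s) + 9) // 10
    return s[drop]


def max_separating_k(lower, upper):
    """Largest k in {1.0, 0.9, ..., 0.5} whose trimmed sets are disjoint.

    Returns (k, cp) or None when even the two medians overlap.
    """
    for tenths in range(10, 4, -1):
        top_of_lower = rank_k(lower, tenths)
        bottom_of_upper = mirrored_rank(upper, tenths)
        if top_of_lower < bottom_of_upper:
            return tenths / 10, (top_of_lower + bottom_of_upper) / 2
    return None


# Hypothetical, slightly overlapping classes of one attribute.
lower = [1, 2, 3, 4, 5, 6, 7, 8, 9, 12]
upper = [10, 11, 13, 14, 15, 16, 17, 18, 19, 20]
result = max_separating_k(lower, upper)
```

With these values the full sets overlap (12 versus 10), and trimming one value from each side separates them, so the search stops at k = 0.9.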

3 maximum

c=0; max=0; Pc=pure1;
For i=4..0 {
  c=rc(Pc & Patt,i);
  if (c>0) { Pc=Pc & Patt,i; max=max+2^i }
}
return max;

[Animation: attribute pw on the se rows. Pc starts as pure1 and is ANDed in turn with the bit slices Ppw4 ... Ppw0. Root counts shown: rc=10, rc=1, rc=0, rc=1. Result: max = 2^4 + 2^3 + 2^0.]
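The maximum loop can be tried out without a pTree library. In the sketch below, each bit slice Patt,i is modelled as a plain Python integer used as an uncompressed bit vector (bit r stands for row r), and rc is a population count. These representations, the helper names and the sample values are assumptions for illustration, not the compressed pTree structures the slides refer to.

```python
BITS = 5  # the slides loop over bit positions 4..0


def bit_slices(values, bits=BITS):
    """P[i] has bit r set exactly when bit i of values[r] is 1."""
    return [
        sum(((v >> i) & 1) << r for r, v in enumerate(values))
        for i in range(bits)
    ]


def rc(mask):
    """Root count: the number of 1-bits in a mask."""
    return bin(mask).count("1")


def ptree_max(values, bits=BITS):
    """Maximum value, found by walking bit slices from high to low."""
    P = bit_slices(values, bits)
    pc = (1 << len(values)) - 1          # pure1: every row is a candidate
    result = 0
    for i in range(bits - 1, -1, -1):
        candidate = pc & P[i]            # candidates that have bit i set
        if rc(candidate) > 0:
            pc = candidate               # keep only those rows
            result += 1 << i             # the maximum has bit i set
    return result


# Hypothetical 5-bit attribute values.
values = [2, 25, 7, 20, 1]
largest = ptree_max(values)
```

The loop never inspects an individual row: it only needs AND and a count, which is why it maps onto vertical bit-slice structures.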

4 minimum

c=0; min=0; Pc=pure1;
For i=4..0 {
  c=rc(Pc & P'att,i);
  if (c>0) Pc=Pc & P'att,i
  else min=min+2^i
}
return min;

[Animation: attribute pw on the se rows. Pc starts as pure1 and is ANDed in turn with the complement slices P'pw4 ... P'pw0. Root counts shown: rc>0 for the slices that are kept, rc=0 once. Result: min = 2^0.]
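The minimum loop is the mirror image: it follows the complement slices P'att,i and only adds 2^i when no remaining candidate has a 0 in that position. As before, this is a sketch that models bit slices as Python integers; the helper names and sample values are hypothetical.

```python
BITS = 5  # bit positions 4..0, as on the slides


def bit_slices(values, bits=BITS):
    """P[i] has bit r set exactly when bit i of values[r] is 1."""
    return [
        sum(((v >> i) & 1) << r for r, v in enumerate(values))
        for i in range(bits)
    ]


def rc(mask):
    """Root count: the number of 1-bits in a mask."""
    return bin(mask).count("1")


def ptree_min(values, bits=BITS):
    """Minimum value, found by walking complement bit slices high to low."""
    P = bit_slices(values, bits)
    pure1 = (1 << len(values)) - 1
    pc = pure1
    result = 0
    for i in range(bits - 1, -1, -1):
        candidate = pc & (pure1 & ~P[i])  # candidates with bit i equal to 0
        if rc(candidate) > 0:
            pc = candidate                # the minimum has bit i clear
        else:
            result += 1 << i              # every candidate has bit i set
    return result


# Hypothetical 5-bit attribute values.
values = [9, 1, 7, 20, 13]
smallest = ptree_min(values)
```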

5 rank5 (5th largest)

c=0; rankK=0; pos=5; Pc=pure1;
For i=4..0 {
  c=rc(Pc & Patt,i);
  if (c ≥ pos) { rankK=rankK+2^i; Pc=Pc & Patt,i }
  else { pos=pos-c; Pc=Pc & P'att,i }
}
return rankK;

[Animation: attribute pw on the se rows, pos=5. Root counts shown: rc=10, rc=1, rc=1, rc=3, rc=2, rc=4. rankK = 0 + 2^4 + 2^2, so the loop returns rank5 = 20.]
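The rankK loop generalizes maximum (K=1) and minimum (K=n). The sketch below assumes the garbled test on the slide reads c ≥ pos, and again models each bit slice as a Python integer; the helper names and sample values are hypothetical.

```python
BITS = 5  # bit positions 4..0, as on the slides


def bit_slices(values, bits=BITS):
    """P[i] has bit r set exactly when bit i of values[r] is 1."""
    return [
        sum(((v >> i) & 1) << r for r, v in enumerate(values))
        for i in range(bits)
    ]


def rc(mask):
    """Root count: the number of 1-bits in a mask."""
    return bin(mask).count("1")


def ptree_rank(values, k, bits=BITS):
    """K-th largest value (k=1 is the maximum), via bit slices."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of rows")
    P = bit_slices(values, bits)
    pc = (1 << len(values)) - 1          # pure1
    pos = k
    result = 0
    for i in range(bits - 1, -1, -1):
        ones = pc & P[i]
        c = rc(ones)
        if c >= pos:
            result += 1 << i             # the answer has bit i set
            pc = ones                    # and lies among these rows
        else:
            pos -= c                     # skip the c larger candidates
            pc &= ~P[i]                  # continue among rows with bit i clear
    return result


# Hypothetical 5-bit attribute values, including a duplicate.
values = [20, 3, 25, 20, 7, 18, 1, 30]
fifth_largest = ptree_rank(values, 5)
```

Because pos is reduced whenever the loop steps past a group of larger values, duplicates are counted with multiplicity, the same way a sorted list would count them.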

6 rank25 (25th largest)

c=0; rankK=0; pos=25; Pc=pure1;
For i=4..0 {
  c=rc(Pc & Patt,i);
  if (c ≥ pos) { rankK=rankK+2^i; Pc=Pc & Patt,i }
  else { pos=pos-c; Pc=Pc & P'att,i }
}
return rankK;

[Animation: attribute pw on the se rows, pos=25. Root counts shown: rc=10, rc=1, rc=1, rc=8, rc=9. rankK = 0 + 2^1, so the loop returns rank25 = 2.]

7 rankK and rank(1-K) for every class and every attribute

For K=1,2,... calculate rankK and rank(1-K) ∀ class and ∀ attribute. Exit when there first appears a class in an attribute which has a same-side gap (hi or lo) with every other class in that attribute. Then peel off that class. Repeat until done.

c=0; rankK=0; p=10; Pc=pure1;
For i=4..0 {
  c=rc(Pc & Patt,i);
  if (c ≥ p) { rankK=rankK+2^i; Pc=Pc & Patt,i }
  else { p=p-c; Pc=Pc & P'att,i }
}
return rankK;

The same loop is run a second time with p=1 to obtain rank(1-K).

[Animation: attribute pw with 10 rows each of se, ve and vi. Values shown: rc=10, 15, 3; rankK = 2^1 is displayed for both runs.]

8 rankK and rank(n-K+1) per attribute and class (n=10)

n=10, K=...: calculate rankK [rank(n-K+1)] ∀ att/cl. Exit when a class in an att has a same-side gap (hi/lo) with all other classes in that att. Peel off that class. Repeat.

For i=4..0 {
  c=rc(Pc & Patt,i);
  if (c ≥ pos) { rankK+=2^i; [rank(n-K+1)+=2^i;] Pc=Pc & Patt,i }
  else { pos=pos-c; Pc=Pc & P'att,i }
}
return

Initial state: se_posK = ve_posK = vi_posK = 10; se_pos(n-K+1) = ve_pos(n-K+1) = vi_pos(n-K+1) = 1.
LO: se_rnkK = ve_rnkK = vi_rnkK = 0. HI: se_rnk(n-K+1) = ve_rnk(n-K+1) = vi_rnk(n-K+1) = 0.

[Animation: per-class masks Pc for K and n-K+1 are ANDed with Ppw4 ... Ppw0 or their complements. Increments shown, in transcript order: +2^3, +2^4, +2^4, +2^3.]

Look for the first LO higher than the other HIs, or the first HI lower than the other LOs.
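The exit test (a LO above every other class's HI, or a HI below every other class's LO) can be sketched on sorted horizontal data for a single attribute. This is an interpretation under stated assumptions: LO and HI are taken as the (trim+1)-th smallest and (trim+1)-th largest value of each class, trim grows by one per round, and the search over several attributes is left out. The function names and the sample values are hypothetical.

```python
def find_separable(classes):
    """Return (class, side, trim) for the first class with a same-side gap.

    classes maps a class name to its sorted values for one attribute.
    """
    max_trim = (min(len(v) for v in classes.values()) + 1) // 2
    for trim in range(max_trim):
        lo = {c: v[trim] for c, v in classes.items()}        # trimmed low end
        hi = {c: v[-1 - trim] for c, v in classes.items()}   # trimmed high end
        for c in classes:
            others = [o for o in classes if o != c]
            if all(lo[c] > hi[o] for o in others):
                return c, "high", trim   # c sits above every other class
            if all(hi[c] < lo[o] for o in others):
                return c, "low", trim    # c sits below every other class
    return None


def peel_order(samples_by_class):
    """Peel classes off one at a time until one is left or none separates."""
    remaining = {c: sorted(v) for c, v in samples_by_class.items()}
    order = []
    while len(remaining) > 1:
        found = find_separable(remaining)
        if found is None:
            break
        order.append(found)
        del remaining[found[0]]
    return order


# Hypothetical values; the class names follow the slides.
demo = {
    "se": [10, 13, 14, 15, 16],
    "ve": [29, 30, 33, 35, 45],
    "vi": [39, 40, 46, 47, 49],
}
order = peel_order(demo)
```

In this example se separates without any trimming, while ve and vi only separate after one value is trimmed from each end, which is the situation the rankK / rank(n-K+1) iteration is meant to handle.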

9 rankK and rank(n-K+1), K=1 (n=10)

n=10, K=1: calculate rankK and rank(n-K+1) ∀ att/cl. Exit when a class in an att has a same-side gap (hi/lo) with all other classes in that att. Peel off that class. Repeat.

For i=4..0 {
  c=rc(Pc & Patt,i);
  if (c ≥ pos) { rankK+=2^i; [rank(n-K+1)+=2^i;] Pc=Pc & Patt,i }
  else { pos=pos-c; Pc=Pc & P'att,i }
}
return

Initial state: K=1, n-K+1=10; se1pos = ve1pos = vi1pos = 1; se10pos = ve10pos = vi10pos = 10.
HI: se1rnk = ve1rnk = vi1rnk = 0. LO: se10rnk = ve10rnk = vi10rnk = 0.

[Animation: for each class (se, ve, vi), Pc is ANDed with the bit slices 4_4 ... 4_0 of the attribute or their complements (4_4' ... 4_0'). Increments shown, in transcript order: HI +2^4, +2^2; LO +2^4, +2^3, +2^2, +2^3, +2^2, +2^4, +2^3.]

10 rankK and rank(n-K+1), K=1 (n=10), continued

n=10, K=1: calculate rankK and rank(n-K+1) ∀ att/cl. Exit when a class in an att has a same-side gap (hi/lo) with all other classes in that att. Peel off that class. Repeat.

For i=4..0 {
  c=rc(Pc & Patt,i);
  if (c ≥ pos) { rankK+=2^i; [rank(n-K+1)+=2^i;] Pc=Pc & Patt,i }
  else { pos=pos-c; Pc=Pc & P'att,i }
}
return

Initial state: K=1, n-K+1=10; se1pos = ve1pos = vi1pos = 1; se10pos = ve10pos = vi10pos = 10.
HI: se1rnk = ve1rnk = vi1rnk = 0. LO: se10rnk = ve10rnk = vi10rnk = 0.

[Animation: for each class (se, ve, vi), Pc is ANDed with the bit slices 4_4 ... 4_0 of the attribute or their complements (4_4' ... 4_0'). Values shown, in transcript order: 2^2, 2^0, 2^3, +2^2, +2^1, +2^0, 2^0, +2^4, +2^3, +2^0, +2^4, +2^1.]

